DCP representation of $−1/2 log(z/(1−z))(log(z)−2z−1)$

Hi there!

Is there a DCP compliant representation of

T(z) = -\frac{1}{2}\log\left(\frac{z}{1-z}\right)\left(\log(z) - 2z-1\right),

where z \in (0,1)?

This function is concave, as we can see from the second derivative plot in Mathematica:

This shows up in a trajectory optimization/ statistical inference problem I’m trying to solve.

I’m aware this might not be possible, but it’s worth trying.

Thanks in advance!

I will be surprised if this can be made DCP compliant.

This post is related to convex optimization - How to write this objective in CVXPY for quasiconvex programming? - Operations Research Stack Exchange.

Indeed, I suggested in that OR SE post to post it here, but with my assessment that I doubt there is a DCP representation for T; and I also suggested it is a likely candidate for @Erling 's Challenge

Thanks for the swift responses @Mark_L_Stone and @Erling!

I haven’t yet found a representation, but I found some interesting things which I detail in the OR stack Exchange post I made, which was linked by @Erling.

I found a way to write T as the sum of three concave functions.
Two of them are easy to write as a DCP function, by using the partial_optimize transform.

The third one is kind of tricky, but I think I might be almost there

If someone answers there, or if I find the result I’ll be sure to post it here for completion, but the question now essentially boils down to finding a DCP representation of -\log(t)\log(1-t), for t\in(0,1).

What I did was write: 2\log(t)\log(1-t) = \log(t(1-t))^2 - \log(t/(1-t))^2, and try to write it as another partial_optimize construct. Essentially:

I think something like this might work, since it’s the composition of simpler convex functions

I’d really appreciate if you could take a look!

Thanks once again!

If I understand correctly, ptp(x) = max(x) -min(x). So I don’t see how ptp could accept a non-affine argument, as you are proposing. because max requires a convex argument and min requires a concave argument. Am I wrong?

https://www.cvxpy.org/api_reference/cvxpy.atoms.other_atoms.html#ptp doesn’t indicate argument requirements. Are they documented somewhere?

One similar function 1/(logx)(logy),x>1,y>1 can be well expressed by CVX, but -(logx)(log(1-x)), 0<x<1 is very different.

I think I might have figured it out.

For \log^2(t) we just take the partial problem

\min_{z} z \qquad \text{s.t.: } w\geq -\log(t),\,w\geq 0,\,z\geq w^2

For -\log(t)\log(1-t) we can approximate it numerically and DCP by using the fact:

\text{Li}_2(t) + \text{Li}_2(1-t) = \frac{\pi^2}{6} - \log(t)\log(1-t).

There is an integral representation for \text{Li}_2(t):

\text{Li}_2(t) = -\int_{0}^{t}\frac{\log(1-z)}{z}dz = -\int_{0}^{1}\frac{\log(1-t z)}{z}dz,

so using a trapezoidal rule, we can write:

\text{Li}_2(t) \approx -\frac{1}{N_z}\sum_{k=1}^{N_t}\frac{1}{2}\left(\frac{\log(1-t z_k)}{z_k} + \frac{\log(1-t z_{k-1})}{z_{k-1}}\right),

where z_k for k\in\{0,N_t\} discretizes the interval [0,1].

I’ve tried this for a simple, univariate problem and it worked beautifully!

Now, is there a way to make a sort of elementwise version of this in CVXPY?
i.e.: receive a vector/matrix \mathbf{t} and return -\log(\mathbf{t})\log(1-\mathbf{t}) where the evaluation is elementwise?

1 Like

Easy to do vectorized elementwise n MATLAB/CVX, the subject of this forum.

A CVXPY forum would be more appropriate as to how to do it in CVXPY.

I followed the recipe from my talk at https://youtu.be/2G6xbQZHDHk?t=684 and found a representation of t \geq -\log(x) \log(1-x) which satisfies all composition rules, namely

t \geq - s \log\left( 1 - \exp(r/s) \right),\quad r \geq p^2,\quad s=p,\quad p \leq \log(x),

representable as

s\exp(-t/s) + s\exp(r/s) \leq s,\quad r \geq p^2,\quad s=p,\quad p \leq \log(x).

The problem is that the domain on which this representation is exact is limited to x=1, where t \geq 0 is implied, since s = p \leq \log(x) \leq 0 on x \leq 1, and s \geq 0 by its use as a perspective variable. This is a show stopper. Hope your approximation turns out to be useful.