How to formulate sum of square efficiently?


(Jianqiao Huang) #1

Hi,

I have attached the original problem and my model. I used a lot of 2*2 semidefinte matrix inequality to express the sum of square. It takes cvx 9 seconds to model a small problem, and it takes MOSEK 0.2 seconds to solve the problem.

I want to express the sum of square in rotated quadratic cones. However, someone posted a problem that it takes cvx a lot of time to build a lot of rotated quadratic cones. The problem was solved by using geo_mean.

How can I formulate this problem more efficiently? Is geo_mean better than rotated quadratic cone for my problem?


(Mark L. Stone) #2

Are you referring to this post Many rotated_lorentz(1) constrains - Very slow ?

I will defer to someone else on possible use of geo_mean , but I can tell you that CVX converts 2 by 2 LMIs (and some other LMIs for which it figures out how) into SOC constraints before sending the problem to the solver. So I suppose that if you enter them as SOC constraints rather than LMIs, you may save whatever time it takes CVX to do the conversion - I don;t know whether that is a significant time or negligible.

I think what mcg is getting at in the link is that you will save a lot of CVX modeling time if you can enter a vectorized formulation which avoids for loops to the maximum extent possible, whether that involves use of geo_mean or otherwise. One or tow big rotated cones or geo_mean cones are better than larger numbers of “little” ones in terms of CVX modeling time.


(Jianqiao Huang) #3

Hi,

Yes. I refer to the post [Many rotated_lorentz(1) constrains - Very slow]. I have 3 questions.

(1) Is rotated quadratic cone a suitable or good SOC for my model?

(2) How to form a vectorized formulation?

(3) If I combine a lot of rotated quadratic cones to a big cone, will cvx decompose it to a couple of smaller cones before sending it to MOSEK?

Thanks,
Jianqiao Huang


(Mark L. Stone) #4

Did you look at this link from the other link?

mcg wrote:
I’m sure there’s a way to build all of the second-order cones in one single command—or perhaps two, one for the a/b constraint and one for the c/d constraint. See the help section [here ](https://github.com/cvxr/CVX/blob/master/sets/rotated_lorentz.m) for more information.

Specifically,
% ROTATED_LORENTZ(SX,DIM), where SX is a valid size vector and DIM is a
% positive integer, creates an array variable of size SX and two
% array variables of size SY (see below) and applies the second-order cone
% constraint along dimension DIM.

%ROTATED_LORENTZ Rotated real second-order cone.
% ROTATED_LORENTZ(N), where N is a positive integer, creates a column
% variable of length N and two scalar variables, and constrains them
% to lie in a rotated second-order cone. That is, given the declaration
% variables x(n) y z
% the constraint
% {x,y,z} == rotated_lorentz(n)
% is equivalent to
% norm(x,2) <= geo_mean([y,z])
% except that using ROTATED_LORENTZ is more efficient.
%
% ROTATED_LORENTZ(SX,DIM), where SX is a valid size vector and DIM is a
% positive integer, creates an array variable of size SX and two
% array variables of size SY (see below) and applies the second-order cone
% constraint along dimension DIM. That is, given the declarations
% sy = sx; sy(min(dim,length(sx)+1))=1;
% variables x(sx) y(sy) z(sz)
% the constraint
% {x,y,z} == rotated_lorentz(sx,dim)
% is equivalent to
% norms(x,2,dim) <= geo_mean(cat(dim,y,z),dim)
% except, again, ROTATED_LORENTZ is more efficient. DIM is optional; if
% it is omitted, the first non-singleton dimension is used.


(Jianqiao Huang) #5

Hi Mark

Thanks. In cvx document, rotated_lorentz(n) is ||x||<=sqrt(yz). But my formula is min ( w1(z1-trace(H1X))^2+w2(z2-trace(H2X))^2+…+wm(zm-trace(Hm*X))^2), where X is a matrix variable, r is a variable and z is scalar constant, H is a constant matrix. I formulate in the following way.

min( w’*sum_square( r )),
st. {z-y,1,r} == rotated_lorentz(m),
where z is m * 1 constant vector, y is m * 1 variable array, yi=trace(Hi * X).

Is it correct? Is it better than the original 2*2 semidefinite constraint?

I am not sure about the above formula, as my objective is sum of squares, but rotated square cone is norm form, not square form.

Thanks,
Jianqiao Huang


(Mark L. Stone) #6

Square root is monotonically increasing, so minimize the square root of your original objective. The w_i's must be nonnegative for your original objective to be convex, so there is no difficulty in taking their square root.

As for how “good” it is, try running it and see how long it takes.


(Jianqiao Huang) #7

Hi Mark,

I said something not clearly. I want to formulate ri>=(z-trace(HiX))^2. But {z-trace(HiX),ri,1} == rotated_lorentz(1) only gives me ri>=||z-trace(Hi * X)||. It is not what I want.

Is there any expression that can directly express r1>=(z1-trace(H1 * X))^2 instead of r1>=||z1-trace(H1 * X)||?

My objective function is to minimize ( min ( w1* (z1-trace(H1 * X)) ^2 +w2* (z2-trace(H2 X)) ^2 +…+wm (zm-trace(Hm*X)) ^2).

If I have to use rotated_lorentz, then is it correct to formulate in the following way.

min square(r1)+square(r2)+…square(rm)
st. {zi-trace(Hi * X),1,r} == rotated_lorentz(1),

(In cvx, square(r1)=r1^2. )

Thanks,
Jianqiao Huang


(Mark L. Stone) #8

{sqrt(w)*x,1,r} == rotated_lorentz(1) is equivalent to norm(sqrt(w)*x) <= sqrt(r), r >= 0 which is equivalent to w*sum_square(x) <= r


(Jianqiao Huang) #9

Hi Mark

Thanks. I just replace a lot of rotated quadratic cones with one command, but I need to introduce a lot of extra scalar variables F. Is the following formulation correct? Is it possible to eliminate the extra variables.

original rotated quadratic cone constraints:
{z1-trace(H1 * X),1,r1} == rotated_lorentz(1);
{z2-trace(H2 * X),1,r2} == rotated_lorentz(1);
.
.
.
{zm-trace(Hm * X),1,rm} == rotated_lorentz(1);

New formulation:
{F1,r,ones(m,1)} == rotated_lorentz(m);
where r=[r1;r2;…rm];

extra constraints:
F1(1)==z1-trace(H1 * X),1,r1} ;
F1(2)==z1-trace(H2 * X),1,r2} ;
.
.
.
F1(m)==z1-trace(Hm * X),1,rm} ;


(Mark L. Stone) #10

I think you need to use the form

ROTATED_LORENTZ(SX,DIM), where SX is a valid size vector and DIM is a
% positive integer, creates an array variable of size SX and two
% array variables of size SY (see below) and applies the second-order cone
% constraint along dimension DIM. That is, given the declarations
% sy = sx; sy(min(dim,length(sx)+1))=1;
% variables x(sx) y(sy) z(sz)
% the constraint
% {x,y,z} == rotated_lorentz(sx,dim)
% is equivalent to
% norms(x,2,dim) <= geo_mean(cat(dim,y,z),dim)
% except, again, ROTATED_LORENTZ is more efficient. DIM is optional; if
% it is omitted, the first non-singleton dimension is used.

I’ve never used this, so I’ll let you figure it out and check your work. Develop and test your code with small value of m. When that checks out, scale up to your full problem size.