Hinge loss function and nuclear norm problems in CVX?

Jason_Ko · December 12, 2013, 4:30am

Hi, I am trying to solve the following problem in CVX, but I have failed.
Has anyone seen this before?

Problem:

argmin_A(sum(hinge_loss(theta(x,y)'*A*theta(x,y) - theta(u,v)'*A*theta(u,v)))+miu*nuclear_norm(A))

where (x,y) is in the positive set, (u,v) is in the negative set, and miu is the trade-off parameter.
The variable is A; theta(.) is a distance, e.g., Hamming distance or Euclidean distance;
hinge_loss(z) = max(z + 10000, 0), and nuclear_norm(A) = ||A||_* = trace(A), which holds because A is positive semidefinite.

A is positive semidefinite because A = w'*w. The set of such matrices is convex, the hinge loss function is convex, and the nuclear norm is convex, so I think the problem can be solved by CVX.
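The claim that the nuclear norm of A reduces to trace(A) relies on A being positive semidefinite. A minimal numerical check in plain MATLAB, assuming a hypothetical 5-by-30 factor w (the values below are made up for illustration and are not data from this thread):

```matlab
% Hypothetical factor w (5x30); the entries are arbitrary illustration data.
w = reshape(1:150, 5, 30) / 150;
A = w' * w;                          % 30x30, positive semidefinite by construction

nuc = sum(svd(A));                   % nuclear norm: sum of singular values
tr  = trace(A);                      % trace: sum of eigenvalues

rel_gap = abs(nuc - tr) / tr;        % should be at round-off level
min_eig = min(eig((A + A') / 2));    % should be >= 0 up to round-off

fprintf('nuclear norm = %.6f, trace = %.6f\n', nuc, tr);
```

For a matrix that is not positive semidefinite the two quantities generally differ, so trace(A) stands in for the nuclear norm only together with the semidefinite constraint on A.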

I tried this CVX code:

miu = logspace(-2,2,10);
z = 10000;
I=eye(100,100);

for k =1 : length(miu)
    fprintf(1,'%8.4e',miu(k));
    cvx_begin
       variable A(30,30) hermitian semidefinite
       minimize(sum(max(thetaX'*A*thetaX - thetaU'*A*thetaU + z*I,0) + miu(k)*trace(A)*I))
    cvx_end
    l1 = thetaX'*A*thetaX - thetaU'*A*thetaU;
    l2 = trace(A);
    fprintf(1,'    %8.4e    %8.4e\n',l1(k),l2(k));
end

It does not work. Can somebody help me? It shows:
Error using cvx/max (line 88)
Disciplined convex programming error:
Cannot perform the operation max( {complex affine}, {constant} )

Mark_L_Stone · December 12, 2013, 11:48pm

At the MATLAB prompt,

help max

click on Overloaded methods max/cvx

“Disciplined convex/geometric programming information:
max is convex, log-log-convex, and nondecreasing in its first
two arguments. Thus when used in disciplined convex programs,
both arguments must be convex (or affine). In disciplined
geometric programs, both arguments must be log-convex/affine.”

The first argument to max is

thetaX'*A*thetaX - thetaU'*A*thetaU + z*I

which is not DCP-compliant, on account of

- thetaU'*A*thetaU

So you have violated the DCP rules as written. Based on values of thetaX and thetaU, which you have not provided, can you prove that

thetaX'*A*thetaX - thetaU'*A*thetaU + z*I

is convex, and then reformulate consistent with DCP rules?

mcg · December 13, 2013, 5:33am

The problem is that, due to small numerical errors, the first argument in your max expression is complex. It’s not, of course, in exact arithmetic, but we do not have that here. Try taking the real part of that expression.
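A minimal sketch of the "take the real part" suggestion for the vector case, assuming CVX is installed. The vectors x and u are hypothetical 30-by-1 stand-ins for thetaX and thetaU, and the margin is added as a scalar rather than as z*I; none of this is the original poster's data or final model.

```matlab
% Requires CVX. x and u are hypothetical 30x1 real vectors (illustration only).
n = 30;
x = mod((1:n)', 4);
u = mod((1:n)', 5) + 1;
z = 10000;                 % margin, as in the original post
miu = 1;                   % one trade-off value instead of the logspace sweep

cvx_begin quiet
    variable A(n,n) hermitian semidefinite
    % real(...) makes the argument of max() real affine as far as CVX is concerned.
    minimize( max(real(x'*A*x - u'*A*u) + z, 0) + miu*real(trace(A)) )
cvx_end
```

If the data are real, declaring `variable A(n,n) semidefinite` (real symmetric) is another way to keep complex expressions out of the model.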

mcg · December 13, 2013, 5:39am

Actually, what I just said is only true if thetaX and thetaU are vectors. If they are matrices, then your problem isn’t well-posed, as you have a non-scalar objective. In fact, you do anyway, due to the z*I term.
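The shape issue can be seen without CVX by substituting a fixed numeric A. In this sketch, thetaX, thetaU, and A are hypothetical stand-ins chosen only to have the sizes implied by the original code (A is 30-by-30, I is 100-by-100):

```matlab
% Hypothetical stand-ins; only the sizes matter here.
n = 30;  m = 100;
thetaX = ones(n, m);
thetaU = 2 * ones(n, m);
A = eye(n);                % any fixed n-by-n matrix, used to inspect sizes
z = 10000;
I = eye(m);

M   = thetaX' * A * thetaX - thetaU' * A * thetaU + z * I;   % m-by-m matrix
obj = sum(max(M, 0) + trace(A) * I);                         % sum() works column-wise

disp(size(M));             % 100 100
disp(size(obj));           % 1 100
```

Because sum() applied to a matrix returns one value per column, the expression passed to minimize is a 1-by-100 row vector rather than a scalar.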

Jason_Ko · December 13, 2013, 9:56am

The background of this problem is to minimize the distances between positive sample pairs (x-y pairs) and negative sample pairs (u-v pairs). The first term of the max function comes from D(x,y) + 100 < D(u,v); then I square it and get D^2(x,y) + 10000 < D^2(u,v), where D(x,y) = w*G(x,y). Here A = w'*w.

mcg · December 13, 2013, 2:04pm

The background is not helpful. The point is, your objective function is ill-posed. It’s not even a scalar. Figure out what it is you’re actually trying to express, and you’ll be closer to your answer. And if necessary, take the real part, to eliminate any small imaginary portion due to roundoff.

Jason_Ko · December 13, 2013, 4:19pm

The background of this problem is to minimize the distances between positive sample pairs (x-y pairs) and negative sample pairs (u-v pairs). The first term of the max function comes from

D(x,y) + 100 < D(u,v)

then square it and get

D^2(x,y) + 10000 < D^2(u,v)

where

D(x,y) = w*theta(x,y), theta(x,y) = Hamming_distance(x,y)

Expanding D^2(x,y), I get

D^2(x,y) = (w*theta(x,y))^2   =  (w*theta(x,y))'(w*theta(x,y))
         =theta'(x,y)*w'*w*theta(x,y)  =  theta'(x,y)*A*theta(x,y)

Here A = w'*w, so A is a positive semidefinite matrix, and the set of such matrices is convex.
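The identity (w*theta)'*(w*theta) = theta'*A*theta with A = w'*w can be spot-checked numerically. The sizes and values below are assumptions made for illustration (w is taken as 4-by-30 and theta as a 30-by-1 vector); they are not data from this thread:

```matlab
% Hypothetical sizes: w is k-by-n, theta is an n-by-1 vector.
k = 4;  n = 30;
w     = reshape(1:k*n, k, n) / (k*n);   % deterministic stand-in for w
theta = mod((1:n)', 3);                 % stand-in for theta(x,y)

d   = w * theta;                        % D(x,y), here a k-by-1 vector
lhs = d' * d;                           % (w*theta)'*(w*theta)
A   = w' * w;                           % 30x30
rhs = theta' * A * theta;               % theta'*A*theta

rel_err = abs(lhs - rhs) / abs(lhs);    % round-off level if the identity holds
fprintf('lhs = %.6f, rhs = %.6f\n', lhs, rhs);
```

The check only covers a single vector theta; with matrix-valued thetaXY the product thetaXY'*A*thetaXY is a matrix whose diagonal holds the per-pair values.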

There are 100 positive sample pairs (x-y pairs) and 100 negative sample pairs (u-v pairs); x and y are 1*100 vectors, and there are 30 groups of G(x,y). So here

thetaXY and thetaUV are 30*100 matrices, and A is a 30*30 matrix

so the final optimization problem is

argmin_A(sum(hinge_loss(theta(x,y)'*A*theta(x,y) - theta(u,v)'*A*theta(u,v))+miu*nuclear_norm(A)))

where

hinge_loss(z)= max(z+10000,0)

So I translated it to CVX:

minimize(sum(max(thetaXY'*A*thetaXY - thetaUV'*A*thetaUV + z*I,0) + miu(k)*trace(A)*I))

Jason_Ko · December 13, 2013, 4:24pm

I am sorry that I didn't post the whole background in time, and I am grateful for your answer. I have now posted it in full. I showed that thetaXY'*A*thetaXY - thetaUV'*A*thetaUV is convex in A, so I think it can be handled by CVX. Am I right?
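For fixed data, the map from A to thetaXY'*A*thetaXY - thetaUV'*A*thetaUV is linear in A. A numerical spot check (not a proof), using small made-up matrices X and U in place of thetaXY and thetaUV:

```matlab
% Hypothetical small stand-ins for thetaXY and thetaUV.
n = 6;  m = 4;
X = reshape(1:n*m, n, m) / 10;
U = reshape(n*m:-1:1, n, m) / 7;
f = @(A) X' * A * X - U' * A * U;       % the expression without the z*I term

A1 = magic(n);  A1 = A1' * A1;          % two arbitrary symmetric PSD matrices
A2 = eye(n) + ones(n);
a = 0.3;  b = 1.7;

% Linearity: f(a*A1 + b*A2) should equal a*f(A1) + b*f(A2) up to round-off.
F   = f(a*A1 + b*A2);
gap = norm(F - (a*f(A1) + b*f(A2)), 'fro') / norm(F, 'fro');
fprintf('relative linearity gap = %.2e\n', gap);
```

The check says nothing about whether the result is real-valued in CVX or whether the objective is a scalar.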

mcg · December 13, 2013, 4:33pm

The quantity thetaXY'*A*thetaXY - thetaUV'*A*thetaUV + z*I is a complex Hermitian matrix. It doesn’t make sense to try and perform the operation max(.,0) on it, because its elements are complex. Furthermore, your objective function is not a scalar. Your problem remains ill-posed, as I said.

mcg · December 13, 2013, 4:37pm

You’re going to have to think carefully about what you’re trying to accomplish here. I don’t know the application, so I can’t tell you how to fix it, but you are not creating the model you think you are—yet!
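One possible reading of the intended model, offered only as a sketch: apply the hinge loss to each pair separately and sum the results, so that the objective is a scalar. This assumes CVX is installed, that the data are real, and that column i of thetaXY is matched with column i of thetaUV; the data below are made-up stand-ins, and the formulation may not be what the application actually needs.

```matlab
% Requires CVX. Stand-in data with the sizes from the post (30 features, 100 pairs).
n = 30;  m = 100;
thetaXY = mod(reshape(1:n*m, n, m), 7);
thetaUV = mod(reshape(1:n*m, n, m), 11);
z = 10000;  miu = 1;

cvx_begin quiet
    variable A(n,n) semidefinite            % real symmetric PSD (assumes real data)
    % Entry i of each row vector is theta_i'*A*theta_i for pair i.
    dXY = sum(thetaXY .* (A * thetaXY), 1);
    dUV = sum(thetaUV .* (A * thetaUV), 1);
    % Scalar objective: summed per-pair hinge losses plus the trace penalty.
    minimize( sum(max(dXY - dUV + z, 0)) + miu * trace(A) )
cvx_end
```

Declaring A as real symmetric avoids the complex-affine argument to max, and reducing each pair to one number before summing gives minimize a scalar.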

Jason_Ko · December 16, 2013, 2:26am

Oh, let me think about it more deeply. Thank you for answering so patiently.