Optimizing on an Identity Matrix and tracking the convergence

[image: the cost function from the paper]

Given the cost function in the picture above, I need to minimize it subject to norm(R,1) <= 1 and std(diag(R)) <= 0.1.

My initialization for the R matrix is an identity matrix, so after the optimization I expect to obtain a matrix that is as close as possible to the identity matrix.

D(X) is a difference operator given by D(X) = X - RXB.
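For reference, a minimal sketch of the Diff helper, assuming it implements the formula directly:

function D = Diff(X, R, B)
% Difference operator D(X) = X - R*X*B
% X: N-by-M data, R: N-by-N, B: M-by-M shift matrix
D = X - R*X*B;
end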

Here is the function I made:

function R=RLearning(R,alpha,gamma,kappa,X,L,B)
N=max(size(R));
[numnode,numhour]=size(X);
gamma=1e-3;
kappa=kappa;
% Obj_var=alpha*trace(transpose(X)*L*X)+gamma*(norm(R,'fro'))^2;
Obj_var=norm(Diff(X,R,B),'fro')^2+alpha*trace(transpose(Diff(X,R,B))*L*Diff(X,R,B))+norm(L,'fro')+gamma*(norm_nuc(R-eye(N)))^2;
cvx_precision high;

cvx_begin 

 variable R(numnode,numnode) diagonal
  % variable L(numnode,numnode)
  % variable X(numnode,numhour)
  % variable B(numhour,numhour)
 minimize(Obj_var)
 subject to 
    norm(R,1)<=1;
    %issymmetric(Z);
    std(diag(R))<=kappa;
    %trace(Z)==N;
    % R==eye(N);
cvx_solver_settings('dumpfile', 'cvx_debug.txt');
cvx_solver_settings('max_iters', 100); 
% dumpfile_contents = fileread('cvx_debug.txt');
% disp(dumpfile_contents);
cvx_end
disp('Optimization finished.');
disp(['Exit status: ' cvx_status]);
disp(['Objective function value: ' num2str(Obj_var)]);
disp(['Number of iterations: ' num2str(cvx_slvitr)]);
disp(['Convergence tolerance: ' num2str(cvx_slvtol)]);
difference_term = Diff(T, R, B);
frobenius_term = norm(difference_term, 'fro')^2;
trace_term = alpha * trace(transpose(difference_term) * L * difference_term);
nuclear_term = gamma * norm_nuc(R - eye(numnode))^2;

disp(['Frobenius norm term: ' num2str(frobenius_term)]);
disp(['Trace term: ' num2str(trace_term)]);
disp(['Nuclear norm term: ' num2str(nuclear_term)]);

scaled_R = 0.7 * eye(size(R));
scaling_factor = norm(scaled_R, 'fro') / norm(R, 'fro');
adjusted_R = scaling_factor * R;
disp(['Scaling Factor: ' num2str(scaling_factor)]);
disp('Adjusted R:');
disp(adjusted_R);
end

I tried several times, but the optimized R matrix I obtain has diagonal values of about 0.3 scaled by 1e-14, i.e., essentially zero. Please let me know where the issue is and how I can track it down and fix it.

These are the results that I am getting

Optimization finished.
Exit status: Solved
Objective function value: 235.7059
Number of iterations: 26
Convergence tolerance: 1.6851e-12
Frobenius norm term: 480264
Trace term: 1910.89
Nuclear norm term: 625

R

R =

1.0e-14 *

(1,1) 0.3144
(2,2) 0.3144
(3,3) 0.3144
(4,4) 0.3144
(5,5) 0.3144
(6,6) 0.3144
(7,7) 0.3144
(8,8) 0.3144
(9,9) 0.3144
(10,10) 0.3144
(11,11) 0.3144
(12,12) 0.3144
(13,13) 0.3144
(14,14) 0.3144
(15,15) 0.3144
(16,16) 0.3144
(17,17) 0.3144
(18,18) 0.3144
(19,19) 0.3144
(20,20) 0.3144
(21,21) 0.3144
(22,22) 0.3144
(23,23) 0.3144
(24,24) 0.3144
(25,25) 0.3144

I don’t see how your program would have run as written. Perhaps you are using CVX 3.0beta? If so, don’t; use CVX 2.2.

I don’t see how Obj_var even gets set within your CVX program, because it never calls RLearning. So I don’t know what’s going on. I don’t know where the value of Obj_var which your program displays comes from.

Also, don’t use cvx_precision high

Nevertheless, it appears that the solution is (essentially) the zero matrix. If that is not an acceptable solution, perhaps that indicates your model is not suited for your purpose and should be modified in a way for you to determine. But then again, you show a collection of code which doesn’t make sense, and you don’t show the CVX and solver output, so I have no idea what you’ve done. Maybe Obj_var is set as a double precision number prior to cvx_begin, and therefore the optimization problem is to minimize a constant, subject to constraints, and the solver produced zero as an optimal solution, because it is feasible, and therefore optimal for optimizing a constant objective.
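To make that failure mode concrete, here is a toy sketch (hypothetical dimensions and objective, not your actual model):

N = 5;
R0 = eye(N);                          % a pre-existing numeric R in the workspace

% WRONG: obj is an ordinary double computed before any variable exists,
% so CVX is asked to minimize a constant; any feasible R is then "optimal".
obj = norm(R0 - eye(N), 'fro');
cvx_begin
    variable R(N,N) diagonal
    minimize(obj)
    norm(R,1) <= 1;
cvx_end

% RIGHT: the objective is built from the declared CVX variable.
cvx_begin
    variable R(N,N) diagonal
    minimize( norm(R - eye(N), 'fro') )
    norm(R,1) <= 1;
cvx_end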

“I don’t see how Obj_var even gets set within your CVX program, because it never calls RLearning. So I don’t know what’s going on. I don’t know where the value of Obj_var which your program displays comes from.”

I have not completely understood that paragraph of yours. I am not using any recursion; I am showing a snippet of the code. Obj_var is a scalar term. Even if I use minimize(all the terms), I get the same results.
cvx_precision high was a suggestion from ChatGPT.

I am not a thorough expert of CVX.

Here is the output of CVX:

Calling SDPT3 4.0: 105 variables, 52 equality constraints

NOTE: custom settings have been set for this solver.
Saving output to: cvx_debug.txt.mat

num. of constraints = 52
dim. of socp var = 76, num. of socp blk = 26
dim. of linear var = 28
dim. of free var = 1 *** convert ublk to lblk


SDPT3: Infeasible path-following algorithms


version predcorr gam expon scale_data
NT 1 0.000 1 0
it pstep dstep pinfeas dinfeas gap prim-obj dual-obj cputime

0|0.000|0.000|1.7e+01|9.3e+01|3.4e+03| 0.000000e+00 0.000000e+00| 0:0:00| chol 1 1
1|0.510|0.946|8.5e+00|5.1e+00|6.4e+02| 0.000000e+00 -3.535035e+01| 0:0:00| chol 1 1
2|0.968|0.971|2.7e-01|1.5e-01|5.9e+01| 0.000000e+00 -4.159703e+01| 0:0:00| chol 1 1
3|1.000|0.991|3.4e-07|2.3e-03|3.1e+00| 0.000000e+00 -3.051053e+00| 0:0:00| chol 1 1
4|1.000|0.989|2.9e-07|1.2e-04|3.4e-02| 0.000000e+00 -3.351579e-02| 0:0:00| chol 1 1
5|1.000|0.988|7.7e-09|1.1e-05|4.1e-04| 0.000000e+00 -3.712334e-04| 0:0:00| chol 1 1
6|1.000|0.980|1.0e-10|3.0e-06|8.2e-06| 0.000000e+00 -4.368849e-06| 0:0:00| chol 1 1
7|1.000|0.989|7.7e-12|5.6e-08|1.4e-07| 0.000000e+00 -9.564096e-08| 0:0:00| chol 1 1
8|1.000|0.627|5.1e-14|5.0e-09|7.4e-08| 0.000000e+00 -5.760504e-08| 0:0:00| chol 1 1
9|1.000|0.638|1.5e-14|1.8e-09|3.9e-08| 0.000000e+00 -3.328155e-08| 0:0:00| chol 1 1
10|1.000|0.640|4.2e-15|6.9e-10|2.1e-08| 0.000000e+00 -1.863093e-08| 0:0:00| chol 1 1
11|1.000|0.641|1.2e-15|2.7e-10|1.1e-08| 0.000000e+00 -1.022397e-08| 0:0:00| chol 1 1
12|1.000|0.642|4.1e-16|1.1e-10|5.8e-09| 0.000000e+00 -5.539693e-09| 0:0:00| chol 1 1
13|1.000|0.642|3.7e-16|5.0e-11|3.1e-09| 0.000000e+00 -2.977163e-09| 0:0:00| chol 1 1
14|1.000|0.642|2.6e-16|2.4e-11|1.6e-09| 0.000000e+00 -1.591697e-09| 0:0:00| chol 1 1
15|1.000|0.642|1.5e-16|1.2e-11|8.7e-10| 0.000000e+00 -8.483282e-10| 0:0:00| chol 1 1
16|1.000|0.642|3.8e-16|6.6e-12|4.6e-10| 0.000000e+00 -4.514772e-10| 0:0:00| chol 1 1
17|1.000|0.642|3.0e-16|3.8e-12|2.5e-10| 0.000000e+00 -2.403233e-10| 0:0:00| chol 1 1
18|1.000|0.642|2.1e-16|2.5e-12|1.3e-10| 0.000000e+00 -1.282242e-10| 0:0:00| chol 1 1
19|1.000|0.642|3.1e-16|1.9e-12|7.4e-11| 0.000000e+00 -6.880024e-11| 0:0:00| chol 1 1
20|1.000|0.641|2.4e-16|1.7e-12|4.3e-11| 0.000000e+00 -3.732943e-11| 0:0:00| chol 1 1
21|1.000|0.640|7.2e-16|1.3e-12|2.5e-11| 0.000000e+00 -2.067102e-11| 0:0:00| chol 1 1
22|1.000|0.640|2.1e-16|8.7e-13|1.4e-11| 0.000000e+00 -1.166715e-11| 0:0:00| chol 1 1
23|1.000|0.639|1.7e-16|5.5e-13|8.5e-12| 0.000000e+00 -6.668652e-12| 0:0:00| chol 1 1
24|1.000|0.639|2.0e-16|3.3e-13|4.9e-12| 0.000000e+00 -3.842801e-12| 0:0:00| chol 1 1
25|1.000|0.639|1.8e-16|2.0e-13|2.9e-12| 0.000000e+00 -2.225930e-12| 0:0:00| chol 1 1
26|1.000|0.639|3.0e-16|1.2e-13|1.7e-12| 0.000000e+00 -1.293594e-12| 0:0:00|
stop: max(relative gap, infeasibilities) < 1.82e-12

number of iterations = 26
primal objective value = 0.00000000e+00
dual objective value = -1.29359438e-12
gap := trace(XZ) = 1.69e-12
relative gap = 1.69e-12
actual relative gap = 1.29e-12
rel. primal infeas (scaled problem) = 2.95e-16
rel. dual " " " = 1.19e-13
rel. primal infeas (unscaled problem) = 0.00e+00
rel. dual " " " = 0.00e+00
norm(X), norm(y), norm(Z) = 3.5e+00, 5.3e-12, 7.7e-12
norm(A), norm(b), norm(C) = 3.8e+01, 6.0e+00, 1.0e+00
Total CPU time (secs) = 0.09
CPU time per iteration = 0.00
termination code = 0
DIMACS: 8.9e-16 0.0e+00 1.2e-13 0.0e+00 1.3e-12 1.7e-12


Status: Solved
Optimal value (cvx_optval): +23.7559

Looking at your snippets, I have no idea what was run.

The cvx_optval of +23.7559 doesn’t match the Obj_var of 235.7059. Again, I have no idea what was actually run.

Did ChatGPT write your program? I suggest you forget everything ChatGPT suggested about how to write CVX code. Please read the CVX Users’ Guide. Then write code which follows the rules described there.

As for cvx_precision high, that is a capability documented in the CVX Users’ Guide. But the collective wisdom of the posters on this forum is that, although well-intentioned, it is a bad idea, no matter which solver is used.

Hi. ChatGPT did not write the code. The original code was developed by me; when I was not able to identify the issue, as a last resort I took the help of ChatGPT, and it suggested that I include cvx_precision high. I will try to identify the issue one more time and then I will get back.

I am not sure if it is feasible to put breakpoints inside the CVX block to observe the process.

Thank you so much for your response.

Also I have a couple of questions

  1. How do I track the minimization process between cvx_begin and cvx_end?
  2. How do I continuously track the variable R, which is a cvx real affine expression (25x25 matrix, 25 d.o.f.), as a double variable?
  3. What does relative gap mean here? CVX does 11 iterations, and this is the stop criterion:
    max(relative gap, infeasibilities) < 1.49e-08
    which I didn’t quite understand.

The ideal value of R is an identity matrix, so it has to be optimized to yield a matrix as close to the identity as possible; it is meaningless if it is all zeros. The worst-case scenario (in the ideal case) would be 0.1*eye(size_of_R).

As it solves the problem, the solver tries to drive the gap between the primal objective and dual objective to within tolerance of zero. If it succeeds, the solver will tell CVX via its termination code, and CVX will report the problem is solved.

If you’re not a solver expert, I recommend you focus on the final CVX status. If it says the problem is solved, don’t worry about the intermediate steps.
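On question 1: there is no supported way to watch the iterates from inside the cvx_begin/cvx_end block; CVX transforms the model and calls the solver once, so the solver’s own printout is all the per-iteration visibility there is. On question 2: after cvx_end, CVX overwrites each declared variable with its numeric (double) optimal value (assuming the problem was solved), so post-solve quantities can simply be recomputed there. A minimal sketch of that post-solve pattern (toy model):

N = 5;
cvx_begin
    variable R(N,N) diagonal
    minimize( norm(R - eye(N), 'fro') )
    norm(R,1) <= 1;
cvx_end
% R is now an ordinary double matrix (given cvx_status reports Solved),
% so any diagnostic of interest can be computed numerically here:
Rnorm1 = norm(R, 'fro');
disp(['norm(R,''fro'') at the solution: ' num2str(Rnorm1)]);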

As I wrote before, I have no idea what code you actually provided CVX, so I have no idea what CVX’s solution means, and whether it has any actual relationship to the problem you want to solve. For all I know, you may have provided CVX with the objective to minimize a constant, in which case, any feasible solution is considered to be optimal.

Here is my objective function:

norm(Diff(X,R,B),'fro')^2 + alpha*trace(transpose(Diff(X,R,B))*L*Diff(X,R,B)) + gamma*(norm(R,'fro'))^2

R is the variable through which the objective function has to be minimized.

Diff(X,R,B) is a function that I made that computes the difference based on the formula D(X)=X-RXB.

X is my input data, N-by-M (N not equal to M).
B is my shift identity matrix, with dimensions M-by-M (a possible construction is sketched below).
R is approximately an identity matrix, N-by-N.
L is a Laplacian matrix, N-by-N.

The last term is a regularization term on the R matrix.

I don’t think there are any terms independent of the variable R.
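For concreteness, one plausible construction of these inputs; this is a guess, reading “shift identity matrix” as a one-step cyclic shift of the identity, with placeholder data:

M = 24; N = 25;                   % hypothetical sizes
B = circshift(eye(M), 1, 2);      % one-step cyclic column shift of the identity
X = randn(N, M);                  % placeholder data
D = X - eye(N)*X*B;               % D(X) evaluated at R = identity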

You haven’t shown a complete program. That may be what you think the objective function is, but I’m not sure that it is. If your supposed objective function does not appear between cvx_begin and cvx_end, it is not the objective function in the problem solved by CVX.

Have you actually carefully read the CVX Users’ Guide yet? Please do so. I recommend you try some of the examples in it, then modify them to make things a little more complicated and make sure things still work as expected.

BTW, norm(Diff(X,R,B),'fro')^2 will trigger an error in any version of CVX except for 3.0beta, which I told you not to use (it is riddled with bugs which will never be fixed). In CVX 2.2, you need to use square_pos(norm(....)), where .... is the argument of norm.
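A toy illustration of that rule (hypothetical data):

A = randn(5,5);                                  % hypothetical data
cvx_begin
    variable Z(5,5)
    % minimize( norm(A - Z,'fro')^2 )            % rejected in CVX 2.2: {convex}^2 violates the DCP ruleset
    minimize( square_pos(norm(A - Z, 'fro')) )   % accepted: square_pos of a convex nonnegative argument
cvx_end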

Here is my code:

function [R,Rnorm1,trtrm1]=RLearning(R,alpha,gamma,kappa,X,L,B)
N=max(size(R));
[numnode,numhour]=size(X);
gamma=1e-3;
kappa=kappa;
cvx_begin

D=Diff(X,R,B);
% Obj_var=alpha*trace(transpose(X)*L*X)+gamma*(norm(R,'fro'))^2;
Obj_var=square_pos(norm(D,'fro'))+alpha*trace(transpose(D)*L*D)+gamma*(square_pos(norm(R,'fro')));

variable R(numnode,numnode)

% variable L(numnode,numnode)
% variable X(numnode,numhour)
% variable B(numhour,numhour)
minimize square_pos(norm(Diff(X,R,B),'fro'))+alpha*trace(transpose(Diff(X,R,B))*L*(Diff(X,R,B)))+gamma*(square_pos(norm(R,'fro')))
% keyboard;
subject to
norm(R,1)<=1;
%issymmetric(Z);
std(diag(R))<=kappa;
% norm(R-eye(numnode),'fro')/norm(eye(numnode),1)<=0.25;
%trace(Z)==N;
% R==eye(N);
Rnorm1=norm(R,'fro');
trtrm1=trace(transpose(D)*L*D);

cvx_end
% Rnorm1=norm(R,'fro');
% trtrm1=trace(transpose(D)*L*D);
% disp(R);
% disp(kappa);

end

I am trying to use minimize square_pos(norm(Diff(X,R,B),'fro'))+alpha*trace(transpose(Diff(X,R,B))*L*(Diff(X,R,B)))+gamma*(square_pos(norm(R,'fro')))

Cost function: square_pos(norm(Diff(X,R,B),'fro'))+alpha*trace(transpose(Diff(X,R,B))*L*(Diff(X,R,B)))+gamma*(square_pos(norm(R,'fro')))

When I call this function in my main program, I get an error saying that the objective function must be a scalar.
Hence I declared a variable inside cvx_begin called Obj_var, like this:

Obj_var=square_pos(norm(D,'fro'))+alpha*trace(transpose(D)*L*D)+gamma*(square_pos(norm(R,'fro')))

alpha and gamma are regularization parameters.

I am trying to do it like this because the paper that I am implementing does this:

[image: problem (P1) from the paper]

D(X)=X-RXB

So except for the Frobenius norm of L, all the other terms depend on R.

Why do you show RLearning? It is never used.

The statement D=Diff(X,R,B);, even though it appears after cvx_begin, has nothing to do with CVX, and is executed using whatever value the MATLAB variable R had already been set to. Similarly, the statement Obj_var=square_pos(norm(D,'fro'))+alpha*trace(transpose(D)*L*D)+gamma*(square_pos(norm(R,'fro'))); has nothing to do with CVX, because it does not involve any CVX variables or expressions.

When R is later declared as a variable, that variable R has NO relation to any previous R, and supersedes it. You can't initialize CVX variables; CVX and the solver figure out what their optimal values are.
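To illustrate the required ordering, here is a sketch with the expression built after the declaration (placeholder sizes and data, not your actual inputs; it also assumes L is symmetric positive semidefinite, as a graph Laplacian is, so the trace term can be written in the DCP-safe form trace(D'*L*D) = norm(sqrtm(L)*D,'fro')^2):

numnode = 25; numhour = 24;                 % hypothetical sizes
X = randn(numnode, numhour);                % placeholder data
B = circshift(eye(numhour), 1, 2);          % placeholder shift matrix
G = randn(numnode); L = G*G';               % placeholder PSD stand-in for the Laplacian
alpha = 1; gamma = 1e-3;
Lhalf = sqrtm(L);                           % real symmetric square root, since L is PSD

cvx_begin
    variable R(numnode,numnode) diagonal
    D = X - R*X*B;                          % affine CVX expression: built AFTER declaring R
    minimize( square_pos(norm(D,'fro')) ...
            + alpha*square_pos(norm(Lhalf*D,'fro')) ...
            + gamma*square_pos(norm(R,'fro')) )
    subject to
        norm(R,1) <= 1;
        % std(diag(R)) <= kappa;            % the intended spread constraint; whether CVX
                                            % accepts std on expressions is unverified here
cvx_end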

As for your use of Diff, I have no idea how that's supposed to relate to the problem (P1). Is that what the script D is? But what about D(X)=X-RXB? BTW, (P1) is written in a strange way, listing Q = ... after s.t., even though it is not a constraint. Fortunately, you seem to have handled that correctly.

Problem (P1) appears to have L, Omega, and R as variables, yet R is your only declared CVX variable. When you add L as a variable, you also need to include trace(L) == N in your program.

Bottom line: I have no idea what (P1) means; it looks very strange. If all of L, Omega, and R really are CVX (optimization) variables, it appears not to be convex, unless the true meaning is different from what it looks like to me.

So I'd say at this point, you need to understand the problem, including which quantities are optimization variables vs. input data. And to use CVX, that optimization problem needs to be convex. Figuring that out is your homework assignment.

Before embarking on that assignment, please carefully read, and take to heart, the CVX Users' Guide.