How to use Schur complement correctly?

(YL) #1

In my SDP problem, a constraint is Y-xx’>=0; where Y is 2x2 symmetric matrix and x is 2x1 vector.

Since CVX does not support to directly write the constraint, I use Schur complement and write the constraint equivalently as [Y,x’;x,1]>=0.

But when I run the program I found that the CVX give the solutions sometimes Y-xx’ does not be a semi-definite matrix. That means , sometimes min(eig(Y-xx’))<0, why this happened? Or how can I specify the Y-xx’>=0 (semidefinite positive) constraint in CVX code?

Could someone give me some advice? The whole code block is listed below, thank you very much!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%our method
cvx_begin sdp quiet
variable t(1,1)
variable Y(2,2) semidefinite
variable x(2,1)
variable K(M,M) semidefinite
variable k(M,1)

minimize(t)
for i=1:M
trace(Y)-2z_hat(:,i)'x+z_hat(:,i)'z_hat(:,i)+2Xik(i)+Xi^2<=tbeta(i)^2;
end

for i=1:M
[ trace(Y)-2*z_hat(:,i)'x+z_hat(:,i)'z_hat(:,i)-2Xik(i)+Xi^2,beta(i);beta(i),t]>=0;
end

for i=1:M
K(i,i)==[1;-z_hat(:,i)]’[trace(Y),x’;x,eye(2)][1;-z_hat(:,i)];
end

for i=1:M
for j=1:M
if i~=j
K(i,j)>=[1;-z_hat(:,i)]’[trace(Y),x’;x,eye(2)][1;-z_hat(:,j)];
end
end
end

[K,k;k’,1]>=0;
[Y,x;x’,1]>=0; %This constraint seems does not work??
cvx_end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

(Mark L. Stone) #2

What is the value of min(eig([Y,x;x’,1])) after CVX completes? If it is -1e-8, or a smaller magnitude negative number, that just reflects solver tolerance. If you need the solution to be strictly positive definite, you can instead use the constraint
[Y,x;x’,1] >= small_number*eye(3)
where small_number is a small positive number, such as 1e-6.

You should not use quiet mode until you have determined your program is working correctly. You have not provided a reproducible problem (haven’t provided input data), and you have not shown the solver and CVX output (which wouldl be displayed if quiet option weren’t used). In fact, we don;t even know whether the solver.CVX claim to have found the optimal solution.

Also note that the Schur complement in your text is written incorrectly from a dimensionality (conformal) perspective, but is correct in your code.

(YL) #3

Thank you very much for your kind help. The advises are very illuminating to my work.

Now I used small number 1e-6 plus eye(3) to replace 0. And sometimes the violation still happen. And if I change the small number to 1e-4 then the problem disappeared. Does the threshold change with the problem and need to be adjusted case by case?

I have wrote a test program and code is listed below. At the end of the test program I used min(eig(sa)) to check whether sa=Y-x*x’ is semi definite positive. Here eig(sa) is the matlab function which return the eigenvalues of the matrix sa. The result is negative, so that sa is not a semi-definite positive matrix and the schur complement equivalent constraint is violated.

The input of the problem are generated randomly to perform Monte carlo simulation. In the test program, beta, z_hat ,and L are computed from some random generated scenario parameters. For some scenarios this violation appears while others not. The code just show a violation case. If the smallnumber parameter is large enough (1e-4) then the violation will never happen.

The result solution x and Y are used for a rounding procedure to produce some solution that is feasible with the constraint Y=x*x' and L=l*l'( rank one constraint which is not convex). In the rounding procedure I used Y-x*x' as a covariance matrix and use x as the mean. That means I generated some random vectors \xi_x(t)\sim N(x,Y-x*x') and used them to construct some feasible solutions to the problem with the nonconvex rank-one constraints. So that Y-x*x’ must be PSD.

May I trouble you for checking this test program and give me some advice ? The solver is SDPT3 4.0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

clear;
clc;

M=4;
smallnumber=1e-6;

z_hat=[349.049118988982,273.191449032727,37.1430167833719,28.2125744474959;343.322423198038,121.879112283012,324.779715601743,112.393700815180];
L=[109.006172583561,90.4708901710594,115.813983123910,114.041111701073];
Xi=5;
beta=[199.620782097572;48.1241426764543;336.614462093126;293.790032116872];

cvx_begin sdp
variable t(1,1)
variable Y(2,2) semidefinite
variable x(2,1)
variable K(M,M) semidefinite
variable k(M,1)

minimize(t)
for i=1:M
    trace(Y)-2*z_hat(:,i)'*x+z_hat(:,i)'*z_hat(:,i)+2*Xi*k(i)+Xi^2<=t*beta(i)^2;
end

for i=1:M
    [ trace(Y)-2*z_hat(:,i)'*x+z_hat(:,i)'*z_hat(:,i)-2*Xi*k(i)+Xi^2,beta(i);beta(i),t]>=0;
end

for i=1:M
    K(i,i)==[1;-z_hat(:,i)]'*[trace(Y),x';x,eye(2)]*[1;-z_hat(:,i)];
end

for i=1:M
    for j=1:M
        if i~=j
            K(i,j)>=[1;-z_hat(:,i)]'*[trace(Y),x';x,eye(2)]*[1;-z_hat(:,j)];
        end
    end
end

[K,k;k',1]>=smallnumber*eye(M+1);
[Y,x;x',1]>=smallnumber*eye(3);

cvx_end
sa1=Y-x*x';
min(eig(sa1))

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The output of this test program is

Calling SDPT3 4.0: 66 variables, 20 equality constraints
For improved efficiency, SDPT3 is solving the dual problem.

num. of constraints = 20
dim. of sdp var = 22, num. of sdp blk = 8
dim. of linear var = 16
dim. of free var = 4 *** convert ublk to lblk


SDPT3: Infeasible path-following algorithms


version predcorr gam expon scale_data
HKM 1 0.000 1 0
it pstep dstep pinfeas dinfeas gap prim-obj dual-obj cputime

0|0.000|0.000|1.2e+06|6.0e+00|2.1e+08|-8.375836e+06 0.000000e+00| 0:0:00| chol 1 1
1|0.001|0.419|1.2e+06|3.5e+00|8.8e+07|-8.347404e+06 -5.022124e+00| 0:0:00| chol 1 1
2|0.394|0.634|7.3e+05|1.3e+00|5.2e+07|-2.506617e+06 -7.763444e+00| 0:0:00| chol 1 1
3|0.766|0.434|1.7e+05|7.2e-01|3.0e+07| 3.013552e+06 -9.487711e+00| 0:0:00| chol 1 1
4|0.795|0.919|3.5e+04|5.8e-02|5.7e+06| 2.730554e+06 -1.438932e+01| 0:0:00| chol 1 1
5|0.739|0.537|9.1e+03|2.7e-02|2.3e+06| 1.331237e+06 -2.048347e+01| 0:0:00| chol 1 1
6|0.759|0.334|2.2e+03|1.8e-02|7.3e+05| 3.823449e+05 -2.477292e+01| 0:0:00| chol 1 1
7|0.621|0.614|8.4e+02|6.9e-03|3.2e+05| 1.893795e+05 -3.795200e+01| 0:0:00| chol 1 1
8|0.839|0.798|1.3e+02|1.4e-03|7.0e+04| 4.627635e+04 -5.433469e+01| 0:0:00| chol 1 1
9|0.740|0.331|3.5e+01|9.4e-04|2.4e+04| 1.720520e+04 -6.818425e+01| 0:0:00| chol 1 1
10|0.975|0.952|8.7e-01|4.5e-05|7.4e+02| 5.142954e+02 -7.290390e+01| 0:0:00| chol 1 2
11|0.970|0.978|2.6e-02|1.0e-06|5.1e+01| 1.765441e+01 -3.142451e+01| 0:0:00| chol 2 2
12|1.000|0.094|6.6e-07|5.3e-06|3.6e+01| 3.435643e+00 -2.870878e+01| 0:0:00| chol 2 2
13|1.000|0.726|6.7e-07|1.6e-06|1.4e+01| 2.934840e+00 -9.748479e+00| 0:0:00| chol 2 2
14|0.899|0.827|2.3e-07|4.1e-07|2.4e+00|-1.318272e-01 -2.317013e+00| 0:0:00| chol 2 2
15|1.000|0.110|9.5e-09|4.1e-07|2.0e+00|-3.670402e-01 -2.197370e+00| 0:0:00| chol 2 3
16|1.000|0.499|4.3e-06|2.1e-07|9.8e-01|-7.351979e-01 -1.648034e+00| 0:0:00| chol 2 2
17|1.000|0.442|1.8e-06|1.2e-07|5.3e-01|-8.657176e-01 -1.361772e+00| 0:0:00| chol 2 2
18|1.000|0.595|4.8e-07|5.2e-08|2.0e-01|-9.396068e-01 -1.127610e+00| 0:0:00| chol 2 2
19|1.000|0.726|6.6e-08|2.1e-08|5.3e-02|-9.594094e-01 -1.007562e+00| 0:0:00| chol 2 2
20|1.000|0.267|1.1e-08|2.5e-08|4.1e-02|-9.642802e-01 -9.969930e-01| 0:0:00| chol 2 2
21|1.000|0.618|9.0e-09|1.2e-08|1.6e-02|-9.683979e-01 -9.802903e-01| 0:0:00| chol 2 2
22|1.000|0.777|6.2e-09|3.5e-09|3.6e-03|-9.704330e-01 -9.727611e-01| 0:0:00| chol 2 2
23|0.986|0.865|1.8e-09|5.9e-10|4.8e-04|-9.707177e-01 -9.709803e-01| 0:0:00|# chol 2 2
24|0.978|0.924|5.7e-08|5.4e-11|2.9e-05|-9.786030e-01 -9.707748e-01| 0:0:00|# chol 2 2
stop: progress in duality gap has deteriorated, 3.1e-05
25|0.009|0.008|5.7e-08|5.4e-11|2.9e-05|-9.786030e-01 -9.707748e-01| 0:0:00|

number of iterations = 25
primal objective value = -9.78602968e-01
dual objective value = -9.70774785e-01
gap := trace(XZ) = 2.92e-05
relative gap = 9.90e-06
actual relative gap = -2.65e-03
rel. primal infeas (scaled problem) = 5.74e-08
rel. dual " " " = 5.39e-11
rel. primal infeas (unscaled problem) = 0.00e+00
rel. dual " " " = 0.00e+00
norm(X), norm(y), norm(Z) = 4.0e+00, 2.3e+05, 4.3e+05
norm(A), norm(b), norm© = 1.5e+05, 2.0e+00, 6.2e+05
Total CPU time (secs) = 0.17
CPU time per iteration = 0.01
termination code = -8
DIMACS: 5.7e-08 0.0e+00 1.4e-10 0.0e+00 -2.7e-03 9.9e-06


Status: Inaccurate/Solved
Optimal value (cvx_optval): +0.970775

ans =

-0.1969

(Mark L. Stone) #4

Your problem is numerically difficult, and is not reliably solved by any of the solvers. Your run with SDPT3 results in a lack of complete success for the solver, and Inaccurate/solved assessment reported by CVX. Therefore, it is not surprising that there is such a large constraint violation as occurred.

I haven’t look closely enough at your problem to determine why it is numerically difficult, but the reported (sort of) optimal Y has elements in the 1e4 or 1e5 magnitude, which is causing numerical non-goodness. The zhat and beta elements are somewhat large in magnitude. It is worth re-scaling these to be closer to one and see if that helps. You should strive to get the non-zero elements of the optimal Y to be much closer to one in magnitude. But I don;t know that the scaling is the only, or even, major, difficulty.

I believe in part because the elements in Y are large, the minimum eigenvalue of Y-x*x' has a significantly different magnitude than that of [Y x;x'1]. So a very small magnitude negative min eigenvalue of [Y x;x'1] can, and in your case does, correspond to a much larger magnitude negative min eigenvalue of Y-x*x'. The solver only “knows” your constraint [Y x;x'1] >= smallnumber*eye(3), and therefore works to satisfy that within tolerance (but maybe doesn’t achieve that with your problem). And then the “violation” in terms of min eigenvalue of Y-x*x' can be much larger.

(YL) #5

Many thanks.

Yes it works when the rescaling advice was followed. All scenario parameters were scaling down at the same proportion and the problem resolved. For every round the CVX gave a Solved assessment report and the PSD constraint violation does not appear any more.

There is still some confusion for me . At first , for a special optimization problem, how to define and evaluate “numerically difficult” property? In many textbooks about convex optimization the authors tell us that once a problem is formulated as convex form, then it is relatively straightforward to solve it. But here looks like people still need to pay attention about the problem element’s magnitude.Can’t imagine how long I will struggle with this point without your expert advice.

Could you please give us some references title, or alternatively, just some keywords (We can google them ) , so that we can further learn how to evaluate the numerical property of the problem to be solved. Does there exist mature and complete theory of this topic? My optimization problem here is standard, the object function is simple and linear, the constraints are either affine or linear or convex. Each term in the polynomial of constraints seems diverse in the same order of magnitude. I also think the difficulty is not because of the machine word-length limit. Sorry for not reviewing the source code of sdpt3 solver because of the limit of my knowledge background. In the literature of the interior point algorithm I also did not find the corresponding description. If I switched to some commercial solvers, would the situation improve?

Secondly, the problem is listed below. The objective function is k but I really care about the variable x. Then does there exit any theoretical analysis about the quantitative impact of different parameters to x. Do I need to read some literature about the perturbation analysis of convex optimization? what direction to go?

\begin{split} &x_P=\arg\min_{x,k,l,Y,L}k \\&s.t. \\&\mathrm{tr}(Y)+2x^T\hat{z_i}+\hat{z_i}^T\hat{z_i}+2\zeta l_i+\zeta^2 \leq k\beta_i^2 \\&\begin{pmatrix} \mathrm{tr}(Y)+2x^T\hat{z_i}+\hat{z_i}^T\hat{z_i}-2\zeta l_i+\zeta^2 & \beta_i\\ \beta_i &k \end{pmatrix} \succeq 0 \\ &L(i,i)=\mathrm{tr}(Y)-2x^T\hat{z_i}+\hat{z_i}^T\hat{z_i} \\ &L(i,j)\geq 0 \\&\begin{pmatrix}Y&x \\ x^T& 1\end{pmatrix}\succeq 0 \\&\begin{pmatrix}L&l \\ l^T& 1\end{pmatrix}\succeq 0 \\&k\geq 0 \\&i,j=1,\ldots,M \end{split}

Another question is about the ‘ill-conditioned’ description in some literature. What is the generally description of an ‘ill-conditioned’ problem even if the problem satisfy all the rules of DCP? How can a CVX user to judge wheither or not his problem is ‘ill-conditioned’ before he put his problem into CVX?

Finally, why should the parameters be re-scaling to be closer to one. I just re-scale the original scenario parameter and all others are scaling down with equal ratio. Maybe they are far less than one. Do I need to deliberately scale one of them to close to one and that will improve the final output quality?

Please excuse for my bothering you with so many questions. Thank you again .

(Mark L. Stone) #6

Start out by reading a general purpose numerical analysis book, such as Nicholas J. Highan " Accuracy and Stability of Numerical Algorithms" https://epubs.siam.org/doi/book/10.1137/1.9780898718027?mobileUi=0

I’m not sure there is a good general modern reference for numerical issues in optimization. When you get up to speed and become an expert, maybe you can write one.

(YL) #7

Thank you very much.