How an underdetermined system of linear equations are solved by CVX?

An underdetermined system of linear equations can have many solutions as any set of linearly independent vectors that span the solution space of the system forms a new solution. But solving such systems in CVX, gives a unique solution (which is not sparse) all the time when the objective function is ||\mathbf{Ax}-\mathbf{b}||_2 to be minimized (\mathbf{A} \in \mathcal{C}^{m\times n}, m\lt n). Which algorithm is used in CVX to deal with the underdetermined system of equations? Are extra constraints (like \ell_1 norm constraints) added by CVX?


This is a very interesting question! But in fact, CVX simply passes the linear system in question to the chosen solver. So the answer depends on the solver begin used.

As you have observed, with SDPT3 and SeDuMi, the solution does indeed seem to be the minimum-norm solution. This is very likely a consequence of their use of an interior-point method. In fact, those solvers actually add inequality constraints, by converting each free variable x_i into the difference of nonnegative variables. They then solve the resulting feasibility problem

\begin{array}{ll} \text{find} & x_+, x_-\in\mathbb{R}^n \\ \text{subject to} & A(x_+-x_-) = b \\ & x_+,x_- \geq 0 \end{array}

using their standard interior-point approach. Now in fact, they both happen to arrive at the same value of x_+-x_-, but the precise values of x_+ and x_- they converge to are different.

However, this is not the case for Gurobi or MOSEK. The employ a two-stage interior-point/simplex approach, and they both yield the same result—but a different one than is obtained by SDPT3 or SeDuMi. In my tests, they produce a solution with only the first m elements of x nonzero.

So the bottom line is that you should not rely on the minimum-norm behavior, because not all solvers do that in this case. If you want a particular solution to the undetermined system, you will have to provide your merit function as the objective to your model.

I am stuck at a similar problem in cvx to solve for Laplacain matrix L in
maximize {trace(SX) - log det X + lambda||X||_1}

where S = covariance matrix
X = precision matrix
||X||_1 = 1-norm of X defined as: norm(L,1) -(1/variance)*N)
N = no. of points in the data matrix

The problem is to be solved using ADMM algorithm in cvx.
I used the following code:
[X, history] = covsel(Covariance_matrix, lambda, 1, 1);

K = length(history.objval);
X_admm = X;

h = figure;
plot(1:K, history.objval, ‘k’, ‘MarkerSize’, 10, ‘LineWidth’, 2);
ylabel(‘f(x^k) + g(z^k)’); xlabel(‘iter (k)’);

g = figure;
semilogy(1:K, max(1e-8, history.r_norm), ‘k’, …
1:K, history.eps_pri, ‘k–’, ‘LineWidth’, 2);

semilogy(1:K, max(1e-8, history.s_norm), ‘k’, …
1:K, history.eps_dual, ‘k–’, ‘LineWidth’, 2);
ylabel(’||s||_2’); xlabel(‘iter (k)’);

function [Z, history] = covsel(Covariance_matrix, lambda, rho, alpha)
% covsel Sparse inverse covariance selection via ADMM
% [X, history] = covsel(D, lambda, rho, alpha)
% Solves the following problem via ADMM:
% minimize trace(SX) - log det X + lambda||X||_1
% with variable X, where S is the empirical covariance of the data
% matrix D (training observations by features).
% The solution is returned in the matrix X.
% history is a structure that contains the objective value, the primal and
% dual residual norms, and the tolerances for the primal and dual residual
% norms at each iteration.
% rho is the augmented Lagrangian parameter.
% alpha is the over-relaxation parameter (typical values for alpha are
% between 1.0 and 1.8).

t_start = tic;
% Global constants and defaults
QUIET = 0;
MAX_ITER = 1000;
ABSTOL = 1e-4;
RELTOL = 1e-2;

S = Covariance_matrix;
N = size(S,1);
X = Precision_matrix;
% ADMM solver
X = zeros(N);
Z = zeros(N);
U = zeros(N);

fprintf(’%3s\t%10s\t%10s\t%10s\t%10s\t%10s\n’, ‘iter’, …
‘r norm’, ‘eps pri’, ‘s norm’, ‘eps dual’, ‘objective’);

for k = 1:MAX_ITER

% x-update
[Q,L] = eig(rho*(Z - U) - S);
es = diag(L);
xi = (es + sqrt(es.^2 + 4*rho))./(2*rho);
X = Q*diag(xi)*Q';

% z-update with relaxation
Z_old = Z;
X_hat = alpha*X + (1 - alpha)*Z_old;
Z = shrinkage(X_hat + U, lambda/rho);

U = U + (X_hat - Z);

% diagnostics, reporting, termination checks

history.objval(k)  = objective( X, S, Z, lambda);

history.r_norm(k)  = norm(X - Z, 'fro');
history.s_norm(k)  = norm(-rho*(Z - Z_old),'fro');

history.eps_pri(k) = sqrt(N*N)*ABSTOL + RELTOL*max(norm(X,'fro'), norm(Z,'fro'));
history.eps_dual(k)= sqrt(N*N)*ABSTOL + RELTOL*norm(rho*U,'fro');

    fprintf('%3d\t%10.4f\t%10.4f\t%10.4f\t%10.4f\t%10.2f\n', k, ...
        history.r_norm(k), history.eps_pri(k), ...
        history.s_norm(k), history.eps_dual(k), history.objval(k));

if (history.r_norm(k) < history.eps_pri(k) && ...
   history.s_norm(k) < history.eps_dual(k))



function obj = objective(X, Covariance_matrix, Z, lambda)
obj = log(det(X)) - trace(Covariance_matrixX) - lambdanorm(Z(:), 1);

function y = shrinkage(a, kappa)
y = max(0, a-kappa) - max(0, -a-kappa);

Use this from :

But there is an error in : history.objval(k) = objective( X, S, Z, lambda)
and in [X, history] = covsel(Covariance_matrix, lambda, 1, 1) ,
stated below:
“Unable to perform assignment because the indices on the left side are not compatible with the size of the right side”.

Any suggestions as to where the mistake in the code is ?

That solution by ADMM doesn’t have anything to do with CVX, and therefore its off topic for this forum.

If you wanted to solve it using CVX, it would be on topic for this forum.

Instead of using
function obj = objective(X, Covariance_matrix, Z, lambda)
obj = log(det(X)) - trace(Covariance_matrixX) - lambdanorm(Z(:), 1);
if I use
cvx_begin quiet
cvx_precision low
variable L
norm(L,1) -lambda*(1/variance)*N));

where lambda = .1:.22:2

still the problem does not get solved for L

What is wrong with this code:

cvx_begin quiet
cvx_precision low
variable L
(lambda*norm(L,1) -lambda*(1/variance)*N); 


I don’t think
- (lambda*norm(L,1) -lambda*(1/variance)*N)
winds up being part of the objective function.because it is not within the parentheses for maximize.

doesn’t even involve CVX variables or expressions, the way you’ve written the program. And Precision_matrix s dangling by itself (misplaced parentheses?) without trace or log_det being applied to it, which doesn’t make sense in an objective function. Is Precision_matrix supposed to be a CVX (optimization) variable? And if so, how does it relate to L? What is the relation (inverse?) between Covariance_matrix and Precision_matrix? You need to be very clear what is input data and what are the optimization variables and their relations. And for whatever you decide on that, it better be a convex optimization problem with respect to the optimization variables.