Change norm(abs(x) - b, 2) problem into DCP compliant problem using constraints

Hi all,

I currently have a minimization problem I am trying to solve that involves the norm of an absolute value function (see below), where both Z and alpha are fixed and have non-negative entries.

min_\beta ||y - X\beta||^2_2 + \lambda|| |\beta| - Z\alpha||^2_2

I know that directly plugging this in will not be DCP compliant as the norm expects an affine function. My thinking at this point is that I need to rewrite the optimization using appropriate constraints, but I am not sure of the right constraints and have only considered one basic (but wrong) form of the problem that CVX at least accepts shown below.

%set seed
rng(123)

% sample size, number of 1st/2nd level vars
n = 100;
p = 30;
q = 2;

% external data
z1 = [ones(5,1); zeros(p-5, 1)];
z2 = [zeros(5,1); ones(10, 1); zeros(p - 15, 1)];
Z = [z1, z2];

% 1st level coef
b0 = 0.1;
b = [normrnd(0.2, 0.1, [5,1]); normrnd(0.5, 0.1, [5,1]);
    normrnd(-0.5, 0.1, [5,1]); normrnd(0, 0.1, [p - 15,1])]; 

% simulate predictors and outcome
mu_x = zeros(p, 1);
sigma_x = diag(ones(p, 1));
X = mvnrnd(mu_x, sigma_x, n);
y = b0 + X * b + normrnd(0, 1, [100, 1]);

%{
    Problem we are interested in solving
    minimize_betas (1/2n)||y - X * betas||^2 + (lambda/2)||abs(betas) - Z * alphas||^2
    Z and alphas have non-negative entries, lambda > 0
%}

% fix alphas / lambda
alphas = [0.1; 0.3];
lambda = 0.1;

% fit in cvx (need to figure out how to represent abs(betas))
cvx_begin
    variables betas(p) betas_abs(p);
    minimize(square_pos(norm(y - X*betas)) / (2*n) + lambda * square_pos(norm(betas_abs - Z * alphas)) / 2);
    subject to
        abs(betas) <= betas_abs;
cvx_end

[betas, betas_abs]

% are we getting the right coef. (do all abs(betas) == betas_abs) --> no
disp(sum(round(abs(betas), 4) == round(betas_abs, 4)) == p);

Thanks for any assistance you might be able to provide!

I think you can do this by using the MIDCP capability of CVX and declaring a binary variable beta_binary(p), and adding suitable constraints, as shown in the MOSEK Modeling Cookbook version 3.0, section 9.1.6 “Exact absolute value” https://docs.mosek.com/MOSEKModelingCookbook-letter.pdf .

To do this, you will need to use an upper bound on the absolute value of the beta components, which you should not make larger than necessary. If the optimal solution hits a bound, you will need to re-do.

You will need a version of CVX capable of handling binary variables (MIDCP), and the problem should be formulated by CVX and submitted to the solver (you will need MOSEK or Gurobi) as an MISOCP.

Thanks Mark! The following code snippet with the MOSEK solver worked for me, where I used M > 0 to define the upper bound.

% fit in cvx using mosek solver
cvx_begin
    variables betas(p) betas_abs(p) betas_pos(p) betas_neg(p);
    variable betas_binary(p) binary;
    minimize(square_pos(norm(y - X * betas)) / (2 * n) + lambda * square_pos(norm(betas_abs - Z * alphas)) / 2);
    subject to
        betas == betas_pos - betas_neg;
        betas_abs == betas_pos + betas_neg;
        betas_pos >= 0;
        betas_neg >= 0;
        betas_pos <= M * betas_binary;
        betas_neg <= M * (1 - betas_binary);
cvx_end
1 Like