TFOCS for mixed trace and $\ell_1$-norm minimization


I would like to solve
\min_{X\succcurlyeq 0}& &\mathsf{trace}\left(X\right)+\lambda \lVert X \rVert_1\
\text{subject to} & &\lVert \mathcal{A}\left(X\right)-b\rVert_2 \leq \varepsilon
numerically, where \lambda\geq0, \varepsilon\geq 0, b, and the linear operator \mathcal{A} are given. Also \lVert X\rVert_1 denotes the \ell_1-norm of X. I wonder if it’s possible to implement this program in TFOCS.

I tried to apply tfunc_sum to get a mixture of prox_trace and prox_l1 as the objective function, but that doesn’t seem to work. (the error message was saying there was too many input arguments when the objective was called in one of the subroutines.)

I’d appreciate it if you can tell me if and how I can implement the optimization above in TFOCS.

(Stephen Becker) #2

tfunc_sum creates the right object, but it’s not usable because you’d have to know the proximity operator of two functions, which is not generally known. So the standard approach would be to pick one of those functions and dualize it, since then we can handle as many of these terms as we like. In that case, no need to use tfunc_sum, just add it to the cell array of dual variables.

Note that if it was just the l1 norm and the trace, then because the trace is linear, we could do the prox in closed form. However, the PSD constraint interferes.

To see how to work with this, look at my RPCA demo as a starting point. You can keep either trace (and PSD) as primal, and dualize the l1 norm (so use proj_l1), or vice-versa (so keep prox_l1 and use proj_spectral(1,'symm') ).


Thanks. It seems to be working, although I guess you meant proj_linf rather than proj_l1 to dualize the \ell_1-term. I have also added an affine constraint with identity operator corresponding to the extra dual variable.


@Stephen_Becker I have a follow up question. Suppose that we keep the TFOCS’ options (e.g., tolerance, max iterations, etc) and the smoothing parameter \mu the same. Is it reasonable to expect that the solution to the above convex program, as solved by TFOCS, tends to the solution of the convex program

\begin{align*} \min_{X} \ \lVert X \rVert_1 \tag{P$_\infty$}\\ \text{subject to } & \lVert \mathcal{A}\left(X\right)-b\rVert_2 \leq \varepsilon, \end{align*}

when \lambda tends to \infty?

I think that is the ideal behaviour, but it seems that for any \lambda<\infty, TFOCS’ solution is significantly different from that of \mathrm{(P_\infty)}.

UPDATE: I have to clarify that I encountered the issue described above when I keep the trace norm (i.e., prox_trace) in the objective and dualize the \ell_1-norm. When \ell_1-norm is in the objective and I dualize the trace norm via proj_maxEig everything seems reasonable. Is there something special about having objective functions that operate on PSD matrices?