Using TFOCS for Scaled Basis Pursuit Denoising

Yes, TFOCS can handle this. You have two options:

  • Include the f(x)=\|Ax-b\|_2^2 term in the smooth part. You express worry that tfocs_SCD.m cannot handle gradients with an affine term. It’s true that we don’t include a specific method for handling affine terms in general, but that doesn’t mean you can’t do it (unlike the projections, there is no benefit to us handling it explicitly). So you can either

– write your own function that computes \nabla f(x), which is very easy since it’s just \nabla f(x) = 2A^T( Ax-b), or…

– use our pre-built function smooth_quad.m, which does this for you (and handles affine terms). This is my recommendation. With smooth_quad we don’t separate out “A” and “b” the way we did for the projections; instead we use it like:

smooth_op = smooth_quad( A'*A, -2*A'*b, b'*b );

[x, out] = tfocs_SCD( smooth_op, {B, 0}, prox_linf, mu, [], [], opt )

The smoothing term \mu\|x-x_0\|_2^2 is added to smooth_op automatically.

  • You could use the quadratic term as a proximal term if you wished, since we can compute the proximity operator of a quadratic, but this approach is likely to be somewhat slower than exploiting the term’s smoothness and using gradients.
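As a quick sanity check on the gradient formula above (a NumPy illustration for concreteness, not TFOCS code — the matrices here are random placeholders), the analytic gradient of f(x)=\|Ax-b\|_2^2 can be compared against central finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
b = rng.standard_normal(20)
x = rng.standard_normal(8)

f = lambda x: np.sum((A @ x - b) ** 2)   # f(x) = ||Ax - b||_2^2
grad = 2 * A.T @ (A @ x - b)             # analytic gradient

# Central finite differences along each coordinate direction.
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(8)])

print(np.max(np.abs(grad - fd)))         # should be tiny
```

Because f is quadratic, the central difference is exact up to rounding, so the two gradients agree to near machine precision.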

Update (in response to question 703)

I misspoke earlier: while tfocs.m takes advantage of smooth terms, tfocs_SCD.m does not. But we can still solve the problem. As mentioned in the question, we can solve

\min \,\beta\|B x\|_1 \;\mathrm{such\,that}\; \|Ax - b\|_2 \le \epsilon

using solver_sBPDN_W.m, which in turn calls TFOCS in the following fashion:

tfocs_SCD( [], { A, -b; W, 0 }, prox, mu, x0,...)

where prox = { prox_l2( epsilon ), proj_linf(beta) }

Now, to solve
\min \, \alpha\|Ax - b\|_2^2 + \beta\|B x\|_1, \quad x \in \mathbb{R}^n

keep almost the same calling sequence, but with AtA = A'*A and Atb = 2*A'*b, i.e.

tfocs_SCD( [], { AtA, -Atb; W, 0 }, prox, mu, x0,...)

and then change the prox to
prox = { smooth_quad( alpha ), proj_linf(beta) }
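The substitution works because the quadratic expands as \|Ax-b\|_2^2 = x^T(A^TA)x - 2(A^Tb)^Tx + b^Tb, so A^TA and A^Tb carry all the information the solver needs. A quick numerical check of that identity (NumPy, with random placeholder data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((15, 6))
b = rng.standard_normal(15)
x = rng.standard_normal(6)

AtA = A.T @ A
Atb = A.T @ b

direct   = np.sum((A @ x - b) ** 2)            # ||Ax - b||^2
expanded = x @ AtA @ x - 2 * Atb @ x + b @ b   # x'A'Ax - 2b'Ax + b'b

print(abs(direct - expanded))                  # agrees to machine precision
```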

For more efficiency, see solver_sBPDN_W.m for some scaling tricks that make sure A and B have roughly the same scale.
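A minimal sketch of the kind of rescaling meant here (my own illustration, assuming the spectral norm as the scale measure — see solver_sBPDN_W.m for the actual tricks): normalize each operator by its largest singular value and absorb the constants into the objective weights.

```python
import numpy as np

rng = np.random.default_rng(2)
A = 100.0 * rng.standard_normal((30, 10))   # badly scaled pair of operators
B = 0.01 * rng.standard_normal((25, 10))

# Estimate each operator's scale by its largest singular value, then
# rescale so both have spectral norm 1.  The weights alpha and beta in
# the objective must be adjusted to compensate for normA and normB.
normA = np.linalg.norm(A, 2)
normB = np.linalg.norm(B, 2)
A_scaled = A / normA
B_scaled = B / normB
```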

The previous solution I posted works, but it actually relies on the ability of smooth_quad to compute a proximity operator (rather than a gradient). That method works, but it needs to solve a linear system of equations involving I + \tau A^TA every time it is called. When A is a scalar or a diagonal matrix this is trivial, but for large matrices it will be slow. Our new release will make this faster by computing a one-time factorization (somewhat slow) so that all subsequent calls are very fast. Still, for very large systems, the new approach in the updated response is the way to go.
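To illustrate the linear-system cost and the factorization caching described above (an illustrative NumPy sketch, not the TFOCS implementation; the function name prox and the data are my own): the proximity operator of g(x) = \alpha\|Ax-b\|_2^2 with step \tau solves (I + 2\alpha\tau A^TA)\,x = v + 2\alpha\tau A^Tb, so a one-time Cholesky factorization of the system matrix makes every subsequent call a pair of cheap triangular solves.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 40, 12
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
alpha, tau = 0.5, 0.1

# prox_{tau*g}(v) with g(x) = alpha*||Ax - b||^2 solves
#   (I + 2*alpha*tau*A'A) x = v + 2*alpha*tau*A'b
M = np.eye(n) + 2 * alpha * tau * (A.T @ A)
rhs_const = 2 * alpha * tau * (A.T @ b)

# One-time factorization (the expensive step); M = L @ L.T
L = np.linalg.cholesky(M)

def prox(v):
    # Two solves with the cached triangular factor.  np.linalg.solve is
    # used for brevity; scipy.linalg.solve_triangular would exploit the
    # triangular structure for extra speed.
    y = np.linalg.solve(L, v + rhs_const)
    return np.linalg.solve(L.T, y)

v = rng.standard_normal(n)
x = prox(v)

# Optimality check for the prox subproblem:
#   2*alpha*A'(Ax - b) + (x - v)/tau == 0
resid = 2 * alpha * A.T @ (A @ x - b) + (x - v) / tau
print(np.max(np.abs(resid)))
```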