Yes, TFOCS can handle this. You have two options:

- include the f(x)=\|Ax-b\|_2^2 term in the smooth part. You express worry that
`tfocs_SCD.m`

cannot handle gradients with an affine term. It’s true that we don’t include a specific method to generally handle affine terms, but that doesn’t mean you can’t do it (unlike the projections, there’s no benefit for us explicitly handling it). So you can either

– write your own function that computes \nabla f(x), which is very easy since it’s just \nabla f(x) = A^T( Ax-b), or…

– use our pre-built function `smooth_quad.m`

which does this for you (and handles affine terms). This is my recommendation. To use `smooth_quad`

, we don’t separate out “A” and “b” the same way we did for projections, rather we use it like:

`smooth_op = smooth_quad( A'*A, -2*A'*b, b'*b );`

`[x , out] = tfocs_SCD(smooth_op, {B, 0}, prox_linf, mu ,[] ,[] , opt)`

The smoothing term \mu\|x-x_0\|_2^2 is added to `smooth_op`

automatically.

- you could use the quadratic term as a proximal term if you wished, since we can compute the proximity operator of a quadratic, but it is likely that this approach will be a bit slower than taking advantage of its smoothness and using derivatives

Update, in response to question 703)

I misspoke earlier: while `tfocs.m`

takes advantage of smooth terms, `tfocs_SCD.m`

does not. But we can still solve the problem. As mentioned in the question, we can solve

\min \,\beta\|B x\|_1 \;\mathrm{such\,that}\; \|Ax - b\|_2 \le \epsilon

using `solver_sBPDN_W.m`

, which in turn calls TFOCS in the following fashion:

`tfocs_SCD( [], { A, -b; W, 0 }, prox, mu, x0,...)`

where `prox = { prox_l2( epsilon ), proj_linf(beta) }`

Now, to solve

\min \, \alpha\|Ax - b\|_2^2 + \beta\|B x\|_1, \; x \in R^n

keep almost the same calling sequence, but with `AtA = A'*A`

and `Atb = -2A'*b`

, i.e.

`tfocs_SCD( [], { AtA, -Atb; W, 0 }, prox, mu, x0,...)`

and then change the prox to

`prox = { smooth_quad( alpha ), proj_linf(beta) }`

For more efficiency, see `solver_sBPDN_W.m`

for some scaling tricks that make sure A and B have roughly the same scale.

The previous solution I posted works, but it actually relies on the ability of `smooth_quad`

to compute a proximity operator (rather than taking a gradient). Using that method works, but it will need to solve the linear system of equations I + \tau A^TA every time it is called. When A is a scalar or diagonal, this is trivial, but for large matrices it will be slow. Our new release will make this faster by computing a one-time factorization (a bit slow) to make all subsequent calls very fast. But still, for very large systems, this new approach (in the updated response) is the way to go.