Ill-conditioned weighted least squares

opt_analyst · September 21, 2012, 3:53pm

Hi,
Sorry my question is a little long. I have a optimization problem like this:

A and D are M\times M matrix, x,y are M\times 1 vector. C is
covariance of x, \lambda is a regularization parameter which I can
specify.

$$ \begin{equation*}
\begin{aligned}
& \underset{x}{\text{minimize}}
& &(Ax-y)^TC^{-1}(Ax-y)+\lambda |Dx|2^2 \
& \text{subject to}
& & x_i\geq 0, \quad i=1,\cdots, M \
&&& \sum{i=1}^M{x_i}=N\
&&& \sum_{i=1}^M{i (x_i/N)}= \mu\
&&& \sum_{i=1}^M{(i-\mu)^2 (x_i/N)=\sigma^2}
\end{aligned}
\end{equation*}

* First, how to write WLS in DCP? I tried L=chol(C,'lower'); invL=inv(L); invC=inv(C); minimize(sum_square(invL*A*x-invL*y)+lambda*sum_square(D*x)) and minimize (quad_form(A*x-y,invC)+lambda*sum_square(D*x)); I found the latter is much more efficient than the former. Another concerned is that since the covariance matrix I have is ill conditioned, so getting an inverse of it directly through inv(.) may not be a good idea. So I was wondering if there is any way I can put C instead of inv(C) directly in the objective function to avoid crude inverse of C, and still comply with the rules of cvx? I just want the ill condition of C to impact the stability of the result as less as possible. * Second, in the setting of the problem, $x_i/N$ can be considered as frequency of a discrete distribution. The third and fourth constraints are the constraints of the mean and variance. So I can also write these two constraints to $$\sum_{i=1}^M{i x_i}= \mu*N$$ and $$ \sum_{i=1}^M{i^2 x_i}=\text{2nd raw moment}$$ I found that also the feasible set should be the same but the result from CVX a is different. Moving $N$ to the right hand side of the constraints makes a great difference in the solution, especially when $\lambda$ is small. Which way do you think I can get a more accurate solution? * In addition, I want to do the optimization many times, each time with a different $\lambda$ or $y$. So far I have been using for loop. Do you know any equivalent formulation that cvx can process more efficiently? Thank you very much! YZ

mcg · September 21, 2012, 7:38pm

Thank you for posting your question here. This is a great opportunity to share a practical tip about using CVX effectively: avoid using quadratic forms like quad_form and sum_square whenever you can use norm (without squaring it) instead. If you don’t quite understand this, read on; this problem is a perfect example of this lesson.

The standard solvers that CVX uses actually don’t handle quadratic forms very well. This may seem counterintuitive—after all, QP solvers have been around for a long time, right? But CVX doesn’t use QP solvers, it uses conic solvers. And translating quadratic forms to conic solvers can cause some scaling issues. Gurobi and MOSEK have dedicated support for quadratic objectives, but CVX cannot yet exploit that.

Therefore, it is always worthwhile to see if you can rewrite your problem to eliminate quadratics. In this case, this is not difficult. Your objective is equivalent to

\|C^{-1/2}(Ax-y)\|_2^2 + \lambda\|Dx\|_2^2 = \left\|\begin{bmatrix} C^{-1/2} ( A x - y ) \\ \sqrt{\lambda} D x \end{bmatrix} \right\|_2^2

Furthermore, since minimizing the norm is equivalent to minimizing the squared norm, you can drop the exponent. That means that you can use this objective in CVX:

norm( [ invL * ( A * x - y ) ; sqrt( lambda ) * D * x ] )

I think you will find that the scaling issues for this equivalent form are a lot better.

I suspect the variances you’re seeing when you rewrite the constraints in different ways are at least in part due to the quadratic form issue. You should certainly try both solvers to see which one works better for you (and, with the new version of CVX, try MOSEK and Gurobi too, of course.) Feel free to try boosting the precision (see the cvx_precision command) to see if that helps as well.

As for the looping, there really is nothing else you can do. CVX does not have any sort of warm-start facility (not an easy thing to implement for such a general-purpose tool). To be honest, the primary goal of CVX is to get your models running as quickly as possible, not necessarily to solve them as quickly as possible.

mcg · September 27, 2012, 9:15pm

I have decided that the concept of eliminating quadratic forms merited inclusion on the user guide. You can see it now, here: http://cvxr.com/cvx/doc/advanced.html