Mosek Fails Where SDPT3 Works - Least Squares with Semi Definite Constraint

I solved the following problem on StackExchange Mathematics - Variation of Least Squares with Symmetric Positive Semi Definite (PSD) Constraint.

As always, I verify my solver vs. CVX.
The interesting thing is while CVX using SDPT3 or SeDuMi manage to solve the problem CVX using Mosek fails.

Here is a MWE:

% Initialization

close('all');
clear('all');
clc();

%% Parameters

numRows     = 10;
numVectors  = 10;


%% Load / Generate Data

mXX = randn(numRows, numVectors); %<! The set of {x}_{i} (Each column)

mX = zeros(numVectors, numRows * numRows);
for ii = 1:numVectors
    mX(ii, :) = kron(mXX(:, ii).', mXX(:, ii).');
end

vY = randn(numVectors, 1);

hObjFun = @(mW) 0.5 * sum((mX * mW(:) - vY) .^ 2);


%% Solution by CVX

cvx_solver('SDPT3'); %<! Works
cvx_solver('Mosek'); %<! Doesn't Work!!!
% cvx_solver('SeDuMi'); %<! Works

hRunTime = tic();

cvx_begin('quiet')
    cvx_precision('best');
    variable mW(numRows, numRows) semidefinite
    objVal = 0;
    for ii = 1:numVectors
        objVal = objVal + square( (mXX(:, ii).' * mW * mXX(:, ii)) - vY(ii) );
    end
    minimize( 0.5 * objVal )
cvx_end

runTime = toc(hRunTime);

disp([' ']);
disp(['CVX Solution Summary']);
disp(['The CVX Solver Status - ', cvx_status]);
disp(['The Optimal Value Is Given By - ', num2str(hObjFun(mW))]);
disp(['The Optimal Argument Is Given By - [ ', num2str(mW(:).'), ' ]']);
disp(['The Run Time Is Given By - ', num2str(runTime), ' [Sec]']);
disp([' ']);

sCvxSol.vXCvx     = mW(:);
sCvxSol.cvxOptVal = cvx_optval;
sCvxSol.cvxOptVal = hObjFun(mW);

Just change the solver (See cvx_solver('SDPT3'); and cvx_solver('Mosek');) to Mosek and see it fails (Returns NaN and says the problem is infeasible).

If you remove cvx_precision('best'); then it solves very nicely. After Easter we can try to investigate in more detail what is going on with the precision parameters that stops it working with your settings.

Indeed.
I also figured it out few hours ago and wrote an issue on that - https://github.com/cvxr/CVX/issues/12.

By the way, it is interesting that in many cases SDPT3 is faster than MOSEK.

Using cvx_precision('best') makes CVX set many Mosek termination tolerances to 0.0, which leads to unexpected behaviour. Whether it is a Mosek issue or a CVX issue is yet to be determined :). You can circumvent it by setting those parameters manually, for example

    cvx_solver_settings('MSK_DPAR_INTPNT_CO_TOL_DFEAS', 1e-11, ...
						'MSK_DPAR_INTPNT_CO_TOL_INFEAS', 1e-11, ...
						'MSK_DPAR_INTPNT_CO_TOL_MU_RED', 1e-11, ...
						'MSK_DPAR_INTPNT_CO_TOL_PFEAS', 1e-11, ...
						'MSK_DPAR_INTPNT_CO_TOL_REL_GAP', 1e-11);

if you want something else than the default precision.

As the one of the main author of the interior-point optimizer in Mosek I strongly recommend against using

cvx_precision(‘best’);

with Mosek. If I had been @mcg I would never ever have added this option to Cvx. Never.

In fact I strongly recommend using the default settings because that is for the vast majority of problems is the maximum precision that can be obtained.

If you should change options you should change the original options of the optimizer and think very careful what you should change.

I think the logic in cvx_precision(‘best’); is simple and makes sense.
It says, I will let the solver work until it decides there is nothing more to do.
I guess MOSEK doesn’t support this mode of operation for some reason.

@Erling, I’d be happy of you shed some information about why.

Another interesting observation I had is that SDPT3 is faster than MOSEK on some cases. I was surprised by that. Are you aware of such cases?

Thank You.

cvx_precision(‘best’);

is a bad idea because due the usage of finite precision floating point numbers implies you rarely can get a higher accuracy than the default setting specified in Mosek. So you are just wasting computation time trying. Normally that is considered a bad for the environment due the CO2 emission.

SDPT3 is faster sometimes as documented at

http://plato.asu.edu/ftp/sparse_sdp.html

For instance if you have huge matrix variables then SDPT3 at least in theory has an advantage because they use the HKM search direction where we use NT direction. But there can be other reasons as well. On average (whatever that means) we should be faster.

I cannot say why it is the case for your problem.

See also the recommendation of Ed Klotz mentioned at:

Maybe you are right that SDPT3 is faster on your instances, but you should examine the logs carefully to see that both solvers reach the optimal solution and don’t terminate prematurely (or stall in the end resulting in many iterations with limited progress).

The case SDPT3 was faster is the case above (The files on my GitHub).

I have no complaints, MOSEK is great and I’m sure that on average it is better.
Just pointed a case you might look at.

I think speed is really important, but more important is the number of successful solutions :-).

I also praise your very welcoming approach towards academic use. Certainly smoother procedure than any of the competition. Thank you for that.

I agree with Erling’s criticism, in hindsight. The cvx_precision('best') setting works really well with SeDuMi, which was the first solver we built support for. The reason for this is that the very specific way SeDuMi handled its stopping criteria meant that I could reliably just let it run until it could make no progress. But most solvers really don’t work so well that way.

So yes, cvx_precision('best') is not recommended, and it’s likely not going to ever be fixed.