The problem got solved but the result violates some constraints

Hello, I want to solve a convex problem with CVX MOSEK solver. The problem got solved but the result didn’t satisfy some constraints. Also the results of the primal problem and the dual problem are different. Here are my codes:

clc;
clear all;

%hyperparameters
T = 1;  %number of Monte Carlo
M = 8;  %number of clusters
NA = [6];  % number of antennas
K = 4;  %average number of semantic symbols per word
S0 = 0.1;  % targert semantic rate (I/L Msut/s)
ep0 = 0.7;  % target semantic similarty
R0 = 5;  % target bit rate (Mbit/s)
N0 = -60;  % noise power (dbm)
SideLength = 50;
KP = 1;
PL = 0.8;

%preprocess
n0 = 10^((N0/10)-3);
ep_til = S0*K;  % semantic similarty from semantic rate
ep_bar = max(ep0, ep_til); 
inv_ep = inverseSimilarity(ep_bar,K);
r0 = 2^(R0/10) - 1;

for index = 1:length(NA)  %outter loop of antenna
    N = NA(index);  % the number of antennas in each experiment
    s_coords_x = zeros(M);
    s_coords_y = zeros(M); % init coordinates of M S-users
    b_coords_x = zeros(M);
    b_coords_y = zeros(M); % init coordinates of M B-users
    for ct = 1:length(T) %inner loop of Monte Carlo
        s_coords_x = (rand(1, M) - 0.5) * SideLength;
        s_coords_y = (rand(1, M) - 0.5) * SideLength;
        b_coords_x = (rand(1, M) - 0.5) * SideLength;
        b_coords_y = (rand(1, M) - 0.5) * SideLength;
        s_distance = vecnorm([s_coords_x' s_coords_y'], 2, 2); % distance between S-user and BS
        b_distance = vecnorm([b_coords_x' b_coords_y'], 2, 2); % distance between B-user and BS
        s_channel = (sqrt(KP/(1+KP)) + sqrt(1/(1+KP))*((randn(M,N) + 1i*randn(M,N))/sqrt(2)))./sqrt(repmat(s_distance, 1, N).^PL);
        b_channel = (sqrt(KP/(1+KP)) + sqrt(1/(1+KP))*((randn(M,N) + 1i*randn(M,N))/sqrt(2)))./sqrt(repmat(b_distance, 1, N).^PL);
        [s_cluster,b_cluster] = powerGrouping(M,s_channel,b_channel);
        cvx_begin
            variable Ws(N, N, M/2) hermitian semidefinite
            variable Wb(N, N, M/2) hermitian semidefinite
            t = 0;
            for i = 1:(M/2)
                t = t + trace(Ws(:,:,i)) + trace(Wb(:,:,i));
            end
            minimize (t)
            subject to
            for i = 1:(M/2)
                HsN = s_channel(s_cluster{i}(2),:)'*s_channel(s_cluster{i}(2),:);
                HsF = s_channel(s_cluster{i}(1),:)'*s_channel(s_cluster{i}(1),:);
                HbN = b_channel(b_cluster{i}(2),:)'*b_channel(b_cluster{i}(2),:);
                HbF = b_channel(b_cluster{i}(1),:)'*b_channel(b_cluster{i}(1),:);
                inv_ep*(inv_ep+2)*(t-trace(Ws(:,:,i))+n0) - real(trace(HsN*Ws(:,:,i))) <= 0;
                r0*(r0+2)*(t-trace(Wb(:,:,i))+n0) - real(trace(HbN*Wb(:,:,i))) <= 0;
                real(trace(HsN*Ws(:,:,i))) -  real(trace(HsF*Ws(:,:,i))) <= 0;
                real(trace(HbN*Wb(:,:,i))) -  real(trace(HbF*Wb(:,:,i)))<= 0;
            end
        cvx_end
    end
end

The result shows;

Calling Mosek 9.1.9: 304 variables, 16 equality constraints

MOSEK Version 9.1.9 (Build date: 2019-11-21 11:34:40)
Copyright (c) MOSEK ApS, Denmark. WWW: mosek.com
Platform: Windows/64-X86

Problem
Name :
Objective sense : min
Type : CONIC (conic optimization problem)
Constraints : 16
Cones : 0
Scalar variables : 16
Matrix variables : 8
Integer variables : 0

Optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 0
Eliminator terminated.
Eliminator - tries : 1 time : 0.00
Lin. dep. - tries : 1 time : 0.00
Lin. dep. - number : 0
Presolve terminated. Time: 0.00
Problem
Name :
Objective sense : min
Type : CONIC (conic optimization problem)
Constraints : 16
Cones : 0
Scalar variables : 16
Matrix variables : 8
Integer variables : 0

Optimizer - threads : 16
Optimizer - solved problem : the primal
Optimizer - Constraints : 16
Optimizer - Cones : 0
Optimizer - Scalar variables : 16 conic : 0
Optimizer - Semi-definite variables: 8 scalarized : 624
Factor - setup time : 0.00 dense det. time : 0.00
Factor - ML order time : 0.00 GP order time : 0.00
Factor - nonzeros before factor : 108 after factor : 108
Factor - dense dim. : 0 flops : 1.30e+05
ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME
0 6.8e+01 0.0e+00 9.7e+01 0.00e+00 9.600000000e+01 0.000000000e+00 1.0e+00 0.00
1 1.0e+01 2.2e-16 2.6e+01 -1.58e+00 3.847260398e+01 2.121318572e-09 1.5e-01 0.00
2 7.7e-01 2.6e-15 1.0e+00 1.92e-01 3.845056624e+00 4.280465945e-09 1.1e-02 0.02
3 5.7e-04 3.0e-15 2.8e-05 9.36e-01 2.485062105e-03 4.542323827e-09 8.4e-06 0.02
4 1.2e-09 2.8e-15 1.4e-13 1.00e+00 5.302128963e-09 4.542569079e-09 1.8e-11 0.02
Optimizer terminated. Time: 0.02

Interior-point solution summary
Problem status : PRIMAL_AND_DUAL_FEASIBLE
Solution status : OPTIMAL
Primal. obj: 5.3021289627e-09 nrm: 1e-07 Viol. con: 7e-07 var: 0e+00 barvar: 0e+00
Dual. obj: 4.5425690788e-09 nrm: 6e+00 Viol. con: 0e+00 var: 1e-16 barvar: 1e-14
Optimizer summary
Optimizer - time: 0.02
Interior-point - iterations : 4 time: 0.02
Basis identification - time: 0.00
Primal - iterations : 0 time: 0.00
Dual - iterations : 0 time: 0.00
Clean primal - iterations : 0 time: 0.00
Clean dual - iterations : 0 time: 0.00
Simplex - time: 0.00
Primal simplex - iterations : 0 time: 0.00
Dual simplex - iterations : 0 time: 0.00
Mixed integer - relaxations: 0 time: 0.00


Status: Solved
Optimal value (cvx_optval): +5.30213e-09

Your problem is not reproducible because inverseSimilarity is not provided (you probably could have just assigned the result and avoided the call to inverseSimilarity).

The primal and dual objective differ by 1e-9, which is within normal termination tolerance. It would appear that the true optimal objective value might be exactly zero, and the optimal Mosek solution is within tolerance of that.

As for the constraints, you haven’t shown us which constraints are violated and by how much. But they only need be satisfied to within a feasibility tolerance.

But a seemingly likely culprit for apparent, but not actual, constraint violation in your program is that your constraints involve expressions, which everything on the LHS of = are, whether or not they are declared as expressions. Unlike CVX variables which are populated with the optimal values after a successful solve, expressions are populated with a value which might or might not be the optimal value after a successful solve. The only way to know the optimal value of CVX expressions is to calculate them (after cvx_end), starting with CVX variable values. If you do this, you should hopefully see all the constraints are satisfied to within feasibiloity tolerance.

There is no need these days to use Mosek 9.1 shipped with CVX, just install Mosek 10.2 and let CVX use that.

Optimization results hardly ever satisfy all constraints exactly in floating-point arithmetic, and two theoretically equal numbers computed in different ways are hardly ever exactly floating-point equal, all that holds just do so up to some precision. According to the solution summary the biggest violations are of order 1e-7. Whether or not this is much depends on your context, the magnitude of your data etc. The objective values differ by 1e-9 and are essentially 0.

https://docs.mosek.com/latest/faq/faq.html#why-does-the-solution-slightly-violate-constraints

Thank you Mark! I found an issue in my previous formulation and now I corrected that issue. However, the optimal solution still violates the first and the second constraint. I put the intact code below.

Main file:

clc;
clear all;

%hyperparameters
T = 1;  %number of Monte Carlo
M = 4;  %number of clusters
NA = [10];  % number of antennas
K = 4;  %average number of semantic symbols per word
S0 = 0.1;  % targert semantic rate (I/L Msut/s)
ep0 = 0.1;  % target semantic similarty
R0 = 3;  % target bit rate (Mbit/s)
N0 = -70;  % noise power (dbm)
SideLength = 50;
KP = 1;
PL = 0.8;

%preprocess
n0 = 10^((N0/10)-3);
ep_til = S0*K;  % semantic similarty from semantic rate
ep_bar = max(ep0, ep_til); 
inv_ep = inverseSimilarity(ep_bar,K);
r0 = 2^(R0/10) - 1;

for index = 1:length(NA)  %outter loop of antenna
    N = NA(index);  % the number of antennas in each experiment
    s_coords_x = zeros(M);
    s_coords_y = zeros(M); % init coordinates of M S-users
    b_coords_x = zeros(M);
    b_coords_y = zeros(M); % init coordinates of M B-users
    for ct = 1:length(T) %inner loop of Monte Carlo
        s_coords_x = (rand(1, M) - 0.5) * SideLength;
        s_coords_y = (rand(1, M) - 0.5) * SideLength;
        b_coords_x = (rand(1, M) - 0.5) * SideLength;
        b_coords_y = (rand(1, M) - 0.5) * SideLength;
        s_distance = vecnorm([s_coords_x' s_coords_y'], 2, 2); % distance between S-user and BS
        b_distance = vecnorm([b_coords_x' b_coords_y'], 2, 2); % distance between B-user and BS
        s_channel = (sqrt(KP/(1+KP)) + sqrt(1/(1+KP))*((randn(M,N) + 1i*randn(M,N))/sqrt(2)))./sqrt(repmat(s_distance, 1, N).^PL);
        b_channel = (sqrt(KP/(1+KP)) + sqrt(1/(1+KP))*((randn(M,N) + 1i*randn(M,N))/sqrt(2)))./sqrt(repmat(b_distance, 1, N).^PL);
        [s_cluster,b_cluster] = powerGrouping(M,s_channel,b_channel);
        cvx_begin
            variable Ws(N, N, M/2) hermitian semidefinite
            variable Wb(N, N, M/2) hermitian semidefinite
            t = 0;
            for i = 1:(M/2)
                t = t + trace(Ws(:,:,i)) + trace(Wb(:,:,i));
            end
            minimize (t)
            subject to;
            for i = 1:(M/2)
                HsN = s_channel(s_cluster{i}(2),:)'*s_channel(s_cluster{i}(2),:);
                HsF = s_channel(s_cluster{i}(1),:)'*s_channel(s_cluster{i}(1),:);
                HbN = b_channel(b_cluster{i}(2),:)'*b_channel(b_cluster{i}(2),:);
                HbF = b_channel(b_cluster{i}(1),:)'*b_channel(b_cluster{i}(1),:);
                IsiN = 0;
                IbiN = 0;
                IsiF = 0;
                IbiF = 0;
                for k = 1:(M/2)
                    IsiN = IsiN + trace(HsN*Wb(:,:,i));
                    IsiF = IsiF + trace(HsF*Wb(:,:,i));
                    IbiN = IbiN + trace(HbN*Ws(:,:,i));
                    IbiF = IbiF + trace(HbF*Ws(:,:,i));
                    if k ~= i
                        IsiN = IsiN + trace(HsN*Ws(:,:,i));
                        IsiF = IsiF + trace(HsF*Ws(:,:,i));
                        IbiN = IbiN + trace(HbN*Wb(:,:,i));
                        IbiF = IbiF + trace(HbF*Wb(:,:,i));
                    end
                end
                inv_ep*(inv_ep+2)*(IsiN+n0) - real(trace(HsN*Ws(:,:,i))) <= 0;  %first constraint
                r0*(r0+2)*(IbiF+n0) - trace(HbN*Wb(:,:,i)) <= 0;  % second constraint
                trace(HsN*Ws(:,:,i)) -  trace(HsF*Ws(:,:,i)) <= 0;
                trace(HbN*Wb(:,:,i)) -  trace(HbF*Wb(:,:,i)) <= 0;
                IsiN >= IsiF;
                IbiN >= IbiF;
            end
        cvx_end
    end
end

function 1

function r = inverseSimilarity (ep, k)
    % K = 3,4,6,8
    K = [3 4 6 8];
    AL = [-0.1 -0.1 -0.1 -0.1];
    AR = [0.85 0.88 0.885 0.89];
    l = [0.1 0.18 0.3 0.4];
    r0 = [0 -0.2 -2.3 -3];
    r0 = 10.^(r0/10);
    v = [1 1 1 1];
    index = find(K == k);

    r = r0(index) - (1/l(index))*log(((AR(index) - AL(index))/(ep - AL(index)))^v(index) - 1);
end

function 2

function [Scluster, Bcluster] = powerGrouping(total_cluster,s_channel,b_channel)
M = total_cluster/2;
Schannel_condition = vecnorm(s_channel, 2, 2);
Bchannel_condition = vecnorm(b_channel, 2, 2);
Channel_condition = [Schannel_condition,Bchannel_condition];
for idx=1:2
    [sorted_channels, sorted_indices] = sort(Channel_condition(:,idx));
    clusters = cell(M, 1);
    cluster_index = 1;
    for i = 1:M
        % Pair the i-th smallest with the i-th largest
        user1 = sorted_indices(i);  % User with i-th smallest channel condition
        user2 = sorted_indices(length(Schannel_condition)-i+1);  % User with i-th largest channel condition
        clusters{cluster_index} = [user1, user2];
        cluster_index = cluster_index + 1;
    end
    if idx == 1
        Scluster = clusters;
    else
        Bcluster = clusters;
    end
end
end

Thank you Michal. I will try to use the latest version!

You haven’t told us what the amount of the “violation” is (see previous posts by @Michal_Adamaszek and me), nor whether you evaluated constraint satisfaction, for constraints involving expressions, in the manner I indicated (if not, the supposed violations don’t reflect the actual properties of the solution).

The amount of ‘violation’ is e-8 and e-10 so I think this is within the tolerance. If I want to avoid this, should I rescale my input?

Yes, that is within tolerance. You can’t expect better without some not recommended fiddling with solver tolerance parameters.

It’s always a good idea to scale as well as you can.