Hello, I am using MATLAB CVX for classification. The input data is randomly generated positive and negative numbers, labeled -1 for negative samples and +1 for positive samples. I first train CVX on some randomly generated training data and then test it. In the test I ask CVX to predict the labels, but for some reason it always predicts every label as -1. I tried this with two different classification formulations from the same paper I am reading, “Interaction between financial risk measures and machine learning methods” by Jun-ya Gotoh.

Also, when I increase the number of samples (s) above 1000, the training code doesn't work.

Here is the code:

```matlab
clc
close all
clear all

s = 1000; % number of samples

% randomly generated samples
samplesneg1 = -1 + .07*rand(s,1);
samples1    =  1 + .07*rand(s,1);
y = [ zeros(s/2,1) ; ones(s/2,1) ];

for i = 1:length(y)
    if (y(i) < 1)
        y(i) = -1;
    end
end

trainneg1 = samplesneg1(1:s/2 , :);
train1    = samples1(1:s/2 , :);
x = [ trainneg1 ; train1 ];

m = length(y);

% generating test samples
xt = [ samplesneg1((s/2 + 1):s , :) ; samples1((s/2 + 1):s , :) ];
xt = xt(randperm(length(xt)));

n = size(x(1,:)); % size of one sample

cvx_begin
    variables w(n)
    minimize( (1/2)*norm(w,2)^2 );
    subject to
        for i = 1:m
            ( y(i)*((w*(x(i,:))') - 1) ) >= 0;
        end
cvx_end

disp('after')

labelsT = [];
for i = 1:length(xt)
    if (xt(i) < 0)
        labelsT(i) = -1;
    elseif (xt(i) > 0)
        labelsT(i) = 1;
    end
end

w1 = w;
u  = length(xt);

cvx_begin
    variables yf(s)
    minimize( (1/2)*norm(w1,2)^2 );
    subject to
        for i = 1:m
            ( yf(i)*((w1*(xt(i,:))') - 1) ) >= 0;
        end
cvx_end

for i = 1:length(yf)
    if (yf(i) <= 0)
        yf(i) = -1;
    else
        yf(i) = 1;
    end
end

c = 0;
L = labelsT;
for i = 1:length(L)
    if (L(i) ~= yf(i))
        c = c + 1;
    end
end

error = (c/length(L))*100
```

In your 2nd CVX invocation, CVX is able to analytically determine an optimal solution without calling the solver. Specifically, the only constraints are, for i from 1 to m,
`(yf(i)* ((w1*(xt(i,:))') -1 )) >= 0`
Clearly, as CVX determines analytically, `yf` = the vector of all zeros is feasible. `w1` is not declared as a variable (nor is it a CVX expression), so the objective is a constant and there is nothing to optimize. This is just a feasibility problem, for which `yf` = the vector of all zeros is feasible, and therefore “optimal”.
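A sketch of the usual fix (not in the posted code): once `w` has been obtained from the first CVX invocation, predict the test labels directly from the separating rule, with no second CVX invocation at all:

```matlab
% Hypothetical replacement for the second cvx_begin ... cvx_end block:
% classify each test sample by which side of the separator it falls on.
yf = sign(xt * w);    % xt is (number of test samples)-by-n, w is n-by-1
yf(yf == 0) = 1;      % sign() returns 0 exactly on the boundary; pick a side
```

This also removes the `for` loop that follows the second invocation, since `yf` is already in {-1, +1}.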

The value of `n` comes out to `[1 1]` (the full size vector of one row), which I suspect is not what you want. But that is a MATLAB matter, not a CVX matter, for you to deal with. As a result, in your first CVX invocation, `variables w(n)` declares a scalar (1-by-1) variable, which I doubt is what you want.
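A minimal sketch of the MATLAB fix: ask `size` for the second dimension explicitly, so that `n` is the number of features rather than a size vector:

```matlab
n = size(x, 2);   % number of columns (features) per sample; here 1
```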

Rather than using `(1/2)*norm(w,2)^2` as the objective function, it would be better to use `norm(w,2)`, which is equivalent in the sense of producing the same argmin in exact arithmetic, but is more numerically stable and reliable.
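For example, the training problem could be posed as follows (a sketch that keeps the same per-sample constraint as the posted code; whether that constraint matches the paper's formulation is for you to check):

```matlab
cvx_begin
    variable w(n)
    minimize( norm(w, 2) )     % same argmin as (1/2)*norm(w,2)^2, numerically better
    subject to
        y .* (x*w - 1) >= 0;   % vectorized form of the posted per-sample constraint
cvx_end
```

Vectorizing the constraint also avoids building m separate scalar constraints in a loop, which becomes slow for large s.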

If your objective did not produce an error message, you must have been using CVX 3.0beta. I recommend you use CVX 2.1 instead, because CVX 3.0beta has many bugs and may produce a wrong answer without producing any error or warning messages. If you do use CVX 2.1, I recommend you don't square `norm`; but if you really want to, use `square_pos(norm(w,2))` to comply with CVX 2.1's stricter DCP rules (compared with what CVX 3.0beta allows).

As for the “training code doesn't work” when the number of samples is over 1000: what specifically happened? I tried the code with s = 2000, and it worked as well (if you want to call it that) as with s = 1000.

I didn’t look very carefully at what you did, so don’t assume everything not mentioned is correct. Nor did I examine at all whether your statistical procedure makes any sense.