Beginner's question

I would like to minimize this function:

0.5 * square( y - yf_dot*(theta1/theta3) - yf*(theta2/theta3) - uf*(1/theta3) )

where I supply values for y, yf_dot, yf, and uf, and would like to solve for the variables theta1, theta2, and theta3.

My understanding is that the function is convex for theta3>0 (which is my constraint), and the problem should be well-posed. However, I get the error “Cannot perform the operation: {real affine} ./ {real affine}”. What is wrong with my code?

uf = 5; yf = 3; yf_dot = 0.1; y = 3;

cvx_begin
    variable theta1
    variable theta2
    variable theta3
    minimize( 0.5 * square( y - yf_dot*(theta1/theta3) - yf*(theta2/theta3) - uf*(1/theta3) ) )
    subject to
        theta3 >= 0
cvx_end

Error trace:

Error in ==> cvx.rdivide at 19
z = times( x, y, './' );

Error in ==> cvx.mtimes at 36
z = feval( oper, x, y );

Error in ==> cvx.mrdivide at 15
z = mtimes( x, y, 'rdivide' );

I found a way to calculate the Hessian of my function in Matlab and confirm what Mark said:

1) I found the hessian function here: http://www.mathworks.com/matlabcentral/fileexchange/17847-symbolic-hessian-evaluator/content/hessian.m
2) Then I ran this code:

syms uf yf yf_dot y theta1 theta2 theta3
f = 0.5*(y - yf_dot*(theta1/theta3) - yf*(theta2/theta3) - uf*(1/theta3))^2;
H = hessian(f, [theta1 theta2 theta3]);
H1 = subs(H, {uf, yf, yf_dot, y, theta1, theta2, theta3}, ...
          {5, 3, -4, 2, 1, 1, 1})
eig(H1)

I indeed made an error when I derived the Hessian by hand. The eigenvalues found in Matlab were -1.703476621077913, 0, and 58.703476621077911. The function is not convex, due to the negative eigenvalue of about -1.7, as Mark said earlier. Thank you, Mark.


Thanks,
Sonja

Unless I made a mistake, your problem is not convex. I evaluated the Hessian at theta1=theta2=theta3=1 and obtained the eigenvalues -0.0018, 0, 148.2418; the negative eigenvalue shows it is not convex. If theta3 is set equal to a nonzero constant (positive or negative), then your objective function is convex in theta1 and theta2, and CVX will readily accept and solve it if you make only theta1 and theta2 your variables; this results in a minimum objective value of 0.
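
To make that concrete, a minimal sketch of the fixed-theta3 variant (theta3 = 1 is an arbitrary nonzero choice):

uf = 5; yf = 3; yf_dot = 0.1; y = 3;
t3 = 1;    % theta3 fixed at an arbitrary nonzero constant
cvx_begin
    variable theta1
    variable theta2
    minimize( 0.5 * square( y - yf_dot*(theta1/t3) - yf*(theta2/t3) - uf*(1/t3) ) )
cvx_end

Since t3 is a numeric constant here, theta1/t3 and theta2/t3 are affine, and square() of an affine expression satisfies the ruleset.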

Mark’s answer is right. But for the benefit of all readers, I want to add a more general point that, frankly, I find cannot be hammered home enough:

CVX does not solve convex programs.

What? “Of course it does,” I hear you saying. Yes, but the problem is that people often interpret the statement this way:

If my problem is convex, then CVX can solve it.

and this, unfortunately, is not true. A more accurate statement is this:

CVX solves disciplined convex programs (DCPs).

But even that’s not quite enough, because if you look at how we define the term “disciplined convex program” (in, say, my Ph.D. thesis), every convex program can be converted to a DCP if you just define the right function set. So really, the best statement is this:

CVX solves only those DCPs that can be constructed using only the functions found in the function reference, and combined only according to the rules enumerated in the DCP ruleset.

That’s a mouthful, I know. But I think that for those who have spent time with CVX, and truly understand its intended purpose and deliberate limitations, it is clear why the qualifications are necessary.
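
To illustrate with a standard example along the lines of the CVX users’ guide (the variable x and data b here are hypothetical, just for illustration): norm(x - b) and sqrt(sum(square(x - b))) are the same convex function mathematically, but only the first is a disciplined construction, because sqrt is concave and the composition rules do not allow a concave function of a convex argument.

b = randn(10, 1);    % hypothetical data
cvx_begin
    variable x(10)
    minimize( norm(x - b) )                  % accepted: a library atom of an affine expression
    % minimize( sqrt(sum(square(x - b))) )   % rejected: sqrt {concave} of a {convex} argument
cvx_end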

So why is it limited in this way, then? The answer has to do with what CVX must do to your models before they can be solved: it performs a sequence of transformations to convert them to a form that the underlying solvers (SeDuMi, SDPT3, MOSEK’s conic solver, and Gurobi) can accept. And those solvers can’t solve every problem you can conceive of; they support only a limited set of nonlinearities for us to work with. To be perfectly frank, I think it is impressive how far we can go using just the nonlinearities those solvers support.

This may be little comfort for someone who has a model CVX can’t handle, I understand. Such is the case here. But the good news in your case is that since this is a smooth, unconstrained problem, a good old descent method (gradient search, Newton’s method, BFGS) will suffice; a minimal sketch follows. Just be careful with your step sizes!
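
For instance, a sketch of the descent route (assuming the Optimization Toolbox is available; fminunc defaults to a BFGS-type quasi-Newton method):

uf = 5; yf = 3; yf_dot = 0.1; y = 3;
% The (nonconvex) objective, as a function of t = [theta1; theta2; theta3]:
f = @(t) 0.5 * ( y - yf_dot*(t(1)/t(3)) - yf*(t(2)/t(3)) - uf*(1/t(3)) )^2;
t0 = [1; 1; 1];                  % arbitrary starting point with theta3 > 0
[t_opt, fval] = fminunc(f, t0)   % a local minimum; an objective of 0 is attainable here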

Mark and MCG,

thank you both for your answers. I am not yet sure how to distinguish a DCP from other convex problems, as this is my first experience solving an optimization problem.

Mark,

did you derive the Hessian matrix by hand or in Matlab? If you did it in Matlab, could you post the script you ran? I derived H by hand (for theta1=theta2=theta3=1 and the values uf=5, yf=3, yf_dot=0.1, y=3), and I got H = [0.01 0.3 -0.51; 0.3 3 -3.3; -1.4 -24.6 27.6], which is not symmetric, so I must have made an error.


Thanks, Sonja

I used Maple. Given the required derivative continuity, which we have here, the Hessian should be symmetric; yours is not, so it is wrong. I don’t have last night’s session (it was late, so maybe I made a mistake), and recreating it now, I get different eigenvalues: -1.4765, 0, 158.7165. So it is (still) not convex.

Of course, if you (are able to) specify your optimization problem following the rules mentioned above by mcg, then your optimization problem will necessarily be convex; either that, or there is a problem with the DCP ruleset, and mcg’s Ph.D. (thesis) should be revoked (kidding).

For the record, the Hessian (at theta1=theta2=theta3=1) I used in the comment above is [0.010 0.30 -1.320; 0.30 9.0 -39.60; -1.320 -39.60 148.230], which of course is symmetric. I picked those values of the thetas just to demonstrate non-convexity; I could as well have picked other values.
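
For anyone reproducing this: the matrix above corresponds to the original data values uf=5, yf=3, yf_dot=0.1, y=3, whereas Sonja’s subs call used {5, 3, -4, 2}, which is why her eigenvalues differ; both exhibit a negative eigenvalue. A sketch along the lines of her script (assuming a Symbolic Math Toolbox recent enough to have hessian built in; the File Exchange version linked above works the same way):

syms theta1 theta2 theta3
uf = 5; yf = 3; yf_dot = 0.1; y = 3;    % the original data values
f = 0.5*(y - yf_dot*(theta1/theta3) - yf*(theta2/theta3) - uf*(1/theta3))^2;
H1 = double(subs(hessian(f, [theta1 theta2 theta3]), ...
                 {theta1, theta2, theta3}, {1, 1, 1}))
eig(H1)                                 % -1.4765, 0, 158.7165, as above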

Honestly, I think that even trying CVX is a waste of time if a problem is smooth and unconstrained. A descent method would be preferred. If the problem is nonconvex, the situation is difficult; I’d try a gradient method and hope for a good local optimum.

Or, in this particular case: choose whatever values you want for two of theta1, theta2, theta3 (respecting theta3 >= 0), then solve the linear equation in one unknown obtained by setting the argument of square equal to 0. This achieves an objective of 0, which is as good as can be; the argmin is not a single point in this case. A sketch follows.
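
A minimal sketch of that, fixing theta1 and theta2 (the choices here are arbitrary):

uf = 5; yf = 3; yf_dot = 0.1; y = 3;
t1 = 1; t2 = 1;                         % arbitrary choices
% Zero the argument of square() by solving for theta3 (requires y ~= 0):
t3 = (yf_dot*t1 + yf*t2 + uf) / y       % = 2.7 here, so theta3 >= 0 holds
resid = y - yf_dot*(t1/t3) - yf*(t2/t3) - uf*(1/t3)   % zero up to roundoff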

Hi Mark, I cannot arbitrarily choose values for two of the theta’s. This problem is to solve for theta’s that will be used in a control algorithm in my undergraduate project. My goal is to minimize the error between y and its estimate, yf_dot*(theta1/theta3) + yf*(theta2/theta3) + uf*(1/theta3).

So I have y, yf_dot, yf and uf and I need to find theta’s that minimize the error.