Domain of smooth function


(Carl Nettelblad) #1

I am trying to use TFOCS for a problem where I previously tried interior-point methods, as well as crude alternating-projection-based methods. The first class was extremely slow for my problem sizes (in the range of 1e6 unknowns, dense problem) and sometimes suffered numerical issues where very similar formulations led the solver to solutions of very different quality. Hand-made alternating projections with some improvised gradient descent worked better, but convergence could be ridiculously slow there as well.

I am fairly sure that my problem is indeed convex, but my objective function (f in my tfocs call) is only defined on R+. As far as I understand, this should still be supported? Several of the included examples of smooth functions seem to have similar restrictions, and based on those I return Inf for the function value and NaN for the gradient outside this region.
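For reference, the convention I am following looks roughly like this (a minimal sketch; the actual objective is replaced here by a hypothetical stand-in, x'*log(x), and the function name is made up):

```matlab
% Hedged sketch of a TFOCS-style smooth function whose domain is R+.
% The objective sum(x.*log(x)) is only a placeholder for my real f.
function [ fx, gx ] = smooth_restricted( x )
    if any( x <= 0 )
        fx = Inf;                % outside the domain: infinite function value
        if nargout > 1
            gx = NaN( size(x) ); % gradient undefined outside the domain
        end
    else
        fx = sum( x .* log(x) );
        if nargout > 1
            gx = log(x) + 1;     % gradient, defined only on the interior
        end
    end
end
```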

In practice, I end up with the method terminating with the following message:
“Finished: Step size tolerance reached (||dx||=0)”

Sometimes the solution seems fine, but in purely simulated cases, where I can plug in the correct solution, the objective function value of that solution is clearly superior, with a relative error > 1 %. I have verified specifically that all points on the line segment of convex combinations between my x0 and the correct solution lie within the defined region, and that the function is numerically convex along it.

I realized that this could be due to the optimum for some variables lying exactly on the boundary of the domain. Any non-zero step along the gradient then takes the function outside the domain. Things can be improved by adding a convex barrier term, at the cost of some optimality. However, I still get early termination with weird results, maybe due to my projection in g interacting with it in unexpected ways.
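Concretely, the barrier term I mean is along these lines (a sketch only; mu is a hypothetical hand-tuned weight, and the handles would be folded into my actual smooth function):

```matlab
% Hedged sketch: a log-barrier added to the smooth part so the gradient
% pushes iterates away from the boundary x = 0. mu trades optimality
% against numerical robustness; it is a hand-tuned constant here.
mu = 1e-6;
barrier_f = @(x) -mu * sum( log(x) );  % tends to +Inf as any x_i -> 0+
barrier_g = @(x) -mu ./ x;             % gradient of the barrier term
% barrier_f(x) and barrier_g(x) get added to my original fx and gx.
```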

Is the answer simply that such domain restrictions are not allowed? There might be numerical inaccuracy issues at play as well. Initially, I thought that this was all about f(y)-f(y + dy) = 0 and that things might be improved by allowing the smooth function to specifically compute this difference (this would require an API change). However, I am not all that sure anymore that this would help.

I would be happy to provide more details, but some dos and don'ts for the R+ restriction seem to be the place to start.


(Carl Nettelblad) #2

I think I’ve found a very hacky solution to my current situation. In tfocs_backtrack, L is re-estimated based on the current step. If the function has a singularity at the boundary, the L estimated from a large step can lead to an exceedingly small next step, so small that it is less than machine epsilon. In fact, L is not that high in a smaller neighborhood, so there is indeed a “safe” small step — or at least there is in my case.

By changing line 36 in tfocs_backtrack to impose a maximum value on L, I don’t get the early terminations.

L = min(min( Lexact, max( localL, L / beta ) ), 1e15);

It should be noted that restarts were not enough to avoid this. I could get better results by using restarts and/or calling tfocs again with the latest answer, but the iteration still got stuck at a far worse objective function value than what I am getting now.

The hard-coded 1e15 is of course not a panacea; the proper value should really depend on the magnitude of the values involved.
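One way to make the cap scale-aware might be to tie it to the local estimate instead of a fixed constant (purely a sketch; the factor 1e6 and the use of localL as the reference scale are my own assumptions, not anything from the TFOCS code):

```matlab
% Hedged sketch: cap L relative to the locally estimated Lipschitz constant
% rather than an absolute 1e15, so the bound scales with the problem.
Lcap = 1e6 * max( localL, 1 );  % hypothetical scale-aware ceiling
L = min( min( Lexact, max( localL, L / beta ) ), Lcap );
```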

I am not quite sure yet whether I could achieve all this by just setting Lexact. I am confounded by the phrase “prevent the step size from growing beyond t=1/Lexact” in the user guide, while what I am looking for is to limit the minimum step size. The code itself seems to use Lexact interchangeably with what I am adding here, but I haven’t traced the full logic of how it’s used.