Very low value of a parameter in the evaluation function called by optim (Nelder-Mead)

Hi,
I am using the optim function (Nelder-Mead). At some step, the optimizer calls the evaluation function with a very low value for one of the parameters. Could you please let me know what the possible mistake could be?
Starting value of parameter = 1.0
Trace:
Nelder-Mead direct search function minimizer
function value for initial parameters = 1786643.471146
Scaled convergence tolerance is 17.8664
Stepsize computed as 0.100000
BUILD 4 1836517.643250 1455432.712348
At this point, it calls the function with parameter value 0.0666666666666667
EXTENSION 6 1786643.471146 1169954.096582
At this point, it calls the function with parameter value 0.866666666666666
EXTENSION 8 1476566.987603 820223.481674
At this point, it calls the function with parameter value 0.733333333333333
EXTENSION 10 1455432.712348 583281.183158
At this point, it calls the function with parameter value 0.466666666666666
EXTENSION 12 1169954.096582 354039.230579
At this point, it calls the function with parameter value -9.99200722162641e-16
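For reference, this is roughly how I am calling it (a minimal sketch; f below is a stand-in for my real objective, and c(1.0, 1.0, 1.0) a stand-in for my real starting values):

f <- function(par) sum((par - 0.5)^2)           # stand-in for the real objective
res <- optim(par = c(1.0, 1.0, 1.0), fn = f,    # the parameter in question starts at 1.0
             method = "Nelder-Mead",
             control = list(trace = TRUE, reltol = 1e-5))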
Any clue why?
Let me know if you need any other information.
Thanks in advance,
Soumitra

It just means the parameter is near zero. It's not a problem. The parameter has also gone negative. Does your function allow negative parameter values?
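If it doesn't, a common trick is to optimise over the log of the parameter so it stays positive (a sketch; f is a stand-in for your objective):

f <- function(par) sum((par - 0.5)^2)    # stand-in objective
res <- optim(log(c(1, 1)),               # work on the log scale
             function(p) f(exp(p)),      # exp(p) is always positive
             method = "Nelder-Mead")
exp(res$par)                             # back-transform to the original scale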

Your function values are also very large (around a million), which could mean the algorithm has trouble gauging the slope accurately. Can you scale your objective function down to smaller values?
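optim has a control setting for exactly this: fnscale divides the function value internally (a sketch; f is a stand-in for your objective):

f <- function(par) 1e6 * sum((par - 0.5)^2)     # stand-in objective with large values
res <- optim(c(1, 1), f, method = "Nelder-Mead",
             control = list(fnscale = 1e6))     # optim minimises f(par)/1e6 internally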

Thank you, woodward. The function is unconstrained, so negative parameters are OK. The reason I am worried is the sudden jump in the parameter value. I pass reltol = 1e-5 to optim, expecting it to stop sooner.
By scaling down, do you mean taking the log of the function? In fact, it is a log-likelihood function.

So what is the problem? Many optim methods work by taking jumps. To be honest, Nelder-Mead is a bit old and huckery; there are better methods out there. Usually start with a hill-climbing method like Levenberg–Marquardt, and if that doesn't work, try an evolutionary algorithm or something.

Are you trying to find a maximum-likelihood parameter set? Are you sure you've specified the log-likelihood correctly? A log-likelihood is often a largish negative number, since it's the sum of the logs of a lot of probability density values, one for each data point. Probability densities tend to be (much) smaller than one, so their logs are negative. (In weird cases they can be positive, but this only occurs when the parameter range is rather small, typically much less than one.)
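For example, the log-likelihood of 100 standard normal data points under the true model is solidly negative:

set.seed(1)
x <- rnorm(100)                              # 100 data points
sum(dnorm(x, mean = 0, sd = 1, log = TRUE))  # around -140: a sum of negative log-densities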

I am trying to understand an existing R package implementation. The reason they use Nelder-Mead is most probably that the gradient and Hessian would take extra effort to compute. The function computes -2L, where L is the log-likelihood, and hence I guess the function is positive.

However, they do a lot of approximation in computing -2L, and that approximation step gives an error when the parameter becomes very small in magnitude. In case you are interested, I am trying to understand the code in the function "ou.lik.fn" in the following file:
https://github.com/kingaa/ouch/blob/master/R/hansen.R

I think I now have a better understanding, thanks to your explanation. At least it is helping me make progress in understanding the code. I will ping you in case I need more help. I tried to contact the package author but have not received any reply yet.

So you want to minimise -2L, which seems correct. Is there only one parameter?

Is there a default assumption of non-negativity? That might be why the parameter hits near 0. Although I see the default settings for optim do not assume non-negativity.

Is this a tree search function? In that case Nelder-Mead is a poor choice and a stochastic method such as simulated annealing (optim's "SANN") would be better.
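For example (a sketch; f is a stand-in for your objective):

f <- function(par) sum((par - 0.5)^2)        # stand-in objective
res <- optim(c(1, 1), f, method = "SANN",    # stochastic search, derivative-free
             control = list(maxit = 20000, temp = 10))
res$par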

Thank you so much, Simon. I have progressed a little more. Basically, we need to minimize the following function U = -2 log L, where
U(alpha, sigma | x) = N log(2 pi) + log(det(V)) + (x - W theta)' V^{-1} (x - W theta)
and V, W, and theta are matrices of functions of alpha and sigma. For some values of the data x, I see that U is negative at the optimum returned by optim (I am still using Nelder-Mead for the time being, to get an idea of what is going on).
Do you have any idea why U would be negative?
I noticed that the negative U happens because det(V) is very small, say 1.28479711635891e-43. Is there a (mathematically sound) way to stop optim before U goes below 0?
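To see the scale of the issue, the log-determinant term alone is hugely negative (N = 20 below is just for illustration):

log(1.28479711635891e-43)    # about -98.8
20 * log(2 * pi)             # about 36.8, so for N = 20 the log-det term alone makes U negative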

It doesn't matter whether U is positive or negative, provided you're doing the right thing. Could you post a reprex?

One thing you can do is plot U against the parameters to see the shape of the surface.
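Something like this (a sketch; u_fn is a dummy standing in for your real U, with sigma held fixed):

u_fn <- function(par) (par[1] - 0.5)^2 + par[2]^2    # dummy surface for illustration
sigma_fixed <- 1
alpha_grid <- seq(0.001, 2, length.out = 200)
u_vals <- sapply(alpha_grid, function(a) u_fn(c(a, sigma_fixed)))
plot(alpha_grid, u_vals, type = "l", xlab = "alpha", ylab = "U = -2 log L")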

Nelder-Mead has a tendency to get stuck in local minima, which is why SANN might be better, depending on the problem.

If the determinant of a matrix goes to zero, the matrix becomes singular and cannot be inverted. If det(V) < 0, you will have a problem because you can't take the log.
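On the numerical side, R can compute the log-determinant directly, which is more stable than log(det(V)) when the determinant underflows (a sketch with a toy matrix):

V <- diag(rep(1e-5, 10))                # toy matrix whose determinant is 1e-50
d <- determinant(V, logarithm = TRUE)   # log|det| and its sign, no underflow
d$sign                                  # +1 here; a negative sign means log(det(V)) is undefined
as.numeric(d$modulus)                   # about -115.1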

Hi Simon, I now have a better understanding. I found that the authors mention in the appendix of the paper that for alpha very small or very large, the value of -2L becomes very unstable, and in their experimental results they restricted alpha to 0.001 < alpha < 2. I don't understand how this condition is ensured in their code. However, when I switched from optim to nloptr with the subplex algorithm and set the bounds accordingly, I see stable output.
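Roughly what I am doing now (a sketch; u_fn is a dummy standing in for the real -2L over (alpha, sigma), and the sigma bounds are my own guesses):

library(nloptr)
u_fn <- function(par) (par[1] - 0.5)^2 + (par[2] - 1)^2   # dummy objective for illustration
fit <- nloptr(x0     = c(1, 1),                           # starting values for (alpha, sigma)
              eval_f = u_fn,
              lb     = c(0.001, 1e-6),                    # 0.001 < alpha, per the paper
              ub     = c(2, 10),                          # alpha < 2
              opts   = list(algorithm = "NLOPT_LN_SBPLX",
                            xtol_rel = 1e-8, maxeval = 2000))
fit$solution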

However, I still see that some sample x gives a negative value of -2L. I am worried about this because it makes the logLik value positive, which is used in the p-value computation by the likelihood-ratio test.

Let me know if I should provide a sample x and the code that gives a negative -2L.

On a slight digression, I was wondering if you have any insight into the following aspect of the extension of the log-likelihood function that I am currently working on.
