
delete unnecessary output #3

Closed

Conversation

@idavydov commented Mar 8, 2016

Sometimes the following message is written to stdout:

  Positive dir derivative in projection 
  Using the backtracking step 

Since this is a library, I don't think writing messages, especially to stdout, is a good idea.
What do you think about removing those lines?
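
For reference, the message presumably comes from unguarded write statements along these lines (a sketch; the exact location in the Fortran sources is an assumption, and unit 6 is stdout on most compilers):

c     Sketch of the kind of unguarded output that produces the
c     message; the actual source location may differ.
      write (6,*) 'Positive dir derivative in projection '
      write (6,*) 'Using the backtracking step '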

P.S. Thanks for a very useful library!

@afbarnard (Owner)

I agree that, in principle, a library should be quiet. However, numerical computations can sometimes break assumptions, and those cases warrant a message. Here are my thoughts so far.

  • Are your objective and gradient functions correct? Having a positive derivative after projecting the gradient means that one of those functions is inaccurate or you have a nonconvex objective.
  • I would like to leave the L-BFGS-B Fortran code alone since I didn't write it. But I could maintain patches against it.
  • One option would be to change the Fortran output to go to standard error (see the sketch after this list). Ideally some sort of logging would be used, but, hey, this is Fortran. I would prefer changing all of the Fortran output so that things are consistent. Would you like to take this on?
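
For example, a minimal sketch of redirecting the message to standard error, assuming unit 0 maps to stderr (common with gfortran, but not guaranteed by the Fortran 77 standard):

c     Sketch: send the diagnostic to stderr instead of stdout.
c     Unit 0 is stderr under gfortran; this mapping is
c     compiler-dependent, not part of the Fortran 77 standard.
      write (0,*) 'Positive dir derivative in projection '
      write (0,*) 'Using the backtracking step '

(Fortran 2003 provides error_unit in iso_fortran_env, but the L-BFGS-B sources are fixed-form Fortran 77.)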

@idavydov (Author) commented Mar 9, 2016

  • In my problem, the likelihood function is very difficult to compute and no analytical form for the gradient exists. I'm not aware of any studies of the likelihood surface; it might be non-convex, or there may be numerical problems. At the same time, in practice L-BFGS-B has been shown to work well (no worse than non-convex methods).
  • Actually, it seems that every function has an iprint argument; a value < 0 means that no output is generated:
c     iprint is an INTEGER variable that must be set by the user.
c       It controls the frequency and type of output generated:
c        iprint<0    no output is generated;
c        iprint=0    print only one line at the last iteration;
c        0<iprint<99 print also f and |proj g| every iprint iterations;
c        iprint=99   print details of every iteration except n-vectors;
c        iprint=100  print also the changes of active set and final x;
c        iprint>100  print details of every iteration including x and g;
c       When iprint > 0, the file iterate.dat will be created to
c                        summarize the iteration.
  • In this particular fragment the iprint value is not checked, which seems like a small bug in the original Fortran code. Would you accept a patch that adds a second condition, something like if ( iprint .ge. 1 ) then (see the sketch below)?
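
A minimal sketch of the proposed patch, assuming the message comes from bare write statements (the exact threshold value is up for discussion):

c     Sketch of the proposed guard: print only when the caller
c     requested output via iprint.
      if ( iprint .ge. 1 ) then
         write (6,*) 'Positive dir derivative in projection '
         write (6,*) 'Using the backtracking step '
      endif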

@afbarnard (Owner)

First of all, do you mean you compute an approximation of the gradient or that you don't have a gradient function at all? (L-BFGS-B is a gradient-based method and requires a gradient to work.)

Also, I should be clear that L-BFGS-B will work for nonconvex problems but it will only find local minima in those cases. Managing the global optimization is up to you (random restarts, etc.).

Updating the L-BFGS-B Fortran code to consistently support the iprint parameter would be a good solution. I'll take a look at the Fortran code to remind myself how iprint is handled and to see if there are other places where the control of output needs to be updated.

@idavydov (Author)

What I mean is that I do not have an analytical form for the gradient. But I am able to compute the likelihood at any point, which allows me to compute the gradient numerically.
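
For concreteness, one way to do this is a central-difference approximation, roughly like this sketch (the routine name, interface, and step-size handling are illustrative, not taken from any actual code):

c     Sketch: central-difference gradient from function values only.
c     f evaluates the objective; the interface is illustrative.
      subroutine numgrad (f, n, x, g, h)
      integer n, i
      double precision x(n), g(n), h, fp, fm, xi
      double precision f
      external f
      do 10 i = 1, n
         xi = x(i)
         x(i) = xi + h
         fp = f(n, x)
         x(i) = xi - h
         fm = f(n, x)
         x(i) = xi
         g(i) = (fp - fm) / (2.0d0 * h)
   10 continue
      return
      end

Forward differences would halve the number of function evaluations per gradient, at the cost of accuracy.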

Thanks, I can create a patch if needed; it is trivial. It seems that everywhere else write is wrapped in if ( iprint .ge. X ) then.

@afbarnard (Owner)

Sorry for the delay on this and my lack of attention. I haven't had much time to work on this recently.

I will take a look at your new pull request (#6) and at the new methods for passing pointers with cgo (#4) sometime later this week.

Let's keep this discussion open until this issue is fixed as this is where all the information is.

Also, since you can compute the function value (likelihood) at any point but not the gradient, it seems like one of the non-gradient-based optimization algorithms (that use finite differences and/or interpolation) would be worth looking into. Such an algorithm could be more efficient as it maintains its own approximation of the gradients based on the function values. (I assume you're already aware of these algorithms, but I figured I should mention it in case you're not.)

@idavydov (Author)

Thanks, I'm looking forward to it.

Regarding the optimization algorithm: I already tried a couple of gradient-free methods (e.g. those implemented in nlopt). So far they seem to have much worse performance than L-BFGS-B. But if you have any particular suggestions, e.g. your favorite gradient-free method, I'd be interested to hear them.

@afbarnard (Owner)

If you have already tried those algorithms in nlopt, then I don't have anything additional to suggest.

@afbarnard (Owner)

What do you think about commit ab8187a? It's very similar to what you proposed in #6, except that it includes the research and reasoning. I also thought the output level should be >= 0 rather than > 0. If you don't have any complaints, I will merge it into master.
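
Presumably the guard is along these lines (a sketch; the actual commit may differ). Per the iprint documentation quoted above, iprint = 0 still requests minimal output, so the message is hidden only when the caller passes iprint < 0:

c     Sketch: message shown unless all output is suppressed
c     (iprint < 0).
      if ( iprint .ge. 0 ) then
         write (6,*) 'Positive dir derivative in projection '
         write (6,*) 'Using the backtracking step '
      endif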

@afbarnard (Owner)

Fixed in commit ab8187a.
