Support the FATROP NLP solver in Moco #3787

nickbianco · 2024-05-20T18:30:28Z

Add support for the FATROP solver via the recently added support through CasADi's nlpsol.

Requires:

Adding FATROP as a dependency
Updating the CasADi dependency to a bleeding edge commit
Rearrage problem sparsity structure based on CasADi requirements

nickbianco · 2024-09-18T22:50:12Z

Leaving some thoughts here as I table development on this feature for now. These thoughts are based on the state of #3906 at commit 93d0060.

Basic support for FATROP is working! But...
It is not at all faster than Ipopt currently. This is most likely because we still rely on finite differences.
Some problems lead to a "degenerate Jacobian" warning. For example, "legendre-gauss" schemes when using kinematic constraints issues this warning, possibly because I needed to manually program in a FATROP-suitable problem structure. It works, but it would be good to understand this warning (rank deficiency?).
Refactoring to calculate kinematic constraints in a seprate CasADi callback function improves sparsity, but currently slows down performance, at least in the perf benchmarking suite. With automatic differentiation, this performance drop may be less noticeable or even negligable. This implementation is much cleaner so it would be great to keep this way if possible.
Interpolating controls symbolically enables a structure suitable to FATROP, but may be causing other problems. With this change, the MocoSquatToStand perf test has spiky excitations and activations that it didn't have before (struggling with activation dynamics?).
Reverting back to the current approach for interpolating controls via constraints would break the structure needed for FATROP, but might be needed based on the previous point.
An alternative approach would be to introduce a set of "gap-closing" constraints that would essentially turn each transcription scheme into multiple shooting problems. I don't love this approach, but it might be the simplest, and would have the benefit of working for all transcription schemes.

nickbianco · 2024-09-20T17:49:25Z

A quick comparison between Ipopt and FATROP with the same solver settings. Running a 1-DOF "sliding mass" problem that travels one meter and starts and ends at rest:

[info] ========================================================================
[info] MocoCasADiSolver starting.
[info] Fri Sep 20 10:41:52 2024
[info] ------------------------------------------------------------------------
[info] Costs: (total: 1)
[info]   goal. MocoFinalTimeGoal, enabled: true, mode: cost, weight: 1.0
[info] Endpoint constraints: none
[info] Kinematic constraints: none
[info] Path constraints: none
[info] States: (total: 2)
[info]   /slider/position/speed. bounds: [-50, 50] initial: 0 final: 0
[info]   /slider/position/value. bounds: [-5, 5] initial: 0 final: 1
[info] Controls: (total: 1)
[info]   /actuator. bounds: [-50, 50]
[info] Input controls: none
[info] Parameters: none
[info] Number of threads: 24

The shared solver settings:

// Configure the solver.
// =====================
MocoCasADiSolver& solver = study.initCasADiSolver();
solver.set_num_mesh_intervals(50);
solver.set_kinematic_constraint_method("Bordalba2023");
solver.set_optim_hessian_approximation("exact");
solver.set_transcription_scheme("legendre-gauss-2");

Ipopt output

List of user-set options:

                                    Name   Value                used
                   hessian_approximation = exact                 yes
                           linear_solver = mumps                 yes
                      print_user_options = yes                   yes

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.14.16, running with linear solver MUMPS 5.6.2.

Number of nonzeros in equality constraint Jacobian...:     1887
Number of nonzeros in inequality constraint Jacobian.:        4
Number of nonzeros in Lagrangian Hessian.............:     1940

Total number of variables............................:      499
                     variables with only lower bounds:        0
                variables with lower and upper bounds:      399
                     variables with only upper bounds:        0
Total number of equality constraints.................:      400
Total number of inequality constraints...............:        2
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        2
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  2.5000000e+00 1.00e+00 1.96e-02  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  1.2500000e+00 5.31e-02 4.48e-01  -1.0 4.25e+00    -  6.78e-01 1.00e+00f  1
   2  2.1608090e+00 4.73e-02 4.72e-01  -1.0 5.19e+00    -  6.14e-01 1.00e+00h  1
   3  4.8851989e-01 8.07e-02 2.71e-01  -1.0 4.83e+00  -2.0 9.18e-01 1.00e+00f  1
   4  7.5175680e-01 2.59e-02 1.38e-01  -1.0 1.16e+01    -  1.00e+00 5.00e-01h  2
   5  8.4896735e-01 3.00e-03 4.73e-02  -1.7 3.08e+00    -  1.00e+00 1.00e+00h  1
   6  6.3246649e-01 1.81e-02 3.36e-03  -2.5 8.36e+00    -  1.00e+00 1.00e+00h  1
   7  4.9899912e-01 1.57e-02 3.22e-03  -3.8 1.18e+01    -  9.20e-01 1.00e+00h  1
   8  4.2294611e-01 1.57e-02 1.39e-03  -3.8 2.07e+01    -  1.00e+00 1.00e+00h  1
   9  4.1309525e-01 1.15e-03 8.63e-05  -3.8 1.16e+01    -  1.00e+00 1.00e+00h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  4.0154322e-01 1.58e-03 5.98e-05  -5.7 1.37e+01    -  8.45e-01 1.00e+00h  1
  11  4.0024276e-01 3.32e-04 1.33e-05  -5.7 2.56e+01    -  7.68e-01 1.00e+00h  1
  12  4.0018781e-01 7.20e-06 3.22e-07  -5.7 1.31e+01    -  1.00e+00 1.00e+00h  1
  13  4.0000065e-01 4.48e-06 8.31e-08  -8.6 2.39e+00    -  9.77e-01 1.00e+00h  1
  14  4.0000025e-01 3.12e-10 4.87e-10  -8.6 7.87e-02    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 14

                                   (scaled)                 (unscaled)
Objective...............:   4.0000025362775571e-01    4.0000025362775571e-01
Dual infeasibility......:   4.8670777094185231e-10    4.8670777094185231e-10
Constraint violation....:   3.1176278225686360e-10    3.1176278225686360e-10
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.6859964628739728e-09    2.6859964628739728e-09
Overall NLP error.......:   2.6859964628739728e-09    2.6859964628739728e-09


Number of objective function evaluations             = 16
Number of objective gradient evaluations             = 15
Number of equality constraint evaluations            = 16
Number of inequality constraint evaluations          = 16
Number of equality constraint Jacobian evaluations   = 15
Number of inequality constraint Jacobian evaluations = 15
Number of Lagrangian Hessian evaluations             = 14
Total seconds in IPOPT                               = 1.236

EXIT: Optimal Solution Found.
         nlp  :   t_proc      (avg)   t_wall      (avg)    n_eval
callback_fun  | 423.00us ( 28.20us) 422.34us ( 28.16us)        15
       nlp_f  | 799.00us ( 49.94us) 796.90us ( 49.81us)        16
       nlp_g  |  25.71ms (  1.61ms)  22.51ms (  1.41ms)        16
  nlp_grad_f  |   2.50ms (147.06us)   2.50ms (147.18us)        17
  nlp_hess_l  |  17.63 s (  1.26 s)   1.03 s ( 73.35ms)        14
   nlp_jac_g  |   1.76 s (109.97ms) 163.20ms ( 10.20ms)        16
       total  |  19.44 s ( 19.44 s)   1.24 s (  1.24 s)         1

FATROP output

accepted lam 
 it  obj            cv        du        lg(mu) reg  alpha_du  alpha_pr  ls
  0, 2.5000000e+00, 1.00e+00, 1.92e-02,  2.0,  -.-, 0.00e+00, 0.00e+00, 0x 
  1, 2.1875000e+00, 7.50e-01, 7.21e-01,  0.6,  -.-, 9.85e-01, 2.50e-01, 3h 
  2, 2.4113630e+00, 5.62e-01, 4.81e-01, -0.8,  -.-, 7.20e-01, 2.50e-01, 3h 
  3, 2.4113620e-02, 5.10e-01, 4.46e-01, -0.8,  -.-, 9.25e-02, 9.41e-02, 1H 
  4, 2.4112630e-04, 5.04e-01, 5.50e+00, -0.8,  -.-, 5.61e-01, 1.06e-02, 1H 
degenerate Jacobian
  5, 2.2519644e-01, 3.37e-01, 5.48e+03, -0.8,  -.-, 3.65e-04, 3.31e-01, 2h 
  6, 3.2824739e-01, 3.14e-01, 5.11e+03, -0.8,  -.-, 2.72e-01, 6.85e-02, 2h 
  7, 6.1166171e-01, 2.21e-01, 3.59e+03, -0.8,  -.-, 1.00e+00, 2.98e-01, 2h 
  8, 8.9591865e-01, 5.87e-03, 3.48e+01, -0.8,  0.0, 1.00e+00, 1.00e+00, 1h 
  9, 8.4861091e-01, 1.00e-04, 1.29e+00, -0.8, -0.5, 1.00e+00, 1.00e+00, 1h 
 10, 8.4357947e-01, 8.83e-05, 4.72e-02, -1.5,  -.-, 1.00e+00, 1.00e+00, 1h 
 11, 6.2921858e-01, 2.12e-02, 2.63e-03, -3.4,  -.-, 8.72e-01, 1.00e+00, 1h 
 12, 5.1457245e-01, 1.36e-02, 4.48e-03, -3.4,  -.-, 9.68e-01, 1.00e+00, 1h 
 13, 4.3535812e-01, 1.55e-02, 1.55e-03, -3.4,  -.-, 1.00e+00, 1.00e+00, 1h 
 14, 4.3264790e-01, 1.27e-04, 2.77e-05, -3.4,  -.-, 1.00e+00, 1.00e+00, 1h 
 15, 4.0715170e-01, 3.09e-03, 2.45e-04, -5.0,  -.-, 8.72e-01, 1.00e+00, 1h 
 16, 4.0128508e-01, 1.53e-03, 2.66e-05, -5.0,  -.-, 9.56e-01, 1.00e+00, 1h 
 17, 4.0090454e-01, 1.03e-04, 2.61e-07, -5.0,  -.-, 1.00e+00, 1.00e+00, 1h 
 18, 4.0090971e-01, 2.38e-07, 2.25e-07, -5.0,  -.-, 1.00e+00, 1.00e+00, 1h 
 19, 4.0001186e-01, 8.60e-05, 1.94e-06, -7.6,  -.-, 9.05e-01, 1.00e+00, 1h 
 20, 4.0000277e-01, 1.55e-07, 7.29e-10, -7.6,  -.-, 1.00e+00, 1.00e+00, 1h 
 21, 4.0000010e-01, 1.04e-09, 2.86e-11, -9.0,  -.-, 1.00e+00, 1.00e+00, 1h 
found solution
---- stats ----
compute_sd:     0.004085 s
duinf:          0.000204 s
initialization: 0.001529 s  count: 21
time_FE :       2.0309 s
    eval hess:  1.73628 s  count: 21
    eval jac:   0.239181 s  count: 23
    eval cv:    0.04839 s  count: 36
    eval grad:  0.004099 s  count: 23
    eval obj:   0.002948 s  count: 79
rest  :       0.002365 s
----- 
time_w/o_FE : 0.008183 s
time_FE :       2.0309 s
time_total : 2.03908 s  iterations: 21
         nlp  :   t_proc      (avg)   t_wall      (avg)    n_eval
       nlp_f  |   2.80ms ( 35.49us)   2.81ms ( 35.54us)        79
       nlp_g  |  54.82ms (  1.52ms)  48.17ms (  1.34ms)        36
  nlp_grad_f  |   3.98ms (173.17us)   3.99ms (173.47us)        23
  nlp_hess_l  |  28.96 s (  1.38 s)   1.74 s ( 82.65ms)        21
   nlp_jac_g  |   2.66 s (115.49ms) 238.50ms ( 10.37ms)        23
       total  |  31.75 s ( 31.75 s)   2.05 s (  2.05 s)         1

Summary

With the current implementation of Moco and the above solver settings, FATROP takes more solver time (2.05 s) compared to Ipopt (1.24 s). Notably, FATROP requires more iterations, more function evalulations in every category, and loses the most time in computing the Hessian. This is likely because Moco still relies on finite-differencing, which may be negating the benefits of FATROP.

nickbianco added the Moco This label identifies bugs or desired features to aid Moco development label May 20, 2024

nickbianco self-assigned this May 20, 2024

nickbianco mentioned this issue Sep 9, 2024

[WIP] Add FATROP solver support to Moco #3906

Draft

nickbianco added the Feature label Oct 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support the FATROP NLP solver in Moco #3787

Support the FATROP NLP solver in Moco #3787

nickbianco commented May 20, 2024

nickbianco commented Sep 18, 2024 •

edited

Loading

nickbianco commented Sep 20, 2024 •

edited

Loading

Support the FATROP NLP solver in Moco #3787

Support the FATROP NLP solver in Moco #3787

Comments

nickbianco commented May 20, 2024

nickbianco commented Sep 18, 2024 • edited Loading

nickbianco commented Sep 20, 2024 • edited Loading

Ipopt output

FATROP output

Summary

nickbianco commented Sep 18, 2024 •

edited

Loading

nickbianco commented Sep 20, 2024 •

edited

Loading