Improvements to nls
nls() is the primary nonlinear modelling tool in base R. It has a great many features, but it is about two decades old and has a number of weaknesses, as well as some gaps in documentation. Recently, the proposing mentor for this project has submitted a patch that overcomes one of the deficiencies in nls() and, furthermore, does this in a way that allows legacy operation to continue as the default.
This project aims to provide documentation and possible patches that incorporate other improvements, including better diagnostics to help users understand when the results may be inadequate.
Packages nlsr and minpack.lm both address the lack of Levenberg-Marquardt stabilization in nls(), which uses a plain Gauss-Newton solver to carry out the internal iterations to solve the underlying nonlinear least squares problem. nlsr also offers analytic/symbolic derivatives which can improve the solution or allow it to be found. However, users frequently do not discover these packages, or do not understand some of the details. Merging some of the advantages of these packages into nls() would likely give users better quality output.
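For reference, the two iterations contrasted above can be written as follows. The notation is an assumption of this sketch rather than anything taken from the nls() sources: r(θ) is the residual vector, J its Jacobian at the current parameters θ, δ the step (so the next iterate is θ + δ), λ ≥ 0 a damping parameter, and D a positive diagonal scaling matrix such as the identity or the diagonal of JᵀJ.

```latex
\begin{align}
\text{Gauss-Newton:} \quad & (J^\top J)\,\delta = -J^\top r(\theta) \\
\text{Levenberg-Marquardt:} \quad & (J^\top J + \lambda D)\,\delta = -J^\top r(\theta)
\end{align}
```

When J is nearly rank deficient, JᵀJ is close to singular and the plain Gauss-Newton step can be very large or undefined. Any λ > 0 makes the second system positive definite, which is the stabilization referred to here.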
Package optimx offers access to a number of nonlinear optimization packages. These can be used to minimize (weighted) sum-of-squares objective functions, but they are generally less efficient at finding the solutions than dedicated nonlinear least squares solvers.
Two particular tasks are to merge the analytic derivatives of nlsr into the model parsing (to compute the Jacobian used in the Gauss-Newton equations for nonlinear least squares) and to add a Levenberg-Marquardt stabilization of the solution of those equations.
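As a rough illustration of what an analytic Jacobian from the model expression looks like, the sketch below uses `deriv()` from base R as a stand-in; nlsr has its own derivative tools, which are not used here. The model form, the parameter names `Asym`, `xmid`, `scal`, the variable `tt` and the evaluation points are all assumptions chosen for illustration.

```r
# Three-parameter logistic model; names are illustrative only.
model_expr <- ~ Asym / (1 + exp((xmid - tt) / scal))

# deriv() builds a function whose return value carries a "gradient"
# attribute: one row per observation, one column per parameter.
model_fn <- deriv(model_expr, c("Asym", "xmid", "scal"),
                  function.arg = c("Asym", "xmid", "scal", "tt"))

tt <- c(10, 15, 20, 25, 30)
theta <- c(Asym = 10, xmid = 20, scal = 3)

fitted_vals <- model_fn(Asym = 10, xmid = 20, scal = 3, tt = tt)
jac_analytic <- attr(fitted_vals, "gradient")

# Central-difference approximation of the same Jacobian, for comparison.
model_val <- function(p) {
  as.numeric(model_fn(p[["Asym"]], p[["xmid"]], p[["scal"]], tt))
}
jac_numeric <- sapply(seq_along(theta), function(j) {
  h <- 1e-6 * max(1, abs(theta[j]))
  up <- theta; up[j] <- up[j] + h
  dn <- theta; dn[j] <- dn[j] - h
  (model_val(up) - model_val(dn)) / (2 * h)
})

# Largest elementwise disagreement between the two Jacobians.
max_abs_diff <- max(abs(jac_analytic - jac_numeric))
print(jac_analytic)
print(max_abs_diff)
```

The analytic version needs no step-size choice, which is one reason it can help when a finite-difference Jacobian is too noisy for the iteration to make progress.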
The first stage of the work is to find ways to incorporate these ideas. The second stage is to work out how to make the changes active only through easily executed user actions, so that legacy behaviour is retained, as nls() has a large number of reverse dependencies.
Clearly, any code patches require parallel documentation, and there should be a development vignette to allow for ongoing maintenance. (At the moment, nls() is not very well documented from this perspective.)
Tests can and probably should be simple extensions of existing tests for nls() and/or nlsr and minpack.lm.
nls() has the option of using `algorithm="plinear"`, but the proposing mentor has at least one example where this choice gives a different model for the same formula from the one obtained with the default algorithm. Clearly this could be problematic for users and should be corrected.
The goal of plinear -- partially linear models -- should be addressed.
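To make the formula difference concrete, here is a minimal sketch using the test data from this page. With `algorithm = "plinear"` the right-hand side of the formula holds only the nonlinear part, and nls() estimates the multiplying linear coefficient itself, so a default-form formula passed unchanged would be read as a linear coefficient times the whole expression. This is offered as one way a discrepancy can arise, not necessarily the mentor's example. The starting values are eyeballed assumptions.

```r
growth <- data.frame(
  time = 5:35,
  y = c(0.0074203, 0.3188325, 0.2815891, -0.3171173, -0.0305409,
        0.2266773, -0.0216102, 0.2319695, -0.1082007, 0.2246899,
        0.6144181, 1.1655192, 1.8038330, 2.7644418, 4.1104270,
        5.0470456, 6.1896092, 6.4128618, 7.2974793, 7.8965245,
        8.4364991, 8.8252770, 8.9836204, 9.6607736, 9.1746182,
        9.5348823, 10.0421165, 9.8477874, 9.2886090, 9.3169916,
        9.6270209)
)

# Default algorithm: the asymptote Asym is written into the formula.
fit_default <- nls(y ~ Asym / (1 + exp((xmid - time) / scal)),
                   data = growth,
                   start = list(Asym = 10, xmid = 20, scal = 3))

# plinear: Asym is dropped from the formula and from start; the linear
# coefficient is estimated separately and reported under its own name.
fit_plinear <- nls(y ~ 1 / (1 + exp((xmid - time) / scal)),
                   data = growth,
                   start = list(xmid = 20, scal = 3),
                   algorithm = "plinear")

print(coef(fit_default))
print(coef(fit_plinear))
```

The two calls describe the same curve only because the formula was rewritten by hand; nothing in the interface warns a user who forgets to do so.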
Similarly, nls() can handle indexed parameters, that is, parameters that can be referenced by an integer so that a suite of related estimates can be stored in a table or array. This should be better documented, especially from the point of view of program maintenance and improvement, with the goal of extending the functionality to nlsr or minpack.lm.
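A small sketch of indexed parameters, using the `Puromycin` data set that ships with R rather than anything from this project: the factor `state` selects which element of `Vm` and `K` applies to each observation. The Michaelis-Menten model form and the starting values are assumptions chosen for illustration.

```r
# `state` is a two-level factor, so Vm[state] and K[state] pick one of two
# values per observation; start supplies one starting value per level.
fit_indexed <- nls(rate ~ Vm[state] * conc / (K[state] + conc),
                   data = Puromycin,
                   start = list(Vm = c(200, 160), K = c(0.05, 0.05)))

# One estimate per level for each indexed parameter.
est <- coef(fit_indexed)
print(est)
```

The length of each vector in `start` is what tells nls() how many indexed values to estimate, which is the kind of detail the programmer documentation would need to spell out.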
If successful, the changes will modernize an important tool in base R. Furthermore, if well-organised programmer documentation can be provided, future maintainers will have an easier job.
MENTORS:
- EVALUATING MENTOR: John C. Nash, [email protected]. I have been a mentor and also an Org Admin for R's Google Summer of Code for over a decade. One of the creators of packages nlsr and optimx among others, and author of several books on nonlinear optimization and numerical computing.
- Other Mentors:
- Hans W. Borchers, [email protected]. I have been a mentor and co-mentor for several R-GSoC projects in recent years.
- Heather Turner, [email protected]. I am the lead developer of several statistical modelling packages, notably the gnm package for generalized nonlinear models.
Some data for the tests.
time y
5 0.0074203
6 0.3188325
7 0.2815891
8 -0.3171173
9 -0.0305409
10 0.2266773
11 -0.0216102
12 0.2319695
13 -0.1082007
14 0.2246899
15 0.6144181
16 1.1655192
17 1.8038330
18 2.7644418
19 4.1104270
20 5.0470456
21 6.1896092
22 6.4128618
23 7.2974793
24 7.8965245
25 8.4364991
26 8.8252770
27 8.9836204
28 9.6607736
29 9.1746182
30 9.5348823
31 10.0421165
32 9.8477874
33 9.2886090
34 9.3169916
35 9.6270209
Estimate, or try to estimate, a logistic sigmoid growth curve for these data.
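One possible starting point is sketched below. The parameterization with `Asym`, `xmid` and `scal` is an assumption (it is one common way to write a three-parameter logistic), and the starting values are read off the data by eye: a plateau near 10, half height near time 20, and a guessed scale of 3.

```r
growth <- data.frame(
  time = 5:35,
  y = c(0.0074203, 0.3188325, 0.2815891, -0.3171173, -0.0305409,
        0.2266773, -0.0216102, 0.2319695, -0.1082007, 0.2246899,
        0.6144181, 1.1655192, 1.8038330, 2.7644418, 4.1104270,
        5.0470456, 6.1896092, 6.4128618, 7.2974793, 7.8965245,
        8.4364991, 8.8252770, 8.9836204, 9.6607736, 9.1746182,
        9.5348823, 10.0421165, 9.8477874, 9.2886090, 9.3169916,
        9.6270209)
)

# Default (Gauss-Newton) nls() fit from eyeballed starting values.
fit <- nls(y ~ Asym / (1 + exp((xmid - time) / scal)),
           data = growth,
           start = list(Asym = 10, xmid = 20, scal = 3))

est <- coef(fit)
print(est)
print(deviance(fit))  # residual sum of squares at the reported solution
```

Trying starting values further from the data is a quick way to see how sensitive the default solver is.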
Estimate, or try to estimate, the alternative form of the three-parameter logistic growth curve. (The following LaTeX form may not show up on GitHub.)
Can you explain why this is more difficult to estimate?
Convert the problem to one that uses a function for the residuals (and ideally the Jacobian) and solve the nonlinear least squares problem with a suitable tool from packages `nlsr` and `minpack.lm`.
Show how to do this with both analytic Jacobian and one or more approximations.
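The residual-function formulation can be sketched in base R as follows. The hand-written `marquardt()` loop is a simplified illustration of a Marquardt-damped iteration, not the algorithm used by nlsr or minpack.lm; the function names, the model parameterization and the starting values are all assumptions.

```r
tt <- 5:35
y <- c(0.0074203, 0.3188325, 0.2815891, -0.3171173, -0.0305409,
       0.2266773, -0.0216102, 0.2319695, -0.1082007, 0.2246899,
       0.6144181, 1.1655192, 1.8038330, 2.7644418, 4.1104270,
       5.0470456, 6.1896092, 6.4128618, 7.2974793, 7.8965245,
       8.4364991, 8.8252770, 8.9836204, 9.6607736, 9.1746182,
       9.5348823, 10.0421165, 9.8477874, 9.2886090, 9.3169916,
       9.6270209)

# Residuals (model minus data) for p = c(Asym, xmid, scal).
resfn <- function(p, tt, y) p[1] / (1 + exp((p[2] - tt) / p[3])) - y

# Analytic Jacobian of the residuals with respect to p.
jacfn <- function(p, tt, y) {
  e <- exp((p[2] - tt) / p[3])
  d <- 1 + e
  cbind(Asym = 1 / d,
        xmid = -p[1] * e / (p[3] * d^2),
        scal = p[1] * e * (p[2] - tt) / (p[3]^2 * d^2))
}

# Forward-difference approximation to the same Jacobian.
jac_fd <- function(p, tt, y, h = 1e-7) {
  r0 <- resfn(p, tt, y)
  sapply(seq_along(p), function(j) {
    step <- h * max(1, abs(p[j]))
    pj <- p
    pj[j] <- pj[j] + step
    (resfn(pj, tt, y) - r0) / step
  })
}

# Simplified Marquardt iteration: raise lambda until a step reduces the
# sum of squares, lower it again after each accepted step.
marquardt <- function(p, resfn, jacfn, ..., lambda = 1e-4, maxit = 200) {
  ss <- sum(resfn(p, ...)^2)
  for (iter in seq_len(maxit)) {
    r <- resfn(p, ...)
    J <- jacfn(p, ...)
    JtJ <- crossprod(J)
    g <- drop(crossprod(J, r))
    improved <- FALSE
    while (!improved && lambda < 1e10) {
      delta <- solve(JtJ + lambda * diag(diag(JtJ)), -g)
      p_try <- p + delta
      ss_try <- sum(resfn(p_try, ...)^2)
      if (is.finite(ss_try) && ss_try < ss) improved <- TRUE
      else lambda <- lambda * 10
    }
    if (!improved) break
    converged <- (ss - ss_try) < 1e-12 * (1 + ss)
    p <- p_try
    ss <- ss_try
    lambda <- lambda / 10
    if (converged) break
  }
  list(par = p, ss = ss, iterations = iter)
}

start <- c(Asym = 10, xmid = 20, scal = 3)
sol_analytic <- marquardt(start, resfn, jacfn, tt = tt, y = y)
sol_fd <- marquardt(start, resfn, jac_fd, tt = tt, y = y)
print(sol_analytic$par)
print(sol_fd$par)
```

Assuming the packages are installed, the same `resfn` and `jacfn` should be usable with `minpack.lm::nls.lm(par = start, fn = resfn, jac = jacfn, tt = tt, y = y)` or `nlsr::nlfb(start, resfn, jacfn, tt = tt, y = y)`; the respective help pages give the exact argument conventions.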
The Evaluating Mentor has prepared solutions to each of the tests to verify that they are doable.
Students, please post a link to your test results here.
S No. | STUDENT NAME | GITHUB PROFILE | TEST RESULTS LINK |
---|---|---|---|
1 | Aarnob Guha | KW781 | https://github.com/KW781/nls-improvements-Tests |
2 | Arkajyoti Bhattacharjee | ArkaB-DS | https://github.com/ArkaB-DS/Improvements-to-nls--Solutions-to-Tests |