
Commit 685b37d

Merge pull request #261 from SciML/docs
Change OptimizationFunction documentation to use docstrings
2 parents cb664ba + 88e4b43 commit 685b37d

File tree: 9 files changed, +243 −143 lines

Project.toml

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ LoggingExtras = "0.4, 0.5"
 ProgressLogging = "0.1"
 Reexport = "0.2, 1.0"
 Requires = "1.0"
-SciMLBase = "1.34"
+SciMLBase = "1.37.1"
 TerminalLoggers = "0.1"
 julia = "1.6"

Lines changed: 13 additions & 141 deletions

@@ -1,158 +1,30 @@
 # [OptimizationFunction](@id optfunction)

-The `OptimizationFunction` type is a function type that holds all of
-the extra differentiation data required to do fast and accurate
-optimization. The signature for the constructor is:
-
-```julia
-OptimizationFunction{iip}(f,adtype=NoAD();
-                          grad=nothing,
-                          hess=nothing,
-                          hv=nothing,
-                          cons=nothing,
-                          cons_j=nothing,
-                          cons_h=nothing)
-```
-
-The keyword arguments are as follows:
-
-- `grad`: Gradient
-- `hess`: Hessian
-- `hv`: Hessian vector products `hv(du,u,p,t,v)` = H*v
-- `cons`: Constraint function
-- `cons_j`
-- `cons_h`
-
-### Defining Optimization Functions Via AD
-
-While using the keyword arguments gives the user control over defining
-all of the possible functions, the simplest way to handle the generation
-of an `OptimizationFunction` is by specifying an AD type. By doing so,
-this will automatically fill in all of the extra functions. For example,
-
-```julia
-OptimizationFunction(f,AutoZygote())
+```@docs
+OptimizationFunction
 ```

-will use [Zygote.jl](https://github.com/FluxML/Zygote.jl) to define
-all of the necessary functions. Note that if any functions are defined
-directly, the auto-AD definition does not overwrite the user's choice.
+## Automatic Differentiation Construction Choice Recommendations

 The choices for the auto-AD fill-ins with quick descriptions are:

 - `AutoForwardDiff()`: The fastest choice for small optimizations
 - `AutoReverseDiff(compile=false)`: A fast choice for large scalar optimizations
 - `AutoTracker()`: Like ReverseDiff but GPU-compatible
-- `AutoZygote()`: The fastest choice
+- `AutoZygote()`: The fastest choice for non-mutating array-based (BLAS) functions
 - `AutoFiniteDiff()`: Finite differencing, not optimal but always applicable
 - `AutoModelingToolkit()`: The fastest choice for large scalar optimizations

-The following sections describe the Auto-AD choices in detail.
-
-### AutoForwardDiff
-
-This uses the [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)
-package. It is the fastest choice for small systems, especially with
-heavy scalar interactions. It is easy to use and compatible with most
-pure is Julia functions which have loose type restrictions. However,
-because it's forward-mode, it scales poorly in comparison to other AD
-choices. Hessian construction is suboptimal as it uses the forward-over-forward
-approach.
-
-- Compatible with GPUs
-- Compatible with Hessian-based optimization
-- Compatible with Hv-based optimization
-- Compatible with constraints
-
-### AutoReverseDiff
-
-This uses the [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl)
-package. `AutoReverseDiff` has a default argument, `compile`, which
-denotes whether the reverse pass should be compiled. **`compile` should only
-be set to `true` if `f` contains no branches (if statements, while loops)
-otherwise it can produce incorrect derivatives!**.
-
-`AutoReverseDiff` is generally applicable to many pure Julia codes,
-and with `compile=true` it is one of the fastest options on code with
-heavy scalar interactions. Hessian calculations are fast by mixing
-ForwardDiff with ReverseDiff for forward-over-reverse. However, its
-performance can falter when `compile=false`.
-
-- Not compatible with GPUs
-- Compatible with Hessian-based optimization by mixing with ForwardDiff
-- Compatible with Hv-based optimization by mixing with ForwardDiff
-- Not compatible with constraint functions
+## Automatic Differentiation Choice API

-### AutoTracker
-
-This uses the [Tracker.jl](https://github.com/FluxML/Tracker.jl) package.
-Generally slower than ReverseDiff, it is generally applicable to many
-pure Julia codes.
-
-- Compatible with GPUs
-- Not compatible with Hessian-based optimization
-- Not compatible with Hv-based optimization
-- Not compatible with constraint functions
-
-### AutoZygote
-
-This uses the [Zygote.jl](https://github.com/FluxML/Zygote.jl) package.
-This is the staple reverse-mode AD that handles a large portion of
-Julia with good efficiency. Hessian construction is fast via
-forward-over-reverse mixing ForwardDiff.jl with Zygote.jl
-
-- Compatible with GPUs
-- Compatible with Hessian-based optimization via ForwardDiff
-- Compatible with Hv-based optimization via ForwardDiff
-- Not compatible with constraint functions
-
-### AutoFiniteDiff
-
-This uses [FiniteDiff.jl](https://github.com/JuliaDiff/FiniteDiff.jl).
-While to necessarily the most efficient in any case, this is the only
-choice that doesn't require the `f` function to be automatically
-differentiable, which means it applies to any choice. However, because
-it's using finite differencing, one needs to be careful as this procedure
-introduces numerical error into the derivative estimates.
-
-- Compatible with GPUs
-- Compatible with Hessian-based optimization
-- Compatible with Hv-based optimization
-- Not compatible with constraint functions
-
-### AutoModelingToolkit
-
-This uses the [ModelingToolkit.jl](https://github.com/SciML/ModelingToolkit.jl)
-symbolic system for automatically converting the `f` function into
-a symbolic equation and uses symbolic differentiation in order to generate
-a fast derivative code. Note that this will also compile a new version
-of your `f` function that is automatically optimized. Because of the
-required symbolic analysis, the state and parameters are required in
-the function definition, i.e.:
+The following sections describe the Auto-AD choices in detail.

-```julia
-OptimizationFunction(f,AutoModelingToolkit(),x0,p,
-                     grad = false, hess = false, sparse = false,
-                     checkbounds = false,
-                     linenumbers = true,
-                     parallel=SerialForm(),
-                     kwargs...)
+```@docs
+AutoForwardDiff
+AutoFiniteDiff
+AutoReverseDiff
+AutoZygote
+AutoTracker
+AutoModelingToolkit
 ```

-The special keyword arguments are as follows:
-
-- `grad`: whether to symbolically generate the gradient function.
-- `hess`: whether to symbolically generate the Hessian function.
-- `sparse`: whether to use sparsity detection in the Hessian.
-- `checkbounds`: whether to perform bounds checks in the generated code.
-- `linenumbers`: whether to include line numbers in the generated code.
-- `parallel`: whether to automatically parallelize the calculations.
-
-For more information, see the [ModelingToolkit.jl `OptimizationSystem` documentation](https://mtk.sciml.ai/dev/systems/OptimizationSystem/)
-
-Summary:
-
-- Not compatible with GPUs
-- Compatible with Hessian-based optimization
-- Not compatible with Hv-based optimization
-- Not compatible with constraint functions
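For reference, a minimal sketch of how one of the recommended AD choices is attached to an `OptimizationFunction` (illustrative only: it assumes the standard GalacticOptim.jl `OptimizationProblem`/`solve` workflow with Optim.jl as the backend and a Rosenbrock test objective; it is not taken from this diff):

```julia
using GalacticOptim, Optim

# Illustrative objective: f(x, p) with state x and parameters p
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

x0 = zeros(2)
p = [1.0, 100.0]

# Specifying an AD type fills in the gradient/Hessian functions automatically
optf = OptimizationFunction(rosenbrock, AutoForwardDiff())
prob = OptimizationProblem(optf, x0, p)
sol = solve(prob, BFGS())   # BFGS uses the auto-generated gradient
```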

src/function/finitediff.jl

Lines changed: 38 additions & 0 deletions

@@ -1,3 +1,41 @@
+"""
+AutoFiniteDiff{T1,T2} <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoFiniteDiff();kwargs...)
+```
+
+This uses [FiniteDiff.jl](https://github.com/JuliaDiff/FiniteDiff.jl).
+While not necessarily the most efficient in any case, this is the only
+choice that doesn't require the `f` function to be automatically
+differentiable, which means it can be applied to any function. However, because
+it's using finite differencing, one needs to be careful as this procedure
+introduces numerical error into the derivative estimates.
+
+- Compatible with GPUs
+- Compatible with Hessian-based optimization
+- Compatible with Hv-based optimization
+- Not compatible with constraint functions
+
+Note that only the unspecified derivative functions are defined. For example,
+if a `hess` function is supplied to the `OptimizationFunction`, then the
+Hessian is not defined via FiniteDiff.
+
+## Constructor
+
+```julia
+AutoFiniteDiff(;fdtype = Val(:forward), fdhtype = Val(:hcentral))
+```
+
+- `fdtype`: the method used for defining the gradient
+- `fdhtype`: the method used for defining the Hessian
+
+For more information on the derivative type specifiers, see the
+[FiniteDiff.jl documentation](https://github.com/JuliaDiff/FiniteDiff.jl).
+"""
 struct AutoFiniteDiff{T1,T2} <: AbstractADType
     fdtype::T1
     fdhtype::T2
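A hedged usage sketch of the constructor documented above (the central-difference `fdtype` and the Rosenbrock objective are illustrative assumptions, not part of this diff):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# Central differences for the gradient; fdhtype keeps its :hcentral default
adtype = AutoFiniteDiff(fdtype = Val(:central))

optf = OptimizationFunction(rosenbrock, adtype)
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())   # gradient is filled in via finite differencing
```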

src/function/forwarddiff.jl

Lines changed: 28 additions & 0 deletions

@@ -1,4 +1,32 @@
+"""
+AutoForwardDiff{chunksize} <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoForwardDiff();kwargs...)
+```
+
+This uses the [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)
+package. It is the fastest choice for small systems, especially with
+heavy scalar interactions. It is easy to use and compatible with most
+pure Julia functions that have loose type restrictions. However,
+because it's forward-mode, it scales poorly in comparison to other AD
+choices. Hessian construction is suboptimal as it uses the forward-over-forward
+approach.
+
+- Compatible with GPUs
+- Compatible with Hessian-based optimization
+- Compatible with Hv-based optimization
+- Compatible with constraints
+
+Note that only the unspecified derivative functions are defined. For example,
+if a `hess` function is supplied to the `OptimizationFunction`, then the
+Hessian is not defined via ForwardDiff.
+"""
 struct AutoForwardDiff{chunksize} <: AbstractADType end
+
 function AutoForwardDiff(chunksize=nothing)
     AutoForwardDiff{chunksize}()
 end
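A sketch of the Hessian-based use mentioned in the docstring (Optim.jl's `Newton()` and the objective are assumptions for illustration):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# ForwardDiff fills in both the gradient and the (forward-over-forward) Hessian,
# so a second-order method can be used directly
optf = OptimizationFunction(rosenbrock, AutoForwardDiff())
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, Newton())
```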

src/function/function.jl

Lines changed: 25 additions & 0 deletions

@@ -1,3 +1,28 @@
+"""
+instantiate_function(f, x, ::AbstractADType, p, num_cons = 0)::OptimizationFunction
+
+This function is used internally by GalacticOptim.jl to construct
+the necessary extra functions (gradients, Hessians, etc.) before
+optimization. Each of the ADType dispatches uses the supplied automatic
+differentiation type in order to specify how the construction process
+occurs.
+
+If no ADType is given, then the default `NoAD` dispatch simply
+defines closures on any supplied gradient function to enclose the
+parameters to match the interfaces for the specific optimization
+libraries (i.e. (G,x)->f.grad(G,x,p)). If a function is not given
+and the `NoAD` dispatch is used, or if the AD dispatch is currently
+not capable of defining said derivative, then the constructed
+`OptimizationFunction` will simply use `nothing` to specify an undefined
+function.
+
+The return of `instantiate_function` is an `OptimizationFunction` which
+is then used in the optimization process. If an optimizer requires a
+function that is not defined, an error is thrown.
+
+For more information on the use of automatic differentiation, see the
+documentation of the `AbstractADType` types.
+"""
 function instantiate_function(f, x, ::AbstractADType, p, num_cons = 0)
     grad = f.grad === nothing ? nothing : (G,x)->f.grad(G,x,p)
     hess = f.hess === nothing ? nothing : (H,x)->f.hess(H,x,p)
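To illustrate the `NoAD` closure behavior described in the docstring, a sketch of supplying a hand-written gradient (the analytic gradient and names are illustrative assumptions):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# In-place gradient with signature (G, x, p); the NoAD path wraps it as
# (G, x) -> grad!(G, x, p) so the optimizer sees the parameter-free interface
function rosenbrock_grad!(G, x, p)
    G[1] = -2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2)
    G[2] = 2 * p[2] * (x[2] - x[1]^2)
end

optf = OptimizationFunction(rosenbrock; grad = rosenbrock_grad!)
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())   # no AD: the supplied gradient is used as-is
```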

src/function/mtk.jl

Lines changed: 47 additions & 0 deletions

@@ -1,3 +1,50 @@
+"""
+AutoModelingToolkit <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoModelingToolkit();kwargs...)
+```
+
+This uses the [ModelingToolkit.jl](https://github.com/SciML/ModelingToolkit.jl)
+symbolic system for automatically converting the `f` function into
+a symbolic equation and uses symbolic differentiation in order to generate
+fast derivative code. Note that this will also compile a new version
+of your `f` function that is automatically optimized. Because of the
+required symbolic analysis, the state and parameters are required in
+the function definition.
+
+Summary:
+
+- Not compatible with GPUs
+- Compatible with Hessian-based optimization
+- Not compatible with Hv-based optimization
+- Not compatible with constraint functions
+
+## Constructor
+
+```julia
+OptimizationFunction(f,AutoModelingToolkit(),x0,p,
+                     grad = false, hess = false, sparse = false,
+                     checkbounds = false,
+                     linenumbers = true,
+                     parallel=SerialForm(),
+                     kwargs...)
+```
+
+The special keyword arguments are as follows:
+
+- `grad`: whether to symbolically generate the gradient function.
+- `hess`: whether to symbolically generate the Hessian function.
+- `sparse`: whether to use sparsity detection in the Hessian.
+- `checkbounds`: whether to perform bounds checks in the generated code.
+- `linenumbers`: whether to include line numbers in the generated code.
+- `parallel`: whether to automatically parallelize the calculations.
+
+For more information, see the [ModelingToolkit.jl `OptimizationSystem` documentation](https://mtk.sciml.ai/dev/systems/OptimizationSystem/)
+"""
 struct AutoModelingToolkit <: AbstractADType
     obj_sparse::Bool
     cons_sparse::Bool
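A hedged sketch of the symbolic path (the zero-argument `AutoModelingToolkit()` call mirrors the docstring's usage line; the objective and the choice of `Newton()` are illustrative assumptions):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# ModelingToolkit traces the objective symbolically, so concrete state and
# parameter values must be available when the problem is built
optf = OptimizationFunction(rosenbrock, AutoModelingToolkit())
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, Newton())   # symbolically generated gradient and Hessian
```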

src/function/reversediff.jl

Lines changed: 44 additions & 1 deletion

@@ -1,4 +1,47 @@
-struct AutoReverseDiff <: AbstractADType end
+"""
+AutoReverseDiff <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoReverseDiff();kwargs...)
+```
+
+This uses the [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl)
+package. `AutoReverseDiff` has a default argument, `compile`, which
+denotes whether the reverse pass should be compiled. **`compile` should only
+be set to `true` if `f` contains no branches (if statements, while loops);
+otherwise it can produce incorrect derivatives!**
+
+`AutoReverseDiff` is generally applicable to many pure Julia codes,
+and with `compile=true` it is one of the fastest options on code with
+heavy scalar interactions. Hessian calculations are fast by mixing
+ForwardDiff with ReverseDiff for forward-over-reverse. However, its
+performance can falter when `compile=false`.
+
+- Not compatible with GPUs
+- Compatible with Hessian-based optimization by mixing with ForwardDiff
+- Compatible with Hv-based optimization by mixing with ForwardDiff
+- Not compatible with constraint functions
+
+Note that only the unspecified derivative functions are defined. For example,
+if a `hess` function is supplied to the `OptimizationFunction`, then the
+Hessian is not defined via ReverseDiff.
+
+## Constructor
+
+```julia
+AutoReverseDiff(;compile = false)
+```
+
+#### Note: currently compilation is not defined/used!
+"""
+struct AutoReverseDiff <: AbstractADType
+    compile::Bool
+end
+
+AutoReverseDiff(;compile = false) = AutoReverseDiff(compile)

 function instantiate_function(f, x, adtype::AutoReverseDiff, p=SciMLBase.NullParameters(), num_cons = 0)
     num_cons != 0 && error("AutoReverseDiff does not currently support constraints")
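A usage sketch for the new keyword constructor (illustrative objective; note the docstring's caveat that `compile` is not yet used):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# compile = false is the safe default; a compiled tape is only valid for
# branch-free objectives
optf = OptimizationFunction(rosenbrock, AutoReverseDiff(compile = false))
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())   # gradient computed by ReverseDiff
```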
