
Commit 685b37d

Merge pull request #261 from SciML/docs
Change OptimizationFunction documentation to use docstrings
2 parents cb664ba + 88e4b43 commit 685b37d

File tree: 9 files changed, +243 −143 lines

Project.toml

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ LoggingExtras = "0.4, 0.5"
 ProgressLogging = "0.1"
 Reexport = "0.2, 1.0"
 Requires = "1.0"
-SciMLBase = "1.34"
+SciMLBase = "1.37.1"
 TerminalLoggers = "0.1"
 julia = "1.6"

Lines changed: 13 additions & 141 deletions

@@ -1,158 +1,30 @@
 # [OptimizationFunction](@id optfunction)

-The `OptimizationFunction` type is a function type that holds all of
-the extra differentiation data required to do fast and accurate
-optimization. The signature for the constructor is:
-
-```julia
-OptimizationFunction{iip}(f,adtype=NoAD();
-                          grad=nothing,
-                          hess=nothing,
-                          hv=nothing,
-                          cons=nothing,
-                          cons_j=nothing,
-                          cons_h=nothing)
-```
-
-The keyword arguments are as follows:
-
-- `grad`: Gradient
-- `hess`: Hessian
-- `hv`: Hessian vector products `hv(du,u,p,t,v)` = H*v
-- `cons`: Constraint function
-- `cons_j`
-- `cons_h`
-
-### Defining Optimization Functions Via AD
-
-While using the keyword arguments gives the user control over defining
-all of the possible functions, the simplest way to handle the generation
-of an `OptimizationFunction` is by specifying an AD type. By doing so,
-this will automatically fill in all of the extra functions. For example,
-
-```julia
-OptimizationFunction(f,AutoZygote())
+```@docs
+OptimizationFunction
 ```

-will use [Zygote.jl](https://github.com/FluxML/Zygote.jl) to define
-all of the necessary functions. Note that if any functions are defined
-directly, the auto-AD definition does not overwrite the user's choice.
+## Automatic Differentiation Construction Choice Recommendations

 The choices for the auto-AD fill-ins with quick descriptions are:

 - `AutoForwardDiff()`: The fastest choice for small optimizations
 - `AutoReverseDiff(compile=false)`: A fast choice for large scalar optimizations
 - `AutoTracker()`: Like ReverseDiff but GPU-compatible
-- `AutoZygote()`: The fastest choice
+- `AutoZygote()`: The fastest choice for non-mutating array-based (BLAS) functions
 - `AutoFiniteDiff()`: Finite differencing, not optimal but always applicable
 - `AutoModelingToolkit()`: The fastest choice for large scalar optimizations

-The following sections describe the Auto-AD choices in detail.
-
-### AutoForwardDiff
-
-This uses the [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)
-package. It is the fastest choice for small systems, especially with
-heavy scalar interactions. It is easy to use and compatible with most
-pure is Julia functions which have loose type restrictions. However,
-because it's forward-mode, it scales poorly in comparison to other AD
-choices. Hessian construction is suboptimal as it uses the forward-over-forward
-approach.
-
-- Compatible with GPUs
-- Compatible with Hessian-based optimization
-- Compatible with Hv-based optimization
-- Compatible with constraints
-
-### AutoReverseDiff
-
-This uses the [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl)
-package. `AutoReverseDiff` has a default argument, `compile`, which
-denotes whether the reverse pass should be compiled. **`compile` should only
-be set to `true` if `f` contains no branches (if statements, while loops)
-otherwise it can produce incorrect derivatives!**.
-
-`AutoReverseDiff` is generally applicable to many pure Julia codes,
-and with `compile=true` it is one of the fastest options on code with
-heavy scalar interactions. Hessian calculations are fast by mixing
-ForwardDiff with ReverseDiff for forward-over-reverse. However, its
-performance can falter when `compile=false`.
-
-- Not compatible with GPUs
-- Compatible with Hessian-based optimization by mixing with ForwardDiff
-- Compatible with Hv-based optimization by mixing with ForwardDiff
-- Not compatible with constraint functions
+## Automatic Differentiation Choice API

-### AutoTracker
-
-This uses the [Tracker.jl](https://github.com/FluxML/Tracker.jl) package.
-Generally slower than ReverseDiff, it is generally applicable to many
-pure Julia codes.
-
-- Compatible with GPUs
-- Not compatible with Hessian-based optimization
-- Not compatible with Hv-based optimization
-- Not compatible with constraint functions
-
-### AutoZygote
-
-This uses the [Zygote.jl](https://github.com/FluxML/Zygote.jl) package.
-This is the staple reverse-mode AD that handles a large portion of
-Julia with good efficiency. Hessian construction is fast via
-forward-over-reverse mixing ForwardDiff.jl with Zygote.jl
-
-- Compatible with GPUs
-- Compatible with Hessian-based optimization via ForwardDiff
-- Compatible with Hv-based optimization via ForwardDiff
-- Not compatible with constraint functions
-
-### AutoFiniteDiff
-
-This uses [FiniteDiff.jl](https://github.com/JuliaDiff/FiniteDiff.jl).
-While to necessarily the most efficient in any case, this is the only
-choice that doesn't require the `f` function to be automatically
-differentiable, which means it applies to any choice. However, because
-it's using finite differencing, one needs to be careful as this procedure
-introduces numerical error into the derivative estimates.
-
-- Compatible with GPUs
-- Compatible with Hessian-based optimization
-- Compatible with Hv-based optimization
-- Not compatible with constraint functions
-
-### AutoModelingToolkit
-
-This uses the [ModelingToolkit.jl](https://github.com/SciML/ModelingToolkit.jl)
-symbolic system for automatically converting the `f` function into
-a symbolic equation and uses symbolic differentiation in order to generate
-a fast derivative code. Note that this will also compile a new version
-of your `f` function that is automatically optimized. Because of the
-required symbolic analysis, the state and parameters are required in
-the function definition, i.e.:
+The following sections describe the Auto-AD choices in detail.

-```julia
-OptimizationFunction(f,AutoModelingToolkit(),x0,p,
-                     grad = false, hess = false, sparse = false,
-                     checkbounds = false,
-                     linenumbers = true,
-                     parallel=SerialForm(),
-                     kwargs...)
+```@docs
+AutoForwardDiff
+AutoFiniteDiff
+AutoReverseDiff
+AutoZygote
+AutoTracker
+AutoModelingToolkit
 ```

-The special keyword arguments are as follows:
-
-- `grad`: whether to symbolically generate the gradient function.
-- `hess`: whether to symbolically generate the Hessian function.
-- `sparse`: whether to use sparsity detection in the Hessian.
-- `checkbounds`: whether to perform bounds checks in the generated code.
-- `linenumbers`: whether to include line numbers in the generated code.
-- `parallel`: whether to automatically parallelize the calculations.
-
-For more information, see the [ModelingToolkit.jl `OptimizationSystem` documentation](https://mtk.sciml.ai/dev/systems/OptimizationSystem/)
-
-Summary:
-
-- Not compatible with GPUs
-- Compatible with Hessian-based optimization
-- Not compatible with Hv-based optimization
-- Not compatible with constraint functions
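For reference, a minimal sketch of how one of the recommended AD choices is attached to an `OptimizationFunction` (illustrative only: it assumes the standard GalacticOptim.jl `OptimizationProblem`/`solve` workflow with Optim.jl as the backend and a Rosenbrock test objective; it is not taken from this diff):

```julia
using GalacticOptim, Optim

# Illustrative objective: f(x, p) with state x and parameters p
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

x0 = zeros(2)
p = [1.0, 100.0]

# Specifying an AD type fills in the gradient/Hessian functions automatically
optf = OptimizationFunction(rosenbrock, AutoForwardDiff())
prob = OptimizationProblem(optf, x0, p)
sol = solve(prob, BFGS())   # BFGS uses the auto-generated gradient
```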

src/function/finitediff.jl

Lines changed: 38 additions & 0 deletions

@@ -1,3 +1,41 @@
+"""
+AutoFiniteDiff{T1,T2} <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoFiniteDiff();kwargs...)
+```
+
+This uses [FiniteDiff.jl](https://github.com/JuliaDiff/FiniteDiff.jl).
+While not necessarily the most efficient in any case, this is the only
+choice that doesn't require the `f` function to be automatically
+differentiable, which means it can be applied to any function. However, because
+it's using finite differencing, one needs to be careful as this procedure
+introduces numerical error into the derivative estimates.
+
+- Compatible with GPUs
+- Compatible with Hessian-based optimization
+- Compatible with Hv-based optimization
+- Not compatible with constraint functions
+
+Note that only the unspecified derivative functions are defined. For example,
+if a `hess` function is supplied to the `OptimizationFunction`, then the
+Hessian is not defined via FiniteDiff.
+
+## Constructor
+
+```julia
+AutoFiniteDiff(;fdtype = Val(:forward), fdhtype = Val(:hcentral))
+```
+
+- `fdtype`: the method used for defining the gradient
+- `fdhtype`: the method used for defining the Hessian
+
+For more information on the derivative type specifiers, see the
+[FiniteDiff.jl documentation](https://github.com/JuliaDiff/FiniteDiff.jl).
+"""
 struct AutoFiniteDiff{T1,T2} <: AbstractADType
     fdtype::T1
     fdhtype::T2
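A hedged usage sketch of the constructor documented above (the central-difference `fdtype` and the Rosenbrock objective are illustrative assumptions, not part of this diff):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# Central differences for the gradient; fdhtype keeps its :hcentral default
adtype = AutoFiniteDiff(fdtype = Val(:central))

optf = OptimizationFunction(rosenbrock, adtype)
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())   # gradient is filled in via finite differencing
```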

src/function/forwarddiff.jl

Lines changed: 28 additions & 0 deletions

@@ -1,4 +1,32 @@
+"""
+AutoForwardDiff{chunksize} <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoForwardDiff();kwargs...)
+```
+
+This uses the [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)
+package. It is the fastest choice for small systems, especially with
+heavy scalar interactions. It is easy to use and compatible with most
+pure Julia functions that have loose type restrictions. However,
+because it's forward-mode, it scales poorly in comparison to other AD
+choices. Hessian construction is suboptimal as it uses the forward-over-forward
+approach.
+
+- Compatible with GPUs
+- Compatible with Hessian-based optimization
+- Compatible with Hv-based optimization
+- Compatible with constraints
+
+Note that only the unspecified derivative functions are defined. For example,
+if a `hess` function is supplied to the `OptimizationFunction`, then the
+Hessian is not defined via ForwardDiff.
+"""
 struct AutoForwardDiff{chunksize} <: AbstractADType end
+
 function AutoForwardDiff(chunksize=nothing)
     AutoForwardDiff{chunksize}()
 end
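A sketch of the Hessian-based use mentioned in the docstring (Optim.jl's `Newton()` and the objective are assumptions for illustration):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# ForwardDiff fills in both the gradient and the (forward-over-forward) Hessian,
# so a second-order method can be used directly
optf = OptimizationFunction(rosenbrock, AutoForwardDiff())
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, Newton())
```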

src/function/function.jl

Lines changed: 25 additions & 0 deletions

@@ -1,3 +1,28 @@
+"""
+instantiate_function(f, x, ::AbstractADType, p, num_cons = 0)::OptimizationFunction
+
+This function is used internally by GalacticOptim.jl to construct
+the necessary extra functions (gradients, Hessians, etc.) before
+optimization. Each of the ADType dispatches uses the supplied automatic
+differentiation type in order to specify how the construction process
+occurs.
+
+If no ADType is given, then the default `NoAD` dispatch simply
+defines closures on any supplied gradient function to enclose the
+parameters to match the interfaces for the specific optimization
+libraries (i.e. (G,x)->f.grad(G,x,p)). If a function is not given
+and the `NoAD` dispatch is used, or if the AD dispatch is currently
+not capable of defining said derivative, then the constructed
+`OptimizationFunction` will simply use `nothing` to specify an undefined
+function.
+
+The return of `instantiate_function` is an `OptimizationFunction` which
+is then used in the optimization process. If an optimizer requires a
+function that is not defined, an error is thrown.
+
+For more information on the use of automatic differentiation, see the
+documentation of the `AbstractADType` types.
+"""
 function instantiate_function(f, x, ::AbstractADType, p, num_cons = 0)
     grad = f.grad === nothing ? nothing : (G,x)->f.grad(G,x,p)
     hess = f.hess === nothing ? nothing : (H,x)->f.hess(H,x,p)
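To illustrate the `NoAD` closure behavior described in the docstring, a sketch of supplying a hand-written gradient (the analytic gradient and names are illustrative assumptions):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# In-place gradient with signature (G, x, p); the NoAD path wraps it as
# (G, x) -> grad!(G, x, p) so the optimizer sees the parameter-free interface
function rosenbrock_grad!(G, x, p)
    G[1] = -2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2)
    G[2] = 2 * p[2] * (x[2] - x[1]^2)
end

optf = OptimizationFunction(rosenbrock; grad = rosenbrock_grad!)
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())   # no AD: the supplied gradient is used as-is
```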

src/function/mtk.jl

Lines changed: 47 additions & 0 deletions

@@ -1,3 +1,50 @@
+"""
+AutoModelingToolkit <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoModelingToolkit();kwargs...)
+```
+
+This uses the [ModelingToolkit.jl](https://github.com/SciML/ModelingToolkit.jl)
+symbolic system for automatically converting the `f` function into
+a symbolic equation and uses symbolic differentiation in order to generate
+fast derivative code. Note that this will also compile a new version
+of your `f` function that is automatically optimized. Because of the
+required symbolic analysis, the state and parameters are required in
+the function definition.
+
+Summary:
+
+- Not compatible with GPUs
+- Compatible with Hessian-based optimization
+- Not compatible with Hv-based optimization
+- Not compatible with constraint functions
+
+## Constructor
+
+```julia
+OptimizationFunction(f,AutoModelingToolkit(),x0,p,
+                     grad = false, hess = false, sparse = false,
+                     checkbounds = false,
+                     linenumbers = true,
+                     parallel=SerialForm(),
+                     kwargs...)
+```
+
+The special keyword arguments are as follows:
+
+- `grad`: whether to symbolically generate the gradient function.
+- `hess`: whether to symbolically generate the Hessian function.
+- `sparse`: whether to use sparsity detection in the Hessian.
+- `checkbounds`: whether to perform bounds checks in the generated code.
+- `linenumbers`: whether to include line numbers in the generated code.
+- `parallel`: whether to automatically parallelize the calculations.
+
+For more information, see the [ModelingToolkit.jl `OptimizationSystem` documentation](https://mtk.sciml.ai/dev/systems/OptimizationSystem/)
+"""
 struct AutoModelingToolkit <: AbstractADType
     obj_sparse::Bool
     cons_sparse::Bool
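A hedged sketch of the symbolic path (the zero-argument `AutoModelingToolkit()` call mirrors the docstring's usage line; the objective and the choice of `Newton()` are illustrative assumptions):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# ModelingToolkit traces the objective symbolically, so concrete state and
# parameter values must be available when the problem is built
optf = OptimizationFunction(rosenbrock, AutoModelingToolkit())
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, Newton())   # symbolically generated gradient and Hessian
```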

src/function/reversediff.jl

Lines changed: 44 additions & 1 deletion

@@ -1,4 +1,47 @@
-struct AutoReverseDiff <: AbstractADType end
+"""
+AutoReverseDiff <: AbstractADType
+
+An AbstractADType choice for use in OptimizationFunction for automatically
+generating the unspecified derivative functions. Usage:
+
+```julia
+OptimizationFunction(f,AutoReverseDiff();kwargs...)
+```
+
+This uses the [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl)
+package. `AutoReverseDiff` has a default argument, `compile`, which
+denotes whether the reverse pass should be compiled. **`compile` should only
+be set to `true` if `f` contains no branches (if statements, while loops);
+otherwise it can produce incorrect derivatives!**
+
+`AutoReverseDiff` is generally applicable to many pure Julia codes,
+and with `compile=true` it is one of the fastest options on code with
+heavy scalar interactions. Hessian calculations are fast by mixing
+ForwardDiff with ReverseDiff for forward-over-reverse. However, its
+performance can falter when `compile=false`.
+
+- Not compatible with GPUs
+- Compatible with Hessian-based optimization by mixing with ForwardDiff
+- Compatible with Hv-based optimization by mixing with ForwardDiff
+- Not compatible with constraint functions
+
+Note that only the unspecified derivative functions are defined. For example,
+if a `hess` function is supplied to the `OptimizationFunction`, then the
+Hessian is not defined via ReverseDiff.
+
+## Constructor
+
+```julia
+AutoReverseDiff(;compile = false)
+```
+
+#### Note: currently compilation is not defined/used!
+"""
+struct AutoReverseDiff <: AbstractADType
+    compile::Bool
+end
+
+AutoReverseDiff(;compile = false) = AutoReverseDiff(compile)

 function instantiate_function(f, x, adtype::AutoReverseDiff, p=SciMLBase.NullParameters(), num_cons = 0)
     num_cons != 0 && error("AutoReverseDiff does not currently support constraints")
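A usage sketch for the new keyword constructor (illustrative objective; note the docstring's caveat that `compile` is not yet used):

```julia
using GalacticOptim, Optim

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# compile = false is the safe default; a compiled tape is only valid for
# branch-free objectives
optf = OptimizationFunction(rosenbrock, AutoReverseDiff(compile = false))
prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())   # gradient computed by ReverseDiff
```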
