# [OptimizationFunction](@id optfunction)

The `OptimizationFunction` type holds all of the extra differentiation
data required to do fast and accurate optimization. The signature for
the constructor is:

```julia
OptimizationFunction{iip}(f, adtype = NoAD();
                          grad = nothing,
                          hess = nothing,
                          hv = nothing,
                          cons = nothing,
                          cons_j = nothing,
                          cons_h = nothing)
```

The keyword arguments are as follows (a construction sketch follows the list):

- `grad`: Gradient
- `hess`: Hessian
- `hv`: Hessian-vector product, `hv(du,u,p,t,v) = H*v`
- `cons`: Constraint function
- `cons_j`: Constraint Jacobian
- `cons_h`: Constraint Hessian

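Below is a minimal construction sketch, assuming the GalacticOptim.jl interface together with an Optim.jl solver; the objective, the parameter values, and the in-place gradient signature `(G, x, p)` are illustrative assumptions, not part of the documented signature above.

```julia
using GalacticOptim, Optim

# Illustrative objective: the Rosenbrock function with parameters p
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

# Hand-written in-place gradient matching `rosenbrock` (signature assumed)
function rosenbrock_grad!(G, x, p)
    G[1] = -2 * (p[1] - x[1]) - 4 * p[2] * x[1] * (x[2] - x[1]^2)
    G[2] = 2 * p[2] * (x[2] - x[1]^2)
    return nothing
end

f = OptimizationFunction(rosenbrock; grad = rosenbrock_grad!)
prob = OptimizationProblem(f, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())  # BFGS needs only the gradient, which we supplied
```
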
### Defining Optimization Functions Via AD

While the keyword arguments give the user control over defining all of
the possible functions, the simplest way to generate an
`OptimizationFunction` is to specify an AD type, which automatically
fills in all of the extra functions. For example,

```julia
OptimizationFunction(f, AutoZygote())
```

will use [Zygote.jl](https://github.com/FluxML/Zygote.jl) to define
all of the necessary functions. Note that if any functions are defined
directly, the auto-AD definition does not overwrite the user's choice;
an end-to-end sketch follows.
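
A hypothetical end-to-end use of this auto-fill path, again assuming the GalacticOptim.jl interface and Optim.jl solvers; the objective and parameter values are illustrative:

```julia
using GalacticOptim, Optim, Zygote

loss(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

f = OptimizationFunction(loss, AutoZygote())  # Zygote fills in grad, hess, hv
prob = OptimizationProblem(f, zeros(2), [1.0, 100.0])
sol = solve(prob, BFGS())  # consumes the Zygote-generated gradient
```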

### Automatic Differentiation Construction Choice Recommendations

The choices for the auto-AD fill-ins with quick descriptions are:

- `AutoForwardDiff()`: The fastest choice for small optimizations
- `AutoReverseDiff(compile=false)`: A fast choice for large scalar optimizations
- `AutoTracker()`: Like ReverseDiff but GPU-compatible
- `AutoZygote()`: The fastest choice for non-mutating array-based (BLAS) functions
- `AutoFiniteDiff()`: Finite differencing, not optimal but always applicable
- `AutoModelingToolkit()`: The fastest choice for large scalar optimizations

The following sections describe the Auto-AD choices in detail.

### AutoForwardDiff

This uses the [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)
package. It is the fastest choice for small systems, especially those with
heavy scalar interactions. It is easy to use and compatible with most
pure Julia functions that have loose type restrictions. However,
because it is forward-mode, it scales poorly in comparison to other AD
choices. Hessian construction is suboptimal, as it uses the
forward-over-forward approach.

- Compatible with GPUs
- Compatible with Hessian-based optimization
- Compatible with Hv-based optimization
- Compatible with constraints

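Since `AutoForwardDiff` also generates a Hessian, a second-order solver can consume it directly. A short sketch under the same assumed GalacticOptim.jl/Optim.jl setup (objective and values illustrative):

```julia
using GalacticOptim, Optim, ForwardDiff

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2

f = OptimizationFunction(rosenbrock, AutoForwardDiff())
prob = OptimizationProblem(f, zeros(2), [1.0, 100.0])
sol = solve(prob, Newton())  # Newton() uses the auto-generated Hessian
```
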
### AutoReverseDiff

This uses the [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl)
package. `AutoReverseDiff` has a default argument, `compile`, which
denotes whether the reverse pass should be compiled. **`compile` should
only be set to `true` if `f` contains no branches (if statements, while
loops); otherwise it can produce incorrect derivatives!**

`AutoReverseDiff` is generally applicable to many pure Julia codes,
and with `compile=true` it is one of the fastest options on code with
heavy scalar interactions. Hessian calculations are fast by mixing
ForwardDiff with ReverseDiff for forward-over-reverse. However, its
performance can falter when `compile=false`.

- Not compatible with GPUs
- Compatible with Hessian-based optimization by mixing with ForwardDiff
- Compatible with Hv-based optimization by mixing with ForwardDiff
- Not compatible with constraint functions
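
A sketch of the `compile` flag (assumed setup as above); `compile = true` is safe here only because the illustrative objective is branch-free, with no `if`/`while` depending on the input:

```julia
using GalacticOptim, Optim, ReverseDiff

sumsq(x, p) = sum(abs2, x .- p)  # branch-free objective

f = OptimizationFunction(sumsq, AutoReverseDiff(compile = true))
prob = OptimizationProblem(f, zeros(10), collect(1.0:10.0))
sol = solve(prob, BFGS())
```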
### AutoTracker

This uses the [Tracker.jl](https://github.com/FluxML/Tracker.jl) package.
Generally slower than ReverseDiff, it is applicable to many pure Julia
codes.

- Compatible with GPUs
- Not compatible with Hessian-based optimization
- Not compatible with Hv-based optimization
- Not compatible with constraint functions

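Because Tracker provides only gradients, it pairs with first-order methods; Hessian-requiring solvers will not work. A sketch under the same assumed setup:

```julia
using GalacticOptim, Optim, Tracker

sumsq(x, p) = sum(abs2, x .- p)

f = OptimizationFunction(sumsq, AutoTracker())
prob = OptimizationProblem(f, zeros(4), [1.0, 2.0, 3.0, 4.0])
sol = solve(prob, GradientDescent())  # first-order: only the gradient is used
```
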
### AutoZygote

This uses the [Zygote.jl](https://github.com/FluxML/Zygote.jl) package.
This is the staple reverse-mode AD that handles a large portion of
Julia with good efficiency. Hessian construction is fast via
forward-over-reverse, mixing ForwardDiff.jl with Zygote.jl.

- Compatible with GPUs
- Compatible with Hessian-based optimization via ForwardDiff
- Compatible with Hv-based optimization via ForwardDiff
- Not compatible with constraint functions

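A sketch emphasizing Zygote's sweet spot (assumed setup as above): the objective is non-mutating and array-based; in-place mutation inside `f` would generally error under Zygote:

```julia
using GalacticOptim, Optim, Zygote

# Non-mutating, array-based objective; p encodes a 2×2 SPD matrix
quadform(x, p) = x' * reshape(p, 2, 2) * x

f = OptimizationFunction(quadform, AutoZygote())
prob = OptimizationProblem(f, ones(2), [2.0, 0.0, 0.0, 2.0])
sol = solve(prob, BFGS())
```
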
### AutoFiniteDiff

This uses [FiniteDiff.jl](https://github.com/JuliaDiff/FiniteDiff.jl).
While not necessarily the most efficient in any given case, this is the
only choice that does not require the `f` function to be automatically
differentiable, which means it applies to any function. However, because
it uses finite differencing, one needs to be careful, as this procedure
introduces numerical error into the derivative estimates.

- Compatible with GPUs
- Compatible with Hessian-based optimization
- Compatible with Hv-based optimization
- Not compatible with constraint functions

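A sketch (assumed setup as above) where the objective stands in for a black-box routine that AD could not trace, yet finite differencing still applies:

```julia
using GalacticOptim, Optim

# Stand-in for a non-AD-able black box (e.g. a wrapped external routine)
blackbox(x, p) = (x[1] - p[1])^2 + (x[2] - p[2])^2

f = OptimizationFunction(blackbox, AutoFiniteDiff())
prob = OptimizationProblem(f, zeros(2), [3.0, 5.0])
sol = solve(prob, BFGS())  # gradient estimated by finite differences
```
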
### AutoModelingToolkit

This uses the [ModelingToolkit.jl](https://github.com/SciML/ModelingToolkit.jl)
symbolic system to automatically convert the `f` function into a symbolic
equation and uses symbolic differentiation to generate fast derivative
code. Note that this will also compile a new version of your `f` function
that is automatically optimized. Because of the required symbolic analysis,
the state and parameters are required in the function definition, i.e.:

```julia
OptimizationFunction(f, AutoModelingToolkit(), x0, p,
                     grad = false, hess = false, sparse = false,
                     checkbounds = false,
                     linenumbers = true,
                     parallel = SerialForm(),
                     kwargs...)
```

The special keyword arguments are as follows:

- `grad`: whether to symbolically generate the gradient function.
- `hess`: whether to symbolically generate the Hessian function.
- `sparse`: whether to use sparsity detection in the Hessian.
- `checkbounds`: whether to perform bounds checks in the generated code.
- `linenumbers`: whether to include line numbers in the generated code.
- `parallel`: whether to automatically parallelize the calculations.

For more information, see the [ModelingToolkit.jl `OptimizationSystem` documentation](https://mtk.sciml.ai/dev/systems/OptimizationSystem/).

Summary:

- Not compatible with GPUs
- Compatible with Hessian-based optimization
- Not compatible with Hv-based optimization
- Not compatible with constraint functions
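
A usage sketch following the constructor signature documented above; the flags `grad = true` and `hess = true` request symbolic derivative generation, and the rest (objective, values, solver) are illustrative assumptions:

```julia
using GalacticOptim, Optim, ModelingToolkit

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]

f = OptimizationFunction(rosenbrock, AutoModelingToolkit(), x0, p,
                         grad = true, hess = true)
prob = OptimizationProblem(f, x0, p)
sol = solve(prob, Newton())  # uses the symbolically generated Hessian
```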