From 1ada2ce8092ca3caef4d93374dc542200087763b Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Sat, 29 Apr 2023 14:17:08 -0400
Subject: [PATCH 1/3] say opt_state in README

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index fe8fb9f8..38cdf1ab 100644
--- a/README.md
+++ b/README.md
@@ -38,15 +38,15 @@ It is initialised by `setup`, and then at each step, `update` returns both the n
 state, and the model with its trainable parameters adjusted:
 
 ```julia
-state = Optimisers.setup(Optimisers.Adam(), model) # just once
+opt_state = Optimisers.setup(Optimisers.Adam(), model) # just once
 
 grad = Zygote.gradient(m -> loss(m(x), y), model)[1]
 
-state, model = Optimisers.update(state, model, grad) # at every step
+opt_state, model = Optimisers.update(opt_state, model, grad) # at every step
 ```
 
 For models with deeply nested layers containing the parameters (like [Flux.jl](https://github.com/FluxML/Flux.jl) models),
-this state is a similarly nested tree. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
+this `opt_state` is a similarly nested tree. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
 not the "implicit" one with `Params`.
 
 The function `destructure` collects all the trainable parameters into one vector,

From 4bdc571809f0b292cc0f8721fd6e08cefacffe79 Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Sat, 29 Apr 2023 14:20:17 -0400
Subject: [PATCH 2/3] change to state_tree instead

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 38cdf1ab..4288b706 100644
--- a/README.md
+++ b/README.md
@@ -38,15 +38,15 @@ It is initialised by `setup`, and then at each step, `update` returns both the n
 state, and the model with its trainable parameters adjusted:
 
 ```julia
-opt_state = Optimisers.setup(Optimisers.Adam(), model) # just once
+state_tree = Optimisers.setup(Optimisers.Adam(), model) # just once
 
 grad = Zygote.gradient(m -> loss(m(x), y), model)[1]
 
-opt_state, model = Optimisers.update(opt_state, model, grad) # at every step
+state_tree, model = Optimisers.update(state_tree, model, grad) # at every step
 ```
 
 For models with deeply nested layers containing the parameters (like [Flux.jl](https://github.com/FluxML/Flux.jl) models),
-this `opt_state` is a similarly nested tree. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
+this `state_tree` is a similarly nested object. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
 not the "implicit" one with `Params`.
 
 The function `destructure` collects all the trainable parameters into one vector,

From 2cbe0a251886edc07a922e441f996ca8623475c8 Mon Sep 17 00:00:00 2001
From: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
Date: Sat, 29 Apr 2023 14:23:40 -0400
Subject: [PATCH 3/3] also mention adjust

---
 README.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 4288b706..7a8645a7 100644
--- a/README.md
+++ b/README.md
@@ -21,8 +21,8 @@
 Optimisers.jl defines many standard gradient-based optimisation rules, and tools for
 applying them to deeply nested models.
 
-This is the future of training for [Flux.jl](https://github.com/FluxML/Flux.jl) neural networks,
-and the present for [Lux.jl](https://github.com/avik-pal/Lux.jl).
+This was written as a new training back-end for [Flux.jl](https://github.com/FluxML/Flux.jl) neural networks,
+and is also used by [Lux.jl](https://github.com/avik-pal/Lux.jl).
 But it can be used separately on any array, or anything else understood by [Functors.jl](https://github.com/FluxML/Functors.jl).
 
 ## Installation
@@ -49,6 +49,12 @@ For models with deeply nested layers containing the parameters (like [Flux.jl](h
 this `state_tree` is a similarly nested object. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
 not the "implicit" one with `Params`.
 
+You can change the learning rate during training by mutating all the states:
+
+```julia
+Optimisers.adjust!(state_tree, 0.01)
+```
+
 The function `destructure` collects all the trainable parameters into one vector,
 and returns this along with a function to re-build a similar model:
 
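Taken together, the workflow these patches document looks roughly like the sketch below. It is illustrative only and not part of any patch: the toy `model`, `loss`, `x`, and `y` are made-up placeholders, while `setup`, `update`, and `adjust!` are the calls shown in the hunks above.

```julia
using Optimisers, Zygote

# Hypothetical stand-ins for a real model and data; any structure that
# Functors.jl understands (e.g. a Flux model) would work the same way.
model = (W = rand(2, 3), b = zeros(2))
loss(yhat, y) = sum(abs2, yhat .- y)
x, y = rand(3), rand(2)

state_tree = Optimisers.setup(Optimisers.Adam(), model)          # just once

# One training step: explicit-style gradient, then update returns both the
# new optimiser state and the adjusted model.
grad = Zygote.gradient(m -> loss(m.W * x .+ m.b, y), model)[1]
state_tree, model = Optimisers.update(state_tree, model, grad)   # at every step

# Later, change the learning rate throughout the state tree in place.
Optimisers.adjust!(state_tree, 0.01)
```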