This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Commit a31d68c (1 parent: 46f518c)
Ryan Sepassi (T2T Team) authored and committed
Tutorial for training transformer for ASR.
PiperOrigin-RevId: 186011568

1 file changed: 65 additions, 0 deletions
# Automatic Speech Recognition (ASR) with Transformer

## Data set

This tutorial uses the publicly available
[Librispeech](http://www.openslr.org/12/) ASR corpus.
## Generate the dataset

To generate the dataset, use `t2t-datagen`. You need to create environment
variables for a data directory `DATA_DIR` where the data is stored and for a
temporary directory `TMP_DIR` where necessary data is downloaded.
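These variables might be set up as follows; the paths here are illustrative assumptions, not values prescribed by the tutorial:

```
# Illustrative paths (assumptions); choose any writable locations.
export DATA_DIR="$HOME/t2t_data"
export TMP_DIR="/tmp/t2t_datagen"
mkdir -p "$DATA_DIR" "$TMP_DIR"
```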
As the audio import in `t2t-datagen` uses `sox` to generate normalized
waveforms, please install it as appropriate (e.g. `apt-get install sox`).

```
t2t-datagen --problem=librispeech --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR
```

You can also use smaller versions of the dataset by replacing `librispeech` with
`librispeech_clean` or `librispeech_clean_small`.
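For example, the same command can be assembled with the problem name swapped in. The `PROBLEM` variable is an illustrative convenience, not a tutorial flag, and the command is only echoed here as a dry run:

```
# Pick one of: librispeech, librispeech_clean, librispeech_clean_small
PROBLEM=librispeech_clean_small
echo "t2t-datagen --problem=$PROBLEM --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR"
```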
## Training on GPUs

To train a model on GPU, set up `OUT_DIR` and run the trainer:

```
t2t-trainer \
--model=transformer \
--hparams_set=transformer_librispeech \
--problems=librispeech \
--train_steps=120000 \
--eval_steps=3 \
--local_eval_frequency=100 \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR
```

This model should achieve approximately 22% accuracy per sequence after
about 80,000 steps.
## Training on Cloud TPUs

To train a model on TPU, set up `OUT_DIR` and run the trainer:

```
t2t-trainer \
--model=transformer \
--hparams_set=transformer_librispeech_tpu \
--problems=librispeech \
--train_steps=120000 \
--eval_steps=3 \
--local_eval_frequency=100 \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR \
--cloud_tpu \
--cloud_delete_on_done
```

For more information, see [Tensor2Tensor's
documentation](https://github.com/tensorflow/tensor2tensor/tree/master/docs/cloud_tpu.md)
for Tensor2Tensor on Cloud TPUs, or the [official Google Cloud Platform
documentation](https://cloud.google.com/tpu/docs/tutorials/transformer) for
Cloud TPUs.
