# Automatic Speech Recognition (ASR) with Transformer

## Data set

This tutorial uses the publicly available
[Librispeech](http://www.openslr.org/12/) ASR corpus.

## Generate the dataset

To generate the dataset, use `t2t-datagen`. You need to set two environment
variables: `DATA_DIR`, the directory where the generated data is stored, and
`TMP_DIR`, a temporary directory where the necessary raw data is downloaded.
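
For example, you might set these variables as follows (the paths below are
arbitrary placeholders; use whatever locations suit your setup):

```
mkdir -p ~/t2t/librispeech/data ~/t2t/librispeech/tmp
export DATA_DIR=~/t2t/librispeech/data
export TMP_DIR=~/t2t/librispeech/tmp
```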

As the audio import in `t2t-datagen` uses `sox` to generate normalized
waveforms, please install it as appropriate (e.g. `apt-get install sox`).
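
On a Debian or Ubuntu system, for instance, installation and a quick sanity
check might look like this (other platforms will have their own package
managers):

```
sudo apt-get install sox
sox --version
```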

```
t2t-datagen --problem=librispeech --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR
```

You can also use smaller versions of the dataset by replacing `librispeech`
with `librispeech_clean` or `librispeech_clean_small`.
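
For example, to generate only the small clean subset:

```
t2t-datagen --problem=librispeech_clean_small --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR
```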

## Training on GPUs

To train a model on GPU, set up `OUT_DIR` and run the trainer:

```
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_librispeech \
  --problems=librispeech \
  --train_steps=120000 \
  --eval_steps=3 \
  --local_eval_frequency=100 \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR
```

This model should achieve approximately 22% per-sequence accuracy after about
80,000 steps.
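
To follow training and evaluation metrics while the job runs, you can point
TensorBoard at the output directory (assuming TensorBoard is installed
alongside your TensorFlow setup):

```
tensorboard --logdir=$OUT_DIR
```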

## Training on Cloud TPUs

To train a model on TPU, set up `OUT_DIR` and run the trainer:

```
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_librispeech_tpu \
  --problems=librispeech \
  --train_steps=120000 \
  --eval_steps=3 \
  --local_eval_frequency=100 \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR \
  --cloud_tpu \
  --cloud_delete_on_done
```
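
Running with `--cloud_tpu` assumes the `gcloud` command-line tool is installed
and authenticated. As a rough sketch (the project ID and zone below are
placeholders for your own values), the one-time setup might look like:

```
gcloud auth application-default login
gcloud config set project my-gcp-project
gcloud config set compute/zone us-central1-b
```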

For more information, see the [Tensor2Tensor
documentation](https://github.com/tensorflow/tensor2tensor/tree/master/docs/cloud_tpu.md)
on running Tensor2Tensor on Cloud TPUs, or the [official Google Cloud Platform
documentation](https://cloud.google.com/tpu/docs/tutorials/transformer) for
Cloud TPUs.