-# T2T: Tensor2Tensor Transformers
+# Tensor2Tensor
 
 [![PyPI
 version](https://badge.fury.io/py/tensor2tensor.svg)](https://badge.fury.io/py/tensor2tensor)
@@ -10,11 +10,18 @@ welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CO
 [![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg)](https://opensource.org/licenses/Apache-2.0)
 [![Travis](https://img.shields.io/travis/tensorflow/tensor2tensor.svg)](https://travis-ci.org/tensorflow/tensor2tensor)
 
-[T2T](https://github.com/tensorflow/tensor2tensor) is a modular and extensible
-library and binaries for supervised learning with TensorFlow and with support
-for sequence tasks. It is actively used and maintained by researchers and
-engineers within the Google Brain team. You can read more about Tensor2Tensor in
-the recent [Google Research Blog post introducing
+[Tensor2Tensor](https://github.com/tensorflow/tensor2tensor), or
+[T2T](https://github.com/tensorflow/tensor2tensor) for short, is a library
+of deep learning models and datasets. It has binaries to train the models and
+to download and prepare the data for you. T2T is modular and extensible and can
+be used in [notebooks](https://goo.gl/wkHexj) for prototyping your own models
+or running existing ones on your data. It is actively used and maintained by
+researchers and engineers within
+the [Google Brain team](https://research.google.com/teams/brain/) and was used
+to develop state-of-the-art models for translation (see
+[Attention Is All You Need](https://arxiv.org/abs/1706.03762)), summarization,
+image generation and other tasks. You can read
+more about T2T in the [Google Research Blog post introducing
 it](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).
@@ -29,8 +36,14 @@ You can chat with us and other users on
 [Google Group](https://groups.google.com/forum/#!forum/tensor2tensor) to keep up
 with T2T announcements.
 
-Here is a one-command version that installs tensor2tensor, downloads the data,
+### Quick Start
+
+[This iPython notebook](https://goo.gl/wkHexj) explains T2T and runs in your
+browser using a free VM from Google, no installation needed.
+
+Alternatively, here is a one-command version that installs T2T, downloads data,
 trains an English-German translation model, and evaluates it:
+
 ```
 pip install tensor2tensor && t2t-trainer \
   --generate_data \
@@ -53,11 +66,17 @@ t2t-decoder \
   --decode_interactive
 ```
 
-See the [Walkthrough](#walkthrough) below for more details on each step.
+See the [Walkthrough](#walkthrough) below for more details on each step
+and [Suggested Models](#suggested-models) for well-performing models
+on common tasks.
 
 ### Contents
 
 * [Walkthrough](#walkthrough)
+* [Suggested Models](#suggested-models)
+  * [Translation](#translation)
+  * [Summarization](#summarization)
+  * [Image Classification](#image-classification)
 * [Installation](#installation)
 * [Features](#features)
 * [T2T Overview](#t2t-overview)
@@ -132,6 +151,33 @@ cat $DECODE_FILE.$MODEL.$HPARAMS.beam$BEAM_SIZE.alpha$ALPHA.decodes
 
 ---
 
+## Suggested Models
+
+Here are some combinations of models, hparams and problems that we found
+work well, so we suggest using them if you're interested in that problem.
+
+### Translation
+
+For translation, especially English-German and English-French, we suggest using
+the Transformer model in its base or big configuration, e.g.
+for `--problems=translate_ende_wmt32k` use `--model=transformer` and
+`--hparams_set=transformer_base`. When trained on 8 GPUs for 300K steps,
+this should reach a BLEU score of about 28.
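+
+Putting those flags together, a minimal `t2t-trainer` invocation might look like
+the sketch below; the `--data_dir` and `--output_dir` paths are placeholders you
+would choose yourself, not values prescribed by this README.
+
+```
+# Sketch: train the base Transformer on English-German translation.
+# ~/t2t_data and ~/t2t_train are assumed locations, pick your own.
+t2t-trainer \
+  --generate_data \
+  --data_dir=~/t2t_data \
+  --output_dir=~/t2t_train/translate_ende \
+  --problems=translate_ende_wmt32k \
+  --model=transformer \
+  --hparams_set=transformer_base
+```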
+
+### Summarization
+
+For summarization, we suggest using the Transformer model in prepend mode, i.e.
+for `--problems=summarize_cnn_dailymail32k` use `--model=transformer` and
+`--hparams_set=transformer_prepend`.
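+
+As a sketch, these flags combine with the trainer in the same way as in the
+Quick Start; the `--data_dir` and `--output_dir` values below are assumed
+placeholders, not defaults from this README.
+
+```
+# Sketch: train the prepend-mode Transformer on CNN/Daily Mail summarization.
+t2t-trainer \
+  --generate_data \
+  --data_dir=~/t2t_data \
+  --output_dir=~/t2t_train/summarize_cnn_dailymail \
+  --problems=summarize_cnn_dailymail32k \
+  --model=transformer \
+  --hparams_set=transformer_prepend
+```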
+
+### Image Classification
+
+For image classification, we suggest using ResNet or Xception, i.e.
+for `--problems=image_imagenet` use `--model=resnet50` and
+`--hparams_set=resnet_base`, or `--model=xception` and
+`--hparams_set=xception_base`.
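+
+For example, assuming the ImageNet data has already been generated into your
+`--data_dir` (the paths below are placeholders, not defaults from this README),
+a ResNet run could be sketched as:
+
+```
+# Sketch: train ResNet-50 on ImageNet classification.
+t2t-trainer \
+  --data_dir=~/t2t_data \
+  --output_dir=~/t2t_train/image_imagenet \
+  --problems=image_imagenet \
+  --model=resnet50 \
+  --hparams_set=resnet_base
+```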
+
+
 ## Installation
 
 ```