Chapter 10: Machine Translation

In this chapter, we train a neural machine translation (NMT) model on the IWSLT’14 English-to-German translation dataset. For the actual implementation of the NMT model, use an off-the-shelf toolkit such as fairseq, Hugging Face Transformers, or OpenNMT-py.

90. Data Preprocessing

Download the dataset and obtain training, validation and test data by running this script.

91. Training the machine translation model

Use the dataset from problem 90 to train a machine translation model. The choice of model architecture is up to you (e.g., an LSTM-based or a Transformer-based model).

92. Translating a text

Use the trained model from problem 91 to implement a program that translates a given English sentence into German.

93. BLEU score

Use the trained model from problem 91 to compute the BLEU score on the test set.
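To make the metric concrete, here is a minimal sketch of corpus-level BLEU in plain Python. It assumes whitespace tokenization, one reference per hypothesis, and no smoothing; the function names are illustrative only. Scores from a toolkit such as sacreBLEU may differ because of tokenization and smoothing choices, so use the toolkit for numbers you intend to report.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100) with one reference per hypothesis.

    Assumption: whitespace tokenization and no smoothing.
    """
    matches = [0] * max_n   # clipped n-gram matches per order
    totals = [0] * max_n    # number of hypothesis n-grams per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp_tokens, n)
            ref_ngrams = ngram_counts(ref_tokens, n)
            # Counter intersection clips each count by the reference count.
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += sum(hyp_ngrams.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, a zero precision gives BLEU = 0

    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Calling `corpus_bleu(hypotheses, references)` with two equally long lists of sentences returns a score between 0 and 100; identical lists score 100.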

94. Beam search

Use beam search when translating a given sentence with the model from problem 91. Vary the beam width from 1 to 100 and plot how the BLEU score changes.
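The toolkits named in this chapter ship their own beam search, but a small sketch can show what the beam width controls. The code below is a simplified variant without length normalization. `step_fn` is a hypothetical stand-in for the decoder, and `toy_step` is a hand-made distribution used only for illustration.

```python
import math


def beam_search(step_fn, beam_width, max_len=10, eos="</s>"):
    """Return the best (tokens, log_prob) hypothesis found by beam search.

    step_fn(prefix) is a hypothetical stand-in for the decoder: it maps a
    tuple of generated tokens to a dict {next_token: log_probability}.
    """
    beams = [((), 0.0)]   # unfinished hypotheses: (tokens, cumulative log-prob)
    finished = []         # hypotheses that have produced the EOS token

    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, log_prob in step_fn(tokens).items():
                candidates.append((tokens + (token,), score + log_prob))
        # Keep only the beam_width highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            (finished if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break

    # Fall back to unfinished hypotheses if nothing reached EOS.
    return max(finished or beams, key=lambda c: c[1])


def toy_step(prefix):
    """Toy next-token distribution in which the greedy choice is suboptimal."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 1.0},
    }
    probs = table.get(prefix, {"</s>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}
```

With `beam_width=1` the search is greedy: it commits to `a` and ends with a sequence of probability 0.3. With `beam_width=2` it keeps `b` alive and finds `b z </s>` with probability 0.4. For the exercise itself, decode the test set once per beam width and plot the resulting BLEU scores.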

95. Subword

Use subword-based segmentation instead of word-based segmentation. Repeat problems 91-94 with the new segmentation.
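One common subword method is byte pair encoding (BPE). The exercise does not prescribe it, so treat it as one possible choice. The following is a minimal sketch of learning and applying BPE merges from a word-frequency dictionary. The function names and the `</w>` end-of-word marker are conventions chosen for this example. In practice a dedicated tool such as subword-nmt or SentencePiece would normally be used.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of pair in symbols with the merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_freqs, num_merges):
    """Learn up to num_merges merge operations from a {word: frequency} dict."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Merge the most frequent adjacent pair (ties: first one seen).
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a word by applying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

For example, after `merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 3)`, the unseen word `lowest` is segmented by `segment("lowest", merges)` into `l`, `o`, `w`, `est</w>`.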

96. Plotting the learning curve

Use a tool such as TensorBoard to visualize how the training of the machine translation model proceeds. Specifically, visualize metrics such as the loss and the BLEU score on both the training data and the validation data.

97. Hyper-parameter search

Change the architecture and/or hyper-parameters of the machine translation model and search for the configuration that achieves the highest performance on the validation data.
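A simple way to organize such a search is an exhaustive grid. The sketch below assumes a user-supplied `evaluate(config)` callable (hypothetical) that trains a model with the given configuration and returns its validation score, for example validation BLEU. The hyper-parameter names are up to you.

```python
import itertools


def grid_search(search_space, evaluate):
    """Try every combination in search_space; return the best (config, score).

    search_space maps a hyper-parameter name to a list of candidate values.
    evaluate is a hypothetical callable: config dict -> validation score
    (higher is better).
    """
    names = list(search_space)
    best_config, best_score = None, float("-inf")
    for values in itertools.product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

The number of runs is the product of the list lengths, so keep each list short, or sample configurations at random when a full grid is too expensive.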

98. Domain Adaptation

Evaluate the machine translation model on the newstest2014 dataset using the sacreBLEU toolkit. Then use the News Commentary dataset to improve the model's performance.

99. Translation Server

Build a demo system in which the user can translate arbitrary text in a web browser.
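As a starting point, the demo can be sketched with Python's standard `http.server` module. Everything here is an assumption made for illustration: `translate` is a placeholder to be replaced by a call into the trained model, and the page layout, port, and function names are arbitrary.

```python
import html
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

PAGE = """<!doctype html>
<html><body>
<form method="get" action="/">
  <input name="text" value="{source}" size="60">
  <button type="submit">Translate</button>
</form>
<p>{target}</p>
</body></html>"""


def translate(text):
    """Hypothetical placeholder: replace with a call to the trained model."""
    return "[translation of: " + text + "]"


class TranslationHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the text to translate from the ?text=... query parameter.
        query = parse_qs(urlparse(self.path).query)
        source = query.get("text", [""])[0]
        target = translate(source) if source else ""
        # Escape user input and model output before embedding them in HTML.
        body = PAGE.format(source=html.escape(source), target=html.escape(target))
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8000):
    """Create (but do not start) the demo server, bound to localhost."""
    return ThreadingHTTPServer(("127.0.0.1", port), TranslationHandler)
```

Running `make_server().serve_forever()` and opening http://127.0.0.1:8000/ in a browser shows the form. A real deployment would load the model once at startup and would likely use a web framework instead of this bare handler.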