Improving neural machine translation for morphologically rich languages
Raja Gunasekaran (author)
Ian Hartley (thesis advisor)
David Casperson (committee member)
University of Northern British Columbia (degree granting institution)
Master of Science (MSc)
1 online resource (vii, 67 pages)
Machine translation aims to provide seamless communication and interaction, thereby overcoming human language barriers. Recently, Neural Machine Translation (NMT) approaches have been very successful, achieving state-of-the-art performance on many language pairs. NMT systems consist of millions of neurons that are optimised to learn the input-output mapping between the source and target languages. However, these systems produce poor translations under low-resource conditions and are unable to handle a large vocabulary, particularly for languages with rich morphology such as Turkish, Tamil and German. In this project, we present a source vocabulary expansion technique that addresses the problem of translating rare and unknown words by incorporating the morphological information in those words. The effectiveness of the proposed technique is demonstrated by translating from two morphologically rich languages into English. Using this technique, we achieve a performance gain of approximately 2 BLEU points for both German → English and Turkish → English.
Machine translating
Neural networks (Computer science)
Neural Machine Translation (NMT)