Neural Machine Translation

Neural machine translation (NMT) is the latest evolution in machine translation (MT) and a hot topic right now. MT itself is no new phenomenon: the first public demonstration of such computational techniques took place as far back as 1954 (Hutchins, 2006), and the field has gone through a series of developments to reach its current form.


How has MT developed?

To bring you (very briefly) up to speed on the history of MT: rule-based MT was the original approach, based on dictionaries and grammar rules; statistical machine translation (SMT) then became popular at the turn of the millennium, with commercial and open-access services made available to the public. Higher processing power and the growing affordability of computers for general use also helped to drive this trend. Hybrid MT systems, which combine the two approaches, have also been developed to address the shortcomings of each.

SMT was most notably adopted by Google and Microsoft in 2006/2007 and was made possible by the vast number of digitised parallel corpora now available online. We’ve discussed the issues of using Google Translate in medical translation in a previous blog article, which at the time of writing applied to SMT, but the technology’s anecdotal problems are widely known. It is because of these known inaccuracies that NMT has been developed.


What improvements has NMT brought about?

Texts translated through NMT are far more fluent than those produced by previous incarnations of MT systems, as NMT works by feeding whole sentences into the ‘brain’ of the system and learning patterns from that information; NMT’s strength is in its “ability to learn directly” (Wu et al., 2016).
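To make the ‘whole sentences in, patterns out’ idea a little more concrete, below is a minimal encoder-decoder sketch in Python (using the PyTorch library). It illustrates the general sequence-to-sequence principle only, not Google’s actual system, which uses much deeper networks and an attention mechanism; all names and sizes here are invented for the example.

    import torch
    import torch.nn as nn

    class TinySeq2Seq(nn.Module):
        """A toy encoder-decoder: read in a whole source sentence,
        then generate the target sentence one token at a time."""
        def __init__(self, src_vocab, tgt_vocab, dim=64):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, dim)
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.decoder = nn.GRU(dim, dim, batch_first=True)
            self.out = nn.Linear(dim, tgt_vocab)

        def forward(self, src_ids, tgt_ids):
            # Encode the entire source sentence into one hidden state...
            _, state = self.encoder(self.src_emb(src_ids))
            # ...then decode, conditioned on that sentence-level state.
            dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
            return self.out(dec_out)  # per-token target-vocabulary scores

    # Toy usage: one sentence of 5 source tokens and 4 target tokens.
    model = TinySeq2Seq(src_vocab=1000, tgt_vocab=1000)
    logits = model(torch.randint(0, 1000, (1, 5)), torch.randint(0, 1000, (1, 4)))
    print(logits.shape)  # torch.Size([1, 4, 1000])

In a real system, training adjusts the network’s weights over millions of sentence pairs, and at translation time the decoder is fed its own previous predictions rather than a reference target.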

Early research on German-English translation (Burchardt, 2017) has suggested that NMT is more accurate than previous MT systems. However, NMT can hide its errors behind fluent-sounding output: a sentence may read naturally even when it is incorrect, which is particularly problematic when parts of the source sentence have not been translated at all.

So, there are still many issues with NMT that need to be resolved in order to bring it closer to the error levels of a high-quality human translation, including problems with rare words and the time needed to train an NMT system to perform well.
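On the rare-word problem specifically, Wu et al. (2016) describe breaking words into smaller ‘wordpiece’ units, so that even an unseen word can be translated via familiar fragments. The toy Python sketch below shows the general subword idea only, not Google’s actual wordpiece model; the piece inventory is invented for the example.

    def segment(word, pieces):
        """Greedy left-to-right longest-match subword segmentation."""
        out, i = [], 0
        while i < len(word):
            for j in range(len(word), i, -1):  # try the longest piece first
                if word[i:j] in pieces:
                    out.append(word[i:j])
                    i = j
                    break
            else:
                out.append(word[i])  # fall back to a single character
                i += 1
        return out

    # A tiny invented inventory of known pieces:
    pieces = {"un", "translat", "ion", "able"}
    print(segment("untranslation", pieces))  # ['un', 'translat', 'ion']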


How is Silicon Valley developing NMT?

Google Neural Machine Translation now has the “potential to overcome many of the weaknesses of conventional phrase-based translation systems” and promises “roughly a 60% reduction in translation errors on several popular language pairs” (Wu et al., 2016). It should be noted that it still relies on content from its own search engine, which may inadvertently contain translation errors itself.

The tech giant started rolling out this particular technology in September 2016 and has gradually been switching languages over from its proprietary SMT to NMT since then, most recently adding Indian languages to its NMT offering. Currently, it only works to and from English, although there has been some evidence that it will be possible to translate directly between Japanese and Korean, and eventually between other language combinations; this is known as ‘zero-shot translation’: the ability of MT to move between two languages without the need to go via English. As of July 2017, SMT has not yet been completely replaced, as it is still needed for language pairs that do not involve English; English continues to be used as an intermediary (pivot) language in MT.
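The route to zero-shot translation described by Google’s researchers is a single multilingual model trained on many language pairs at once, with an artificial token at the start of each source sentence telling the model which target language to produce. A minimal Python sketch of that preprocessing step (token names are illustrative) might look like this:

    def mark_target(sentence: str, target_lang: str) -> str:
        """Prepend an artificial target-language token so one model
        can serve many language pairs (token format illustrative)."""
        return f"<2{target_lang}> {sentence}"

    # Training data covers, say, English<->Japanese and English<->Korean...
    print(mark_target("Hello", "ja"))  # <2ja> Hello
    print(mark_target("Hello", "ko"))  # <2ko> Hello
    # ...and the same model can then be asked for Korean -> Japanese
    # directly, a pairing it never saw during training ("zero-shot"):
    print(mark_target("안녕하세요", "ja"))  # <2ja> 안녕하세요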

Google claims its NMT bridges the gap between human and machine translation. For the casual non-linguist user, these technological changes may improve the quality of free translation available to the general public.

Amazon

MT also helps businesses cope with the huge amounts of content being generated each day; there are not enough translators (or indeed competent linguists) to process all of this data as quickly as it is needed, so MT plugs that gap in the market.

Other companies, such as Microsoft, SYSTRAN and, most recently, Amazon, have entered the NMT battleground. Amazon will start offering to third parties the internal MT system it developed at its Pittsburgh MT offices to solve case-related issues (Faes, 2017).


What does this mean for human translators and the translation industry as a whole?

Well, Bill Kaper, General Manager of Amazon Pittsburgh and of the Translation Services group, sums it up quite nicely: “we absolutely believe human translators and human translation services will continue to be a vital component to improving the overall translation space”.

Of course, we do, too.

At Parallel, we strive to keep on top of technological developments relevant to our industry and sectors; Vicky and Paul are leading our research and regularly attend courses and webinars to find out more about the advances in this technology and to assess how we can use this knowledge in practice.

It’s a fascinating and fast-paced evolution, and we will be watching how the big players progress with a keen eye.


References and further reading:

Burchardt, A. (04 May 2017). Comparing Errors of Neural MT with Errors of “Traditional” Phrase-based and Rule-based MT. GALA (Globalization and Localization Association).

Faes, F. (27 June 2017). Amazon Plans to Take On Google Translate, Media Reports. Slator.com.

Hutchins, J. (2006). The first public demonstration of machine translation: the Georgetown-IBM system, 7th January 1954.

Wu, Y., Schuster, M., Chen, Z. et al. (2016). Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation.

Zetzsche, J. (2016). Going neural. ITI Bulletin. (September-October), pp.20-21.