During the TAUS Annual Conference 2016 in October, panelists will discuss "How to deliver high-quality translations for long-tail languages". This blog post is written in preparation for this session.
Recent blog posts
The last significant breakthrough in the technology of statistical machine translation (SMT) was in 2005. That year, David Chiang published his famous paper on hierarchical translation models, which made it possible to significantly improve the quality of statistical MT between distant languages. Today we stand on the verge of an even more exciting moment in MT history: deep learning (DL) is taking MT towards much higher accuracy and finally bringing human-like semantics to the translation process.
Neural Machine Translation (NMT) systems have achieved impressive results in many Machine Translation (MT) tasks in the past couple of years. This is mainly because neural networks can model non-linear functions, which makes NMT well suited to mimicking the linguistic rules followed by the human brain.
Data entered the field of machine translation in the late eighties and early nineties when researchers at IBM’s Thomas J. Watson Research Center reported successes with their statistical approach to machine translation.
Until that time, machine translation worked more or less the same way as human translators, with grammars, dictionaries and transfer rules as the main tools. The syntactic and rule-based Machine Translation (MT) engines appealed much more to the imagination of linguistically trained translators, while the new, purely data-driven MT engines with probabilistic models made translation technology seem more like an alien threat to many translators. This was not only because the quality of the output improved as more data were fed into the engines, but also because translators could not reproduce or even conceive what really happened inside these machines.