An independent BLEU score analysis of customizing Amazon Translate, via Active Custom Translation, with domain-specific TAUS datasets.

Online machine translation engines provide easy access to high-quality machine translations. They are optimized for content like news articles and social media posts that users of online platforms frequently translate.

Businesses often want to translate text with a different style and a specific topic. For enterprise use, online machine translation engines offer customization via sets of pre-existing translations that reflect the desired style and topic. This data is often called “parallel data”. TAUS makes such customization data available through the TAUS Data Marketplace and provides all relevant data processing services.

Polyglot Technology LLC independently evaluated the quality of machine translation output from Amazon Translate customized with TAUS Data (using Amazon Translate Active Custom Translation) compared to non-customized Amazon Translate.
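To make the customization step concrete, here is a minimal sketch of the request parameters one might prepare for an Amazon Translate batch job that uses parallel data. It is an illustration under stated assumptions, not the setup used in the evaluation: the helper names, bucket paths, role ARN, resource names and language pair are all hypothetical, and the dictionaries are only built here, not sent to AWS.

```python
def build_parallel_data_request(name, s3_uri, fmt="TMX"):
    """Build keyword arguments for registering a parallel data resource.

    Assumption: the parallel data file (TMX, CSV or TSV) has already been
    uploaded to the given S3 location.
    """
    return {
        "Name": name,
        "ParallelDataConfig": {"S3Uri": s3_uri, "Format": fmt},
    }


def build_batch_job_request(job_name, input_s3_uri, output_s3_uri, role_arn,
                            source_lang, target_lang, parallel_data_name=None):
    """Build keyword arguments for a batch translation job.

    Passing parallel_data_name requests customized output; leaving it as
    None describes a non-customized baseline job.
    """
    request = {
        "JobName": job_name,
        "InputDataConfig": {"S3Uri": input_s3_uri, "ContentType": "text/plain"},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        "SourceLanguageCode": source_lang,
        "TargetLanguageCodes": [target_lang],
    }
    if parallel_data_name is not None:
        request["ParallelDataNames"] = [parallel_data_name]
    return request


# Hypothetical names and locations, for illustration only.
parallel_data = build_parallel_data_request(
    "example-medical-en-de", "s3://example-bucket/parallel/medical.tmx")
baseline_job = build_batch_job_request(
    "baseline-job", "s3://example-bucket/in/", "s3://example-bucket/out/base/",
    "arn:aws:iam::123456789012:role/ExampleTranslateRole", "en", "de")
customized_job = build_batch_job_request(
    "customized-job", "s3://example-bucket/in/", "s3://example-bucket/out/custom/",
    "arn:aws:iam::123456789012:role/ExampleTranslateRole", "en", "de",
    parallel_data_name=parallel_data["Name"])
```

With the AWS SDK for Python, these dictionaries would typically be passed as keyword arguments to the Translate client's `create_parallel_data` and `start_text_translation_job` calls. Running the same input once with and once without the parallel data yields the two outputs that a BLEU comparison needs.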

Customizing Amazon Translate with TAUS Data improved the BLEU score on every test set, by more than 6 BLEU points on average and by at least 2 BLEU points.

These are significant improvements that demonstrate the superiority of Amazon Translate customized with Active Custom Translation over non-customized Amazon Translate in the Ecommerce, Medical/Pharma and Financial domains.
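As a rough illustration of how average and minimum gains of this kind are derived, the sketch below summarizes per-test-set BLEU scores. The scores are made-up placeholders, not figures from the evaluation, and `summarize_gains` is a hypothetical helper. It reports gains both in absolute BLEU points and as a percentage of the baseline score, since the two measures are easy to confuse.

```python
from statistics import mean


def summarize_gains(baseline, customized):
    """Summarize BLEU gains of customized over baseline output per test set."""
    # Absolute gain in BLEU points for each test set.
    gains = {name: customized[name] - baseline[name] for name in baseline}
    # Relative gain as a percentage of the baseline score.
    relative = {name: 100.0 * gains[name] / baseline[name] for name in baseline}
    return {
        "mean_gain": mean(gains.values()),
        "min_gain": min(gains.values()),
        "mean_relative_pct": mean(relative.values()),
    }


# Placeholder scores for illustration only (not results from the study).
baseline = {"ecommerce": 30.0, "medical": 40.0, "financial": 50.0}
customized = {"ecommerce": 37.0, "medical": 45.0, "financial": 53.0}

summary = summarize_gains(baseline, customized)
print(summary)
```

A gain of a few BLEU points can correspond to a much larger relative percentage when the baseline score is low, so it helps to state which of the two measures a reported number refers to.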

The evaluated datasets are also used as part of the TAUS Data-Enhanced Machine Translation (DEMT) service, which offers an end-to-end solution to those who wish to produce customized MT output for their specific domains without the hassle of going through the actual MT training process. With TAUS DEMT, BLEU scores are proven to increase by 15.3% on average in the Ecommerce, Medical and Financial domains. Try TAUS DEMT now!
