Why standards and metrics for objective evaluation?
Different companies use different metrics, which makes it hard to compare vendors, translators, and projects, or to benchmark translation quality against industry averages. To benchmark the quality and productivity of translation services, we need an objective approach based on industry standards and metrics. The difference between a metric and a standard is simple: a metric is a system of measurement; a standard is a required or agreed level of quality or attainment. A metric helps verify that a service or product complies with the agreed level of quality, that is, the standard. In what follows, we highlight some of the standards and metrics used in translation quality management.
ISO 17100 provides requirements for the core processes, resources, and other aspects necessary for the delivery of a quality translation service that meets applicable specifications. The use of raw output from machine translation plus post-editing is outside the scope of this standard.
The ISO 9000 family addresses various aspects of quality management and contains some of ISO’s best-known standards. The standards provide guidance and tools for companies and organizations that want to ensure that their products and services consistently meet customers’ requirements, and that quality is consistently improved.
The EN 15038 quality standard was developed specifically for translation service providers. It aims to unify the terminology used in the translation field, define basic requirements for language service providers (LSPs), and create a framework for the interaction of customers and service providers in terms of their rights and obligations. It places a strong focus on administrative, documentation, review, and revision processes, as well as on the roles of the different specialists involved in the translation process. As a minimum requirement, EN 15038 certification demands that every translation involve at least two separate people: one performing the translation and one performing the editing (or review).
ASTM F2575-14 is a standard guide for quality assurance in translation. It provides a framework for customers and LSPs that want to agree on the specific requirements of a translation project. It does not provide specific criteria for translation or project quality, as these requirements may be highly individual, but states parameters that should be considered before beginning a translation project. As the document's name suggests, it is a guideline that informs stakeholders about which basic quality requirements need to be met, rather than a prescriptive set of detailed instructions for the translator.
The LISA QA metric was initially designed to promote the best translation and localization methods for the software and hardware industries. Although LISA has been inactive since 2011, its standardization methods are still widely used in translation quality evaluation. This metric features three severity levels, but no weighting. The model consists of a set of 20, 25, or 123 error categories, depending on how they are counted.
The SAE J2450 metric has gained popularity in the manufacturing industry. It consists of four parts:
- Seven primary error categories which cover such areas as terminology, meaning, structure, spelling, punctuation, completeness, etc.
- Two subcategories: serious and minor
- Two meta-rules to help evaluators make a decision in case of ambiguity
- Numeric weights for each primary and subcategory
The current version of the metric does not measure errors in style, making it unsuitable for evaluations of material in which style is important (e.g., owner's manuals or marketing literature).
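The weighted scoring described above can be sketched as follows. This is a minimal illustration, not the official SAE J2450 procedure: the category names, the numeric weights, the normalization by source word count, and the `max_score` threshold are all assumptions made for the example. It also shows the metric/standard distinction from earlier: `weighted_score` is the measurement, and `meets_standard` checks it against an agreed level.

```python
from dataclasses import dataclass

# Hypothetical (serious, minor) weights per primary category.
# These values are placeholders for illustration; consult the
# SAE J2450 document itself for the official categories and weights.
WEIGHTS = {
    "wrong_term": (5, 2),
    "syntactic_error": (4, 2),
    "omission": (4, 2),
    "word_structure": (4, 2),
    "misspelling": (3, 1),
    "punctuation": (2, 1),
    "miscellaneous": (3, 1),
}


@dataclass(frozen=True)
class Error:
    category: str   # one of the keys in WEIGHTS
    serious: bool   # True = serious, False = minor


def weighted_score(errors, source_word_count):
    """Sum the weights of all errors and normalize by source length.

    Lower is better. Normalizing by word count (an assumption here)
    makes scores comparable across texts of different sizes.
    """
    if source_word_count <= 0:
        raise ValueError("source_word_count must be positive")
    total = sum(WEIGHTS[e.category][0 if e.serious else 1] for e in errors)
    return total / source_word_count


def meets_standard(score, max_score):
    """The 'standard' is an agreed threshold the measured score must not exceed."""
    return score <= max_score


# Example: three errors found in a 500-word source text.
errors = [
    Error("wrong_term", serious=True),
    Error("punctuation", serious=False),
    Error("misspelling", serious=False),
]
score = weighted_score(errors, source_word_count=500)
print(f"score = {score:.3f}, passes = {meets_standard(score, max_score=0.02)}")
```

Because the weights live in a single table, a team could swap in its own agreed values or add a style category without touching the scoring logic.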
The newest metric for error-typology-based evaluation is the MQM-DQF harmonized metric. LISA QA and SAE J2450 have not kept up with the times and lack the flexibility required in a world with much more diversified types of content. Because the need for an industry-wide quality metric was still great, both TAUS and DFKI decided to work on a new and better quality metric.
The TAUS Dynamic Quality Framework (DQF) was developed in consultation with TAUS members. DQF includes various tools for the evaluation of translation quality, the error typology being one of them. The Multidimensional Quality Metrics (MQM) framework is an error typology metric that was developed as part of the EU-funded QTLaunchPad project, based on careful examination and extension of existing quality models.
Despite the variety of approaches taken in industry and research, the two models turned out to be broadly similar, but they were also different in important ways due to their history. In a series of meetings the developers of MQM and DQF agreed to make substantive changes to both frameworks to bring them into harmony. The newly harmonized metric offers translation professionals a standard and dynamic model that can be used in every context. It can be used ‘stand-alone’ but is also available through the DQF open API.
Without measurement, there is no improvement. Companies become more efficient and deliver the required quality only if they generate statistics and act upon the data. To that end, standards and metrics need to be implemented in translation workflows and technologies so that data collection is streamlined, automated, and normalized. A number of CAT tools and TMSs have already implemented some of the metrics above. The most popular of these is the MQM-DQF harmonized metric, which is integrated in the main CAT tools. For more information, please refer to this page: https://www.taus.net/evaluate/plugins