3 Suggestions on How to Collaborate on Quality Estimation
There hasn’t been a single area of contemporary life that technology has left untouched. Technology enables us to reach further and extends the realm of our abilities. Translation technology in particular stretches the scope of cross-cultural activity, never more so than in our globalized and localized age, which in turn brings about several concerns, such as quality assurance.
Considering that the first computer was invented only 79 years ago, the current state of technology can be mind-boggling. Even the first use of today’s buzz term ‘artificial intelligence’ can be traced back only to the second half of the 20th century. Yet today we are actively developing artificial intelligence, translating with the help of machines and computer-aided tools, and letting software decide on the quality of our end product. This inevitably generates a dilemma between a quick-to-adapt, modern industry that would like to benefit from every bit of automation possible and an academia built on long-standing theoretical foundations.
The concept of quality estimation concerns both the research community and the professional field, and thus could be a perfect aid in this dilemma. Significant research on quality estimation metrics has been done in recent years; one example is QuEst++, open-source quality estimation software developed by Lucia Specia’s team at the University of Sheffield, with contributions from a number of other researchers. Thanks to such efforts, a general framework for building these metrics is already available. However, the metrics have only been tested in very narrow scenarios, for just a couple of language pairs and the datasets commonly used by the MT research community. Furthermore, the customization of such metrics to specific language pairs, text types and domains is still an open problem. The aim of these guidelines is to advance collaboration between industry and academia and to help close the persisting gap between the two communities in terms of requirements and opportunities related to quality estimation.
The road to achieving this starts with building a bridge between what the industry expects in terms of quality and what academia considers ideal. Divergence between the metrics used for MT system development and the metrics used during actual production is far from ideal. At the same time, the balance between norms and practices needs to be maintained through mutual effort. Here are three simple and brief suggestions on how this collaboration can be achieved:
1- Feed academia with the data abundant in industry
The lack of relevant data for training metrics has been the main hindrance in the area of quality estimation. Relevant data consists of a relatively small number (1,000+) of source and translation pairs for which a quality assessment has already been performed. Quality assessment can be based on several aspects, including post-edited translations, time measurements, edit distance metrics of the post-editing (PE) process, accuracy/fluency judgments, error counts and so on.
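To make one of these assessment signals concrete, the sketch below computes an HTER-style score: the word-level edit distance between an MT output and its post-edited version, normalized by the length of the post-edit. The example sentences and the exact normalization are illustrative assumptions, not taken from any particular dataset or toolkit.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences (single-row DP)."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                          # delete a[i-1]
                        dp[j - 1] + 1,                      # insert b[j-1]
                        prev + (a[i - 1] != b[j - 1]))      # substitute
            prev = cur
    return dp[n]

def hter(mt_output, post_edit):
    """Edit distance normalized by post-edit length (0.0 = no edits needed)."""
    mt, pe = mt_output.split(), post_edit.split()
    return edit_distance(mt, pe) / max(len(pe), 1)

# A perfect translation needs no edits; a reordered one needs several.
hter("the house is big", "the house is big")            # 0.0
hter("the house is blue big", "the house is big and blue")  # 0.5
```

Scores like these, collected over many segments during routine PE work, are exactly the kind of labels a quality estimation model can be trained to predict.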
At the initial stage, this type of data can be provided by the industry and used to train a number of variants of quality estimation metrics (based on different domains, document types, genres, post-editors, etc.) using the existing framework. Industry collaborators could then validate these metrics, for example by directly comparing their scores against those given by humans, or by using them to select relevant data samples for manual assessment (e.g. the cases estimated to have the lowest quality). Providing researchers with feedback on the quality metrics and on how they should be further adapted to particular scenarios could pave the way for substantial improvement of such metrics.
This type of data is readily available within the industry, which routinely needs to assess translation quality. A closer relationship between industry and academia would therefore enable research on better, reference-free automated evaluation metrics.
2- Instill academic findings back into the industry dynamic
Academic findings obtained in a ‘sterile’, lab-like environment are often brushed aside by industry members who work actively in the translation field. It is therefore vital that researchers adapt their findings to the particular scenarios that translation professionals experience day to day. This would eventually result in further improvement of quality assessment metrics. From the industry perspective, such a collaboration would mean better automated metrics that minimize the need for human assessment, and potentially better MT systems.
As mentioned before, academia needs more feedback and field information from the industry in order to focus its research on the problems the industry is facing. In that sense, it is crucial that academia not only receives feedback regarding problems, but also guides its research towards the solutions it can offer in response to the daily, practical problems of the industry. In addition, a crucial factor in boosting the use of quality estimation techniques is the availability of more, and more varied, types of data that can drive research to investigate new solutions. The industry, in turn, needs better research on quality evaluation metrics from academia, in terms of both usability and performance, in order to test the techniques and solutions it designs.
3- Be brave and take the collaboration onto high-tech platforms
Both parties may feel reluctant to alter their old ways of interacting with each other. However, the ever-changing innovation in translation continues to produce new methods. The recent development of automated systems revealed an appetite in the industry for a change from static, time-consuming models of translation quality estimation. The ‘one-size-fits-all’ approach, along with the little consideration given to variables such as type of content, end-user requirements and productivity, had to be replaced with adjustable, objective and innovative models. Platforms such as TAUS DQF (Dynamic Quality Framework) can be a great tool for facilitating the collaboration between academia and industry.
So, how can DQF do that? Simply by providing systematic ways of collecting and storing quality assessments (according to specific requirements for a given content type, audience, purpose, etc.) that can be used directly to train quality estimation metrics. By providing data-based, objective quality metrics, it can relieve both industry and academia of the burden of arguing over a standard framework. Instead, the platform can be adjusted depending on which metrics matter for which project, based on concrete data.
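To illustrate what “systematically collecting and storing quality assessments” might look like as data, here is a hypothetical per-segment record combining the translation, its context requirements and several assessment signals. The field names and values are purely illustrative assumptions, not the actual TAUS DQF schema.

```python
import json

# Hypothetical assessment record for one translated segment.
# A platform storing many such records would have, as a by-product,
# exactly the labeled data needed to train QE metrics per content
# type, audience and purpose.
record = {
    "source": "Das Haus ist groß.",
    "mt_output": "The house is large.",
    "post_edit": "The house is big.",
    "content_type": "marketing",       # project-specific requirement
    "audience": "general",
    "purpose": "publication",
    "assessments": {
        "adequacy": 4,                 # 1-5 human judgment
        "fluency": 5,                  # 1-5 human judgment
        "pe_time_seconds": 12,         # time spent post-editing
        "error_count": 1,              # annotated errors
    },
}

serialized = json.dumps(record, ensure_ascii=False, indent=2)
```

Because each record carries its content type, audience and purpose alongside the assessments, metrics can be trained or filtered per project rather than forced into a single one-size-fits-all framework.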
Another significant issue for which DQF can be useful is that recent research has put great emphasis on helping professional translators in their work and increasing their productivity, while much less effort has been devoted to supporting project managers. This target group should not be left out of future research on quality estimation. DQF is a platform that makes the translation process easier for both translators and project managers, by allowing users to track a diverse range of metrics, some essential for translators and others for project managers.
On the same note, quality estimation metrics could be integrated into such platforms to support human evaluation. Human evaluation and data-driven platforms can be used cooperatively rather than as alternatives; automated quality estimation and human evaluation can go hand in hand. This principle can also be adopted by academia to train the future members of the translation industry, which in turn would create a translation community more in tune with the expectations of its members and with the innovation surrounding it.