Earlier this month, we attended the TAUS Data Summit, a virtual event focused on language data. The summit brought together key stakeholders in the language industry, from owners and producers of data to neural machine translation (NMT) experts, to share knowledge and seek new ways to collaborate.
Over the course of the event, three key themes were identified as industry game changers: data, digitisation, and NMT.
The most salient point of the event was the change in mindset concerning data, dubbed the “Data-First” paradigm shift. Until now, the focus in machine learning was on obtaining as much data as possible in order to train efficient engines. Today, it is the quality of data that matters. Quality datasets allow machine translation (MT) engines to be trained much more quickly, minimise errors and nonsensical text (i.e. noise), and reduce the post-editing burden on the human linguist.
In fact, quality data is so important nowadays that it has its own marketplace. Several online platforms allow users to buy, sell and access data streams from a variety of fields. Buyers tend to be language service providers (LSPs), namely stakeholders looking to build and train MT engines for use on a commercial scale. Those who produce this data are usually specialised linguists who speak the language combinations involved, or who have access to large volumes of translated data.
Reconfiguring our understanding of how data is used, and how it can be valorised, has profound ramifications for the translation landscape. The industry itself has morphed into a digital environment, in a process that has impacted all aspects of the translation workflow. This is reflected in the increased use of translation management software, which streamlines and automates previously laborious project management tasks, and in the mounting pressure on linguists to incorporate technology into their work.
While this digital revolution was already in motion long before 2020, the COVID-19 pandemic and the widespread adoption of remote working have only intensified it. Everything is digital now, which will benefit those translators who are able to embrace new ways of working and diversify their services. On the other hand, linguists who cannot adapt well and quickly might see their work volumes drop as a result.
Although digitisation is here to stay, it is not yet the be-all and end-all of translation. In reality, the sector is in the midst of a paradigm shift: while new technologies are rapidly emerging, especially in the field of automated translation, the quality of their output and even their basic functioning are still very much dependent on human input. At present, humans still play a crucial role in the successful adoption and implementation of technology. Human expertise still adds considerable value, and will likely continue to do so for the foreseeable future, but in time the way these skills are used may change, transforming the traditional translation service.
This calls into question the mainstream understanding of machine translation as seeking to replace human translators or render them obsolete. Instead, the working model should be viewed as hybrid, merging machine translation with human expertise. One example of this approach is machine translation post-editing, referred to in the market as a “hybrid economic model”. In MT post-editing, a specially trained engine produces raw translated output, which is then edited and polished by an expert linguist. This dramatically decreases turnaround times and allows greater volumes of translation to be processed than in the traditional model.
There is, however, the issue of perception. At the moment, machine translation does not have the best reputation, and there are plenty of examples online that support this view. Although reputable LSPs will use quality engines, clients will still need time to adjust and warm to automated translation. It is therefore up to LSPs to build clients’ trust in NMT by identifying their needs, offering targeted, customised solutions, and educating them on the cost, speed and consistency benefits of the new service.
This data summit was an excellent source of updates on these paradigm-shifting technologies. DWL will continue to invest in NMT as a tool to support the work of our valued translators.