AppTek Launches New Metadata-Informed Neural Machine Translation System for Enterprises; Expands MT Language and Dialect Coverage

Apr 14, 2022, 08:34 ET

New state-of-the-art system offers enterprise customers and translation professionals advanced customization options for multi-domain, multi-dialect, multi-genre translations, boosting accuracy and further accelerating translation and localization workflows.

MCLEAN, Va., April 14, 2022 /PRNewswire/ -- AppTek, a leader in Artificial Intelligence (AI) and Machine Learning (ML) for Automatic Speech Recognition (ASR), Neural Machine Translation (NMT), Natural Language Processing / Understanding (NLP/U) and Text-to-Speech (TTS) technologies, today announced the release of its new neural machine translation system that incorporates metadata as inputs used to customize the MT output and empower localization professionals with more accurate user-influenced machine translations. Additionally, the company expanded its core machine translation platform to support hundreds of language and dialect pairs.

AppTek's new metadata-aware NMT system is changing the paradigm of how professional translators work with machine translation output. Until now, most off-the-shelf MT systems have functioned as a "black box": source-language text is converted into target-language text with little or no awareness of the surrounding context, domain or topic of the source text, and with limited control over the resulting output. Traditionally, enterprises have needed to train, deploy and maintain multiple MT systems to handle translation tasks that differ in language, dialect, domain, topic and more, at the risk of high deployment costs and overfitted models.

With AppTek's new metadata-informed NMT platform, enterprise customers can now access a single NMT system with multi-domain, multi-genre, multi-dialect content, which increases the quality and adaptability of the system. By feeding additional metadata into the system, they gain more control over the MT output, and translators can simply "flip the switch" to the desired customized translation through the relevant controls in the user interface of the editing tools professionals work with.

Examples of MT output customization achieved by using additional metadata include:

Style – switch between formal and informal styles, such as the difference between a telenovela and a documentary, and get a translation with an appropriate politeness register depending on speaker status and relationships;
Length Control for Automatic Dubbing and Subtitling Tasks – generate shorter or longer translations with minimal information loss or distortion for tasks with hard length constraints;
Speaker Gender – toggle to the correct speaker gender, which influences inflections for certain parts of speech, especially in morphologically rich languages such as Czech;
Domain – adapt to the genre of the text, such as news programs, patents, talk shows, etc., to increase overall accuracy and the use of in-domain, relevant translations of ambiguous words at the document level;
Extended Context – optionally make the system consider neighboring sentences within a document when translating a particular sentence, so that ambiguities such as pronoun translation can be resolved;
Glossary – account for official or mandatory translations which the system may otherwise translate differently; and,
Language Variety – account for multiple languages and dialects within a single system, and handle mixed-language content.
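
To make the idea of metadata as an additional input more concrete, here is a minimal sketch of one approach commonly described in the research literature: prepending pseudo-tokens to the source sentence. This release does not say how AppTek's system actually consumes metadata, so everything below is an assumption for illustration only. The class TranslationMetadata, the function build_tagged_input, the tag format and the "<sep>" separator are all hypothetical and are not part of any AppTek API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TranslationMetadata:
    """Hypothetical metadata bundle; field names are illustrative only."""
    style: Optional[str] = None             # e.g. "formal" or "informal"
    length: Optional[str] = None            # e.g. "short" or "long"
    speaker_gender: Optional[str] = None    # e.g. "female" or "male"
    domain: Optional[str] = None            # e.g. "news" or "talk_show"
    language_variety: Optional[str] = None  # e.g. a dialect code
    glossary: Dict[str, str] = field(default_factory=dict)  # source term -> required translation
    context: List[str] = field(default_factory=list)        # neighboring source sentences


def build_tagged_input(source: str, meta: TranslationMetadata) -> str:
    """Prefix the source sentence with pseudo-tokens that encode the metadata."""
    tags = []
    for name in ("style", "length", "speaker_gender", "domain", "language_variety"):
        value = getattr(meta, name)
        if value is not None:
            tags.append(f"<{name}:{value}>")

    # Pass along only the glossary entries whose source term occurs in this sentence.
    for term, target in sorted(meta.glossary.items()):
        if term.lower() in source.lower():
            tags.append(f"<term:{term}={target}>")

    # Extended context: neighboring sentences go before the sentence to translate.
    parts = list(tags)
    if meta.context:
        parts.append(" <sep> ".join(meta.context))
        parts.append("<sep>")
    parts.append(source)
    return " ".join(parts)


meta = TranslationMetadata(
    style="informal",
    speaker_gender="female",
    domain="talk_show",
    glossary={"host": "TARGET_TERM"},  # placeholder, not a real translation
    context=["Welcome back to the show."],
)
tagged = build_tagged_input("I am the host tonight.", meta)
print(tagged)
```

In a setup like this, a single model would be trained on inputs tagged in the same way, so that changing one tag at translation time (for example, style from "informal" to "formal") steers the output without switching to a different system.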

"By incorporating metadata to influence the MT output we are able to inject some 'world knowledge' into our platform," said Evgeny Matusov, AppTek's Lead Science Architect for Neural Machine Translation. "This improves the overall quality and adaptability of the system output and can be accomplished within a single multi-purpose system designed to reduce environmental footprint and cost."

AppTek's metadata-informed MT technology is now available for translation from English to selected European languages and their varieties, with more language pairs coming soon. The system can be customized and adapted to the needs of enterprise customers by utilizing existing parallel domain-specific translation corpora found inside company archives.

"As the demand for content localization continues to skyrocket, enterprises need to continue to innovate and find new ways to further accelerate production workflows," said Kyle Maddock, SVP Marketing at AppTek. "Our metadata-informed MT system has been specifically designed with translation professionals in mind, by providing them with more control over the MT output which can further speed up the localization process."

In addition to its metadata-informed NMT system, AppTek has also expanded its core MT platform to cover an extensive list of languages and dialects, including the addition of Indic and Slavic languages. It now supports Afrikaans, Albanian, Amharic, Arabic (multi-dialect), Armenian, Azerbaijani, Bengali, Belarusian, Bosnian, Bulgarian, Catalan, Chinese (multi-dialect), Croatian, Czech, Danish, Dari, Dutch, English (multi-dialect), Estonian, Farsi, Finnish, French (multi-dialect), Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Kyrgyz, Latvian, Lithuanian, Macedonian, Malay, Malayalam, Marathi, Mongolian, Norwegian, Pashto, Polish, Portuguese (multi-dialect), Punjabi, Romanian, Russian, Serbian, Slovak, Slovenian, Somali, Spanish (multi-dialect), Swedish, Tagalog, Tamil, Telugu, Tigrinya, Thai, Turkish, Turkmen, Ukrainian, Urdu and Uzbek.

For more information, visit www.apptek.com.

About AppTek
AppTek is a global leader in artificial intelligence (AI) and machine learning (ML) for automatic speech recognition (ASR), neural machine translation (NMT), natural language processing/understanding (NLP/U) and text-to-speech (TTS) technologies. The AppTek platform delivers industry-leading, real-time streaming and batch technology solutions in the cloud or on-premises for organizations across a breadth of global markets such as media and entertainment, call centers, government, enterprise business, and more. Built by scientists and research engineers who are recognized among the best in the world, AppTek's multidimensional 4D for HLT (human language technology) solutions, with a slice-and-dice methodology covering hundreds of languages/dialects, domains, channels and demographics, drive high-impact results with speed and precision. For more information, please visit http://www.apptek.com.

Media Contact:
Kyle Maddock
202-413-8654
[email protected]

SOURCE AppTek