Translators without Borders (TWB) has joined the Translation Initiative for COVID-19 (TICO-19). TICO-19 is focused on using language technology to make COVID-19 information available in as many languages as possible. TICO-19 includes translators, technologists, and researchers from TWB, Amazon, Appen, Carnegie Mellon University, Facebook, Google, Johns Hopkins University, Microsoft, and Translated, who are working together to develop efficient and scalable language technology for 37 languages, some of which are under-resourced by technology, such as Dari, Dinka, Hausa, Luganda, Pashto, and Zulu.
The translated content will focus on key COVID-19 terminology, ensuring COVID-19 information is more globally accessible and equitable.
“Language technology is a powerful tool that can help people communicate more consistently, quickly, and confidently about global issues like COVID-19. Yet many languages don’t have the necessary data needed to build this innovative technology,” explains Grace Tang, TWB’s Gamayun Program Manager. “We’re excited that industry leaders recognize this gap, and are working with us to develop technology that can help everyone communicate about COVID-19, no matter what language they speak.”
The initiative will develop translated datasets for approximately 70,000 key COVID-19 terms and phrases. The resulting datasets, machine translation engines, and translation memories will be made publicly accessible through TICO-19’s GitHub and TWB’s online language data portal to make sure this specialized content can inform future machine translation initiatives.
TWB brings language technology expertise to TICO-19, particularly for marginalized languages. Its language equality initiative, Gamayun, uses advanced language technology to increase language equality and improve two-way communication in marginalized languages. The ultimate goal is to allow everyone to give and receive information in the language and format they understand. TWB’s TICO-19 involvement builds on Gamayun experience gained through a successful pilot project that developed a machine translation engine for Levantine Arabic in Syria. In addition, TWB’s Gamayun initiative has built language datasets and machine translation engines for Rohingya, Tigrinya, Kanuri, Kurmanji, and other low-resource languages.