For some asylum seekers, machine translation errors may make the difference between protection and deportation.
What’s new: Faced with a shortage of human translators, United States immigration authorities are relying on AI to process asylum claims. Faulty translations are jeopardizing applications, The Guardian reported.
How it works: The Department of Homeland Security has said it would provide human translators to asylum seekers with limited English proficiency, but this doesn’t always happen. They often resort to machine translation instead.
- Immigration authorities use a variety of models. The Department of Homeland Security works with Lionbridge and TransPerfect. Immigration Services officials use Google Translate. U.S. Customs and Border Patrol developed a bespoke app, CBP Translate, that uses translations from Google Cloud.
- Minute errors frequently result in rejected asylum applications. For example, models sometimes translate first-person pronouns from asylum seekers’ native languages into “we” in English, leading authorities to believe that multiple people filed an application, which is illegal. In one instance, authorities dismissed a woman’s application after a translator rendered a colloquial reference for her abusive father as her “boss.”
- Some translations are barely comprehensible. An illiterate Brazilian Portuguese speaker was separated from his family and subsequently detained because a model mistranslated his asylum application into gibberish.
Behind the news: Diverse factors can mar a translation model’s output:
- Even widely spoken languages may suffer from a lack of training data. For instance, on Wikipedia, roughly the same number of articles are written in Swahili, which is spoken by roughly 80 million people, as Breton, a language with fewer than 250,000 speakers.
- Many models are trained to translate among several languages using English as an intermediary, but English words don’t always account for meanings in other languages. For instance, English uses one word for rice, while Swahili and Japanese have different words for cooked and uncooked rice. This may cause inaccurate or nonsensical Swahili-to-Japanese translations of sentences that include “rice.”
- A model trained on a language’s formal variation may not translate casual usage accurately. A translator trained on a language’s most common dialect may output more errors faced with a less common one.
- Translations of spoken language may suffer if a model’s training data did not contain audio examples of the speaker’s accent, pitch, volume, or pace.
Why it matters: Machine translation has come a long way in recent years, (as has the U.S. government’s embrace of AI to streamline immigration). Yet the latest models, as impressive as they are, were not designed for specialized uses like interviewing asylum candidates at border crossings, where people may express themselves in atypical ways because they’re exhausted, disoriented, or fearful.
We’re thinking: Justice demands that asylum seekers have their cases heard accurately. We call for significantly greater investment in translation technology, border-crossing workflows, and human-in-the-loop systems to make sure migrants are treated kindly and fairly.