Publication
PROPOR 2024
Workshop paper
Human Evaluation of the Usefulness of Fine-Tuned English Translators for the Guarani Mbya and Nheengatu Indigenous Languages
Abstract
We investigate how useful machine translators are when built by fine-tuning LLMs with the very small amounts of training data typical of extremely low-resource languages such as Indigenous languages. We started by developing translators for the Guarani Mbya and Nheengatu languages by fine-tuning a WMT-19 German-English translator. We then performed a human evaluation of the usefulness of the translations of the test sets and compared the results to their SacreBLEU scores. Alignment between the two was around 60-70%, although about 40% of the translations were very wrong. The results suggest the need for a filter for bad translations as a way to make the translators useful, possibly only in scenarios of human-AI collaboration such as writing-support assistants.