Publication
NAACL 2024
Keynote
Harnessing the Power of LLMs to Vitalize Indigenous Languages
Abstract
How can Large Language Models (LLMs) and modern NLP be used to increase the use and documentation of Indigenous languages that are in danger of disappearing? First, I report on the development of high-quality translators for Indigenous languages by fine-tuning state-of-the-art (SOTA) machine translators with tiny amounts of data, and discuss how to avoid some common pitfalls. Next, I present prototypes built with Indigenous communities that aim to stimulate and facilitate writing, using LLMs to create spell-checkers, next-word predictors, and similar tools. Finally, I discuss a future for documentation in which dying languages are preserved as interactive language models.