Low Resource Machine Translation

Funded by the NSF, Meta, and the US Army (2021-)

Project Description

This ongoing project aims to build translation technologies that cover all languages of the world, specifically going beyond the top 100 already well-supported languages. The aim includes not only text-based translation but also speech translation.

Participants

Faculty

Antonios Anastasopoulos, GMU

Students

Nathaniel Krasner, PhD CS

Chutong Meng, PhD CS

Publications

Acknowledgements

This project was supported by a Meta research award from 2022 to 2025. It is currently supported by an SBIR Phase II award in collaboration with Barron Associates, focusing on low-resource languages of the Indo-Pacific.

References

2025

  1. AmericasNLP
    Machine Translation Metrics for Indigenous Languages Using Fine-tuned Semantic Embeddings
    Nathaniel Krasner*, Justin Vasselli*, Belu Ticona, Antonios Anastasopoulos, and Chi-Kiu Lo
    In Proceedings of the Fifth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), May 2025
  2. VarDial
    Large Language Models as a Normalizer for Transliteration and Dialectal Translation
    Md Mahfuz Ibn Alam and Antonios Anastasopoulos
    In Proceedings of the 12th Workshop on NLP for Similar Languages, Varieties and Dialects, Jan 2025

2024

  1. LREC-COLING
    Language and Speech Technology for Central Kurdish Varieties
    Sina Ahmadi, Daban Jaff, Md Mahfuz Ibn Alam, and Antonios Anastasopoulos
    In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024
  2. WMT
    Findings of the WMT 2024 Shared Task of the Open Language Data Initiative
    Jean Maillard, Laurie Burchell, Antonios Anastasopoulos, Christian Federmann, Philipp Koehn, and Skyler Wang
    In Proceedings of the Ninth Conference on Machine Translation, Nov 2024
  3. IWSLT
    Findings of the IWSLT 2024 Evaluation Campaign
    Ibrahim Said Ahmad, Antonios Anastasopoulos, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, and 34 more authors
    In Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024), Aug 2024

2023

  1. IWSLT
    GMU Systems for the IWSLT 2023 Dialect and Low-resource Speech Translation Tasks
    Jonathan Mbuya and Antonios Anastasopoulos
    In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), Jul 2023

2022

  1. WMT
    Language Adapters for Large-Scale MT: The GMU System for the WMT 2022 Large-Scale Machine Translation Evaluation for African Languages Shared Task
    Md Mahfuz Ibn Alam and Antonios Anastasopoulos
    In Proceedings of the Seventh Conference on Machine Translation (WMT), Dec 2022