Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models (2401.05861v2)

Published 11 Jan 2024 in cs.CL

Abstract: The training paradigm for machine translation has gradually shifted from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on multilingual LLMs with high-quality translation pairs. In this paper, we focus on boosting many-to-many multilingual translation of LLMs with an emphasis on zero-shot translation directions. We demonstrate that prompt strategies adopted during finetuning are crucial to zero-shot translation and introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages and improve zero-shot translation performance. XConST is not a new method, but a version of CrossConST (Gao et al., 2023a) adapted for translation instruction finetuning with LLMs. Experimental results on ALMA (Xu et al., 2023), Tower (Team, 2024), and LLaMA-2 (Touvron et al., 2023) show that our approach consistently improves translation performance. Our implementations are available at https://github.com/gpengzhi/CrossConST-LLM.


