Empirical study of pretrained multilingual language models for zero-shot cross-lingual knowledge transfer in generation (2310.09917v3)

Published 15 Oct 2023 in cs.CL

Abstract: Zero-shot cross-lingual knowledge transfer enables a multilingual pretrained language model (mPLM), finetuned on a task in one language, to make predictions for this task in other languages. While broadly studied for natural language understanding tasks, this setting is understudied for generation. Previous works note a frequent problem of generation in the wrong language and propose approaches to address it, usually using mT5 as the backbone model. In this work, we test alternative mPLMs, such as mBART and NLLB-200, considering full finetuning and parameter-efficient finetuning with adapters. We find that mBART with adapters performs similarly to mT5 of the same size, and that NLLB-200 can be competitive in some cases. We also underline the importance of tuning the learning rate used for finetuning, which helps to alleviate the problem of generation in the wrong language.
