Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners (2405.13816v2)
Abstract: Large language models (LLMs) have recently shown impressive language capabilities, yet most existing LLMs perform very unevenly across languages. Multilingual alignment based on parallel translation data is an effective way to enhance their multilingual capabilities. In this work, we discover and comprehensively investigate the spontaneous improvement of multilingual alignment in LLMs. We find that instruction-tuning LLMs on question translation data alone (i.e., without annotated answers) encourages alignment between English and a wide range of languages, including languages unseen during instruction-tuning. We further use multiple evaluation settings and mechanistic interpretability methods to comprehensively analyze LLM behavior in the multilingual scenario. Our work suggests that LLMs have enormous potential for efficient multilingual alignment with strong generalization across languages and tasks.
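The abstract describes instruction-tuning on question translation data without annotated answers. As a rough illustration only, the sketch below shows one plausible way such data could be formatted as instruction-tuning records: each example asks the model to translate a non-English question into English, so the target is the translated question rather than an answer. The record fields (`instruction`/`input`/`output`) and the helper function are hypothetical and not taken from the paper.

```python
import json

def build_question_translation_examples(parallel_questions):
    """Turn parallel questions into instruction-tuning records with no answers.

    parallel_questions: list of dicts like
        {"lang": "Chinese", "question": "...", "english_question": "..."}
    """
    examples = []
    for item in parallel_questions:
        examples.append({
            "instruction": f"Translate the following {item['lang']} question into English.",
            "input": item["question"],
            # The supervision signal is the translated question itself,
            # not an annotated answer to the question.
            "output": item["english_question"],
        })
    return examples

if __name__ == "__main__":
    data = [{
        "lang": "Chinese",
        "question": "地球到月球的距离是多少？",
        "english_question": "What is the distance from the Earth to the Moon?",
    }]
    print(json.dumps(build_question_translation_examples(data),
                     ensure_ascii=False, indent=2))
```

Under this assumed setup, the resulting records could be fed to any standard supervised fine-tuning pipeline; the key property is that no task answers appear anywhere in the training targets.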