Emergent Mind

A Simple Approach to Learning Unsupervised Multilingual Embeddings

(2004.05991)
Published Apr 10, 2020 in cs.CL , cs.LG , and stat.ML

Abstract

Recent progress on unsupervised learning of cross-lingual embeddings in bilingual setting has given impetus to learning a shared embedding space for several languages without any supervision. A popular framework to solve the latter problem is to jointly solve the following two sub-problems: 1) learning unsupervised word alignment between several pairs of languages, and 2) learning how to map the monolingual embeddings of every language to a shared multilingual space. In contrast, we propose a simple, two-stage framework in which we decouple the above two sub-problems and solve them separately using existing techniques. The proposed approach obtains surprisingly good performance in various tasks such as bilingual lexicon induction, cross-lingual word similarity, multilingual document classification, and multilingual dependency parsing. When distant languages are involved, the proposed solution illustrates robustness and outperforms existing unsupervised multilingual word embedding approaches. Overall, our experimental results encourage development of multi-stage models for such challenging problems.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.