Cross-lingual Text Classification with Heterogeneous Graph Neural Network (2105.11246v1)

Published 24 May 2021 in cs.CL

Abstract: Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages, which is very useful for low-resource languages. Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks, but rarely consider factors beyond semantic similarity, causing performance degradation between some language pairs. In this paper we propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification using graph convolutional networks (GCN). In particular, we construct a heterogeneous graph by treating documents and words as nodes, and linking nodes with different relations, which include part-of-speech roles, semantic similarity, and document translations. Extensive experiments show that our graph-based method significantly outperforms state-of-the-art models on all tasks, and also achieves consistent performance gains over baselines in low-resource settings where external tools like translators are unavailable.
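
To make the graph construction described in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function name `build_hetero_graph`, the edge-set representation, the toy embeddings, and the similarity threshold are all assumptions made for illustration, and the part-of-speech tags are assumed to be precomputed. It covers only the step of linking document and word nodes with typed relations, not the GCN that would run over the result.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_hetero_graph(docs, translations, embeddings, sim_threshold=0.9):
    """Return {relation_name: set of (source_node, target_node)}.

    All argument shapes are hypothetical, chosen for this sketch:
    docs:         {doc_id: [(token, pos_tag), ...]}  (tags assumed precomputed)
    translations: [(doc_id, translated_doc_id), ...]
    embeddings:   {token: vector}                     (toy vectors)
    """
    edges = defaultdict(set)

    # Document-word edges, one relation type per part-of-speech tag.
    for doc_id, tagged_tokens in docs.items():
        for token, pos in tagged_tokens:
            edges[f"pos:{pos}"].add((("doc", doc_id), ("word", token)))

    # Word-word edges between tokens whose embeddings are similar enough.
    vocab = sorted(embeddings)
    for i, w1 in enumerate(vocab):
        for w2 in vocab[i + 1:]:
            if cosine(embeddings[w1], embeddings[w2]) >= sim_threshold:
                edges["similar"].add((("word", w1), ("word", w2)))

    # Document-document edges linking a document to its translation.
    for src, tgt in translations:
        edges["translation"].add((("doc", src), ("doc", tgt)))

    return dict(edges)


# Toy English/German example with made-up two-dimensional embeddings.
docs = {
    "en_1": [("good", "ADJ"), ("movie", "NOUN")],
    "de_1": [("guter", "ADJ"), ("Film", "NOUN")],
}
translations = [("en_1", "de_1")]
embeddings = {
    "good": [1.0, 0.0], "guter": [0.9, 0.1],
    "movie": [0.0, 1.0], "Film": [0.1, 0.9],
}
graph = build_hetero_graph(docs, translations, embeddings)
```

Keeping one edge set per relation type mirrors the idea of a heterogeneous graph, where each relation can later be given its own adjacency matrix or its own weights in a graph neural network.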

Authors (5)

Ziyun Wang (27 papers)
Xuan Liu (94 papers)
Peiji Yang (5 papers)
Shixing Liu (2 papers)
Zhisheng Wang (15 papers)

Citations (28)
