Mind Your Language: Abuse and Offense Detection for Code-Switched Languages (1809.08652v1)

Published 23 Sep 2018 in cs.CL

Abstract: In multilingual societies such as those of the Indian subcontinent, code-switched languages are popular and convenient for users. In this paper, we study offense and abuse detection in the code-switched pair of Hindi and English (i.e., Hinglish), the most widely spoken such pair. The task is made difficult by the non-fixed grammar, vocabulary, semantics, and spellings of Hinglish. We apply transfer learning and build an LSTM-based model for hate speech classification. This model surpasses the performance of the current best models, establishing itself as the state of the art in the unexplored domain of Hinglish offensive text classification. We also release our model and the trained embeddings for research purposes.

Authors (6)

Raghav Kapoor (7 papers)
Yaman Kumar (23 papers)
Kshitij Rajput (2 papers)
Rajiv Ratn Shah (108 papers)
Ponnurangam Kumaraguru (129 papers)
Roger Zimmermann (76 papers)

Citations (34)
