Emergent Mind

Multilingual Abusiveness Identification on Code-Mixed Social Media Text

(2204.01848)
Published Mar 1, 2022 in cs.CL , cs.LG , and cs.SI

Abstract

Social Media platforms have been seeing adoption and growth in their usage over time. This growth has been further accelerated with the lockdown in the past year when people's interaction, conversation, and expression were limited physically. It is becoming increasingly important to keep the platform safe from abusive content for better user experience. Much work has been done on English social media content but text analysis on non-English social media is relatively underexplored. Non-English social media content have the additional challenges of code-mixing, transliteration and using different scripture in same sentence. In this work, we propose an approach for abusiveness identification on the multilingual Moj dataset which comprises of Indic languages. Our approach tackles the common challenges of non-English social media content and can be extended to other languages as well.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.