
Mind Your Language: Abuse and Offense Detection for Code-Switched Languages (1809.08652v1)

Published 23 Sep 2018 in cs.CL

Abstract: In multilingual societies like the Indian subcontinent, code-switched languages are popular and convenient for users. In this paper, we study offense and abuse detection in the code-switched pair of Hindi and English (i.e., Hinglish), the most widely spoken such pair. The task is difficult because Hinglish has no fixed grammar, vocabulary, semantics, or spelling. We apply transfer learning and build an LSTM-based model for hate speech classification. This model surpasses the performance of the current best models, establishing itself as the state of the art in the unexplored domain of Hinglish offensive text classification. We also release our model and the trained embeddings for research purposes.

Authors (6)
  1. Raghav Kapoor (7 papers)
  2. Yaman Kumar (23 papers)
  3. Kshitij Rajput (2 papers)
  4. Rajiv Ratn Shah (108 papers)
  5. Ponnurangam Kumaraguru (129 papers)
  6. Roger Zimmermann (76 papers)
Citations (34)
