
Neural Models for Offensive Language Detection (2106.14609v1)

Published 30 May 2021 in cs.CL and cs.AI

Abstract: Offensive language detection is an ever-growing NLP application. This growth is mainly due to the widespread use of social networks, which have become a mainstream channel for people to communicate, work, and enjoy entertainment content. Many incidents of sharing aggressive and offensive content have negatively impacted society to a great extent. We believe that improving and comparing different machine learning models to fight such harmful content is an important and challenging goal for this thesis. We target the problem of building efficient automated models for offensive language detection. Recent advances in NLP models, specifically the Transformer, have tackled many shortcomings of standard seq-to-seq techniques, and the BERT model has shown state-of-the-art results on many NLP tasks, although the literature is still exploring the reasons for BERT's achievements in the NLP field. Other efficient variants, such as RoBERTa and ALBERT, have been developed to improve upon the standard BERT. Moreover, because the multilingual nature of text on social media could affect the model's decision on a given tweet, it is becoming essential to examine multilingual models such as XLM-RoBERTa, trained on 100 languages, and how they compare to unilingual models. The RoBERTa-based model proved to be the most capable and achieved the highest F1 score for the tasks. Another critical aspect of a well-rounded offensive language detection system is the speed at which a model can be trained and make inferences. In that respect, we considered model run-time and fine-tuned BlazingText, a very efficient implementation of FastText, which achieved good results while being much faster than BERT-based models.
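The abstract compares models by F1 score without saying how it is averaged. As a rough illustration only, the sketch below computes per-class F1 and its macro average for a binary offensive / not-offensive labelling. The label encoding (1 = offensive, 0 = not offensive), the function names, and the choice of macro averaging are assumptions made for this example and are not taken from the thesis.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for a single class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_score(y_true, y_pred, positive=l) for l in labels) / len(labels)


# Hypothetical labels: 1 = offensive, 0 = not offensive (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
print(f1_score(y_true, y_pred, positive=1))
print(macro_f1(y_true, y_pred))
```

Macro averaging weights both classes equally, which matters when offensive posts are a minority of the data; whether the thesis reports macro, micro, or positive-class F1 is not stated in the abstract.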

Citations (1)


Authors (1)