Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 158 tok/s
Gemini 2.5 Pro 47 tok/s Pro
GPT-5 Medium 29 tok/s Pro
GPT-5 High 29 tok/s Pro
GPT-4o 117 tok/s Pro
Kimi K2 182 tok/s Pro
GPT OSS 120B 439 tok/s Pro
Claude Sonnet 4.5 38 tok/s Pro
2000 character limit reached

Developing Safe and Responsible Large Language Model : Can We Balance Bias Reduction and Language Understanding in Large Language Models? (2404.01399v5)

Published 1 Apr 2024 in cs.CL

Abstract: LLMs have advanced various NLP tasks, such as text generation and translation, among others. However, these models often generate texts that can perpetuate biases. Existing approaches to mitigate these biases usually compromise knowledge retention. This study explores whether LLMs can produce safe, unbiased outputs without sacrificing knowledge or comprehension. We introduce the Safe and Responsible LLM (\textbf{SR}${\text{LLM}}$), which has been instruction fine-tuned atop of a safe fine-tuned auto-regressive decoder-only LLM to reduce biases in generated texts. We developed a specialized dataset with examples of unsafe and corresponding safe variations to train \textbf{SR}${\text{LLM}}$ to identify and correct biased text. Experiments on our specialized dataset and out-of-distribution test sets reveal that \textbf{SR}$_{\text{LLM}}$ effectively reduces biases while preserving knowledge integrity. This performance surpasses that of traditional fine-tuning of smaller LLMs and base LLMs that merely reply on prompting techniques. Our findings demonstrate that instruction fine-tuning on custom datasets tailored for tasks such as debiasing is a highly effective strategy for minimizing bias in LLM while preserving their inherent knowledge and capabilities. The code and dataset are accessible at \href{https://github.com/shainarazavi/Safe-Responsible-LLM}{SR-LLM}

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets

This paper has been mentioned in 1 tweet and received 0 likes.

Upgrade to Pro to view all of the tweets about this paper:

Youtube Logo Streamline Icon: https://streamlinehq.com