Large Language Models Understand and Can be Enhanced by Emotional Stimuli (2307.11760v7)

Published 14 Jul 2023 in cs.CL, cs.AI, and cs.HC

Abstract: Emotional intelligence significantly impacts our daily behaviors and interactions. Although LLMs are increasingly viewed as a stride toward artificial general intelligence, exhibiting impressive performance in numerous tasks, it is still uncertain if LLMs can genuinely grasp psychological emotional stimuli. Understanding and responding to emotional cues gives humans a distinct advantage in problem-solving. In this paper, we take the first step towards exploring the ability of LLMs to understand emotional stimuli. To this end, we first conduct automatic experiments on 45 tasks using various LLMs, including Flan-T5-Large, Vicuna, Llama 2, BLOOM, ChatGPT, and GPT-4. Our tasks span deterministic and generative applications that represent comprehensive evaluation scenarios. Our automatic experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts (which we call "EmotionPrompt" that combines the original prompt with emotional stimuli), e.g., 8.00% relative performance improvement in Instruction Induction and 115% in BIG-Bench. In addition to those deterministic tasks that can be automatically evaluated using existing metrics, we conducted a human study with 106 participants to assess the quality of generative tasks using both vanilla and emotional prompts. Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks (10.9% average improvement in terms of performance, truthfulness, and responsibility metrics). We provide an in-depth discussion regarding why EmotionPrompt works for LLMs and the factors that may influence its performance. We posit that EmotionPrompt heralds a novel avenue for exploring interdisciplinary knowledge for human-LLMs interaction.

Citations (85)

Summary

  • The paper reveals that EmotionPrompt significantly enhances LLM performance, achieving up to 115% improvement on complex tasks and a 10.9% boost in generative outcomes.
  • The study takes a novel approach by integrating emotional stimuli into prompts and evaluates diverse LLMs using standardized benchmarks and human assessments.
  • The research highlights that positive emotional cues enrich prompt representations, with larger models and higher-temperature inference settings showing the most substantial benefits.

LLMs Understand and Can be Enhanced by Emotional Stimuli

Introduction to EmotionPrompt

This paper explores the intersection of emotional intelligence and LLMs, investigating whether these models can comprehend and be improved by emotional stimuli. The authors introduce "EmotionPrompt," which integrates emotional stimuli into the existing prompt structures used for LLMs. The objective is to evaluate whether emotional cues can enhance LLM performance across a range of reasoning and generative tasks.

The authors performed thorough experiments using LLMs including Flan-T5-Large, Vicuna, Llama 2, BLOOM, ChatGPT, and GPT-4. The results demonstrated that EmotionPrompt reliably improves task performance, both on automatically evaluated deterministic tasks and in human evaluations of generative tasks (Figure 1).

Figure 1: An overview of our research from generating to evaluating EmotionPrompt.
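Conceptually, EmotionPrompt is simple string composition: the vanilla task prompt is suffixed with a short emotional stimulus. The sketch below is a minimal illustration, not the authors' code; the stimulus texts follow those reported in the paper (e.g., EP02), while the helper itself is an assumption for demonstration.

```python
# Minimal sketch of EmotionPrompt composition: append an emotional stimulus
# to the vanilla task prompt. Stimulus texts follow those reported in the
# paper; the helper is illustrative, not the authors' implementation.

EMOTIONAL_STIMULI = {
    "EP01": "Write your answer and give me a confidence score between 0-1 for your answer.",
    "EP02": "This is very important to my career.",
    "EP03": "You'd better be sure.",
}

def build_emotion_prompt(task_prompt: str, stimulus_id: str = "EP02") -> str:
    """Return the vanilla prompt suffixed with the chosen emotional stimulus."""
    return f"{task_prompt} {EMOTIONAL_STIMULI[stimulus_id]}"

print(build_emotion_prompt("Determine whether a movie review is positive or negative."))
# Determine whether a movie review is positive or negative. This is very important to my career.
```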

Experimental Results

Deterministic Task Performance

The experiments covered tasks from the Instruction Induction and BIG-Bench benchmarks to evaluate the effectiveness of EmotionPrompt under different settings. Instruction Induction tasks assess how well LLMs infer an underlying task from a few demonstrations and tend to be relatively straightforward. In contrast, BIG-Bench tasks pose more complex challenges that exceed the capabilities of many current LLMs.

On Instruction Induction, EmotionPrompt yielded an 8.00% relative improvement in task performance. On BIG-Bench, the results were even more striking, with a 115% relative improvement when emotional stimuli were integrated (Figure 2).

Figure 2: Results on 24 tasks from Instruction Induction.

Figure 3: Results on 21 tasks from BIG-Bench.
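For concreteness, these headline numbers are relative improvements, i.e. (EmotionPrompt score - vanilla score) / vanilla score. The tiny sketch below uses invented scores, not the paper's per-task numbers, purely to show how a 115%-style figure arises:

```python
# Relative improvement as used in the headline numbers; the example scores
# are invented for illustration, not taken from the paper's result tables.

def relative_improvement(vanilla: float, emotion: float) -> float:
    return (emotion - vanilla) / vanilla

print(f"{relative_improvement(0.40, 0.86):.0%}")  # 115%
```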

Human Study on Generative Tasks

A separate human study involved 106 participants and examined the impact of EmotionPrompt on generative tasks. Participants rated outputs on three metrics: performance, truthfulness, and responsibility, extending the evaluation beyond tasks with deterministic answers. EmotionPrompt significantly boosted generative quality, with an average 10.9% improvement across the three metrics (Figure 4).

Figure 4: The mean and standard deviation of the human study results on three metrics.
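As a hedged illustration of the aggregation behind Figure 4 (the ratings below are invented, not the study's data), each output is scored on the three metrics and summarized by mean and standard deviation:

```python
# Toy aggregation for the three human-study metrics; ratings are invented.
from statistics import mean, stdev

ratings = {
    "performance":    [4, 5, 3, 4, 4],
    "truthfulness":   [4, 4, 5, 3, 4],
    "responsibility": [5, 4, 4, 4, 3],
}
for metric, vals in ratings.items():
    print(f"{metric:>14}: mean={mean(vals):.2f}, sd={stdev(vals):.2f}")
```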

Truthfulness and Informativeness on TruthfulQA

Additional evaluations were carried out on the TruthfulQA dataset, a benchmark specifically concerned with truthfulness and informativeness. Incorporating EmotionPrompt improved truthfulness scores by 19% and informativeness scores by 12% (Figure 5).

Figure 5: Results on TruthfulQA. We use the best result of EmotionPrompt.
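The two TruthfulQA-style rates reduce to simple fractions once each answer has been judged; the paper relies on trained judge models, while the sketch below uses invented boolean judgments purely to show the computation:

```python
# Truthfulness = fraction of answers judged true; informativeness = fraction
# judged informative. The (is_true, is_informative) judgments are invented.

judged = [(True, True), (True, False), (False, True), (True, True)]
truthful = sum(t for t, _ in judged) / len(judged)
informative = sum(i for _, i in judged) / len(judged)
print(f"true: {truthful:.0%}, informative: {informative:.0%}")  # true: 75%, informative: 75%
```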

Mechanisms Behind EmotionPrompt

Role of Positive Words

The paper investigates why EmotionPrompt works by analyzing the attention contributions of input tokens. Positive words embedded in the emotional stimuli contribute substantially to the final outputs, and the stimuli enrich the representation of the original prompt, leading to improved results (Figure 6).

Figure 6: Contributions of positive words to the output on 8 tasks. Each word's contribution is computed from its attention contributions to the final outputs, and the vertical axis shows its importance score.
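A rough way to reproduce this kind of analysis on an open model is sketched below. This is not the authors' exact method: it simply averages attention over layers and heads and reads off how much the final position attends to each input token, using GPT-2 as a stand-in for the larger models analyzed in the paper.

```python
# Sketch of an attention-based word-importance analysis (not the authors'
# exact method): average attention over layers and heads, then inspect how
# much the final position attends to each input token.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_attentions=True)

prompt = "Determine the sentiment. This is very important to my career."
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions: one (batch, heads, seq, seq) tensor per layer.
att = torch.stack(out.attentions).mean(dim=(0, 2))   # average over layers and heads
scores = att[0, -1]                                  # final position -> each token

for token, score in zip(tok.convert_ids_to_tokens(inputs["input_ids"][0]), scores.tolist()):
    print(f"{token:>12}  {score:.3f}")
```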

Optimal Emotional Stimuli

The paper further explores which emotional stimuli are most effective by analyzing various configurations of emotion-infused prompts. Experiments showed that different tasks benefit from different emotional stimuli, suggesting that particular stimuli activate inherent characteristics of LLMs more effectively than others (Figure 7).

Figure 7: Performance of all emotional stimuli on Instruction Induction. The color of the bar represents the performance of each stimulus.
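Selecting the best stimulus per task is a straightforward sweep. In this sketch (not the authors' pipeline), `run_task` is a hypothetical scoring hook, mapping a prompt to a task score, that you would supply:

```python
# Illustrative per-task sweep over emotional stimuli (not the authors'
# pipeline). `run_task` is a hypothetical callable: prompt -> score in [0, 1].

def best_stimulus(task_prompt, stimuli, run_task):
    """Score every stimulus on a task and return the best one with all scores."""
    scores = {sid: run_task(f"{task_prompt} {text}") for sid, text in stimuli.items()}
    best = max(scores, key=scores.get)
    return best, scores
```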

Influencing Factors and Variability

The efficacy of EmotionPrompt is influenced by several factors, including model scale and pre-training strategy. Larger models tend to benefit more from emotional stimuli. Training choices such as supervised fine-tuning and reinforcement learning (e.g., RLHF) also affect how much EmotionPrompt helps across different LLM architectures.

Impact of Inference Settings

Experiments with temperature settings during inference revealed that EmotionPrompt yields larger gains at higher temperatures, indicating that it adds robustness under varying inference conditions (Figure 8).

Figure 8: Performance on various temperatures.
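A hedged sketch of such a temperature sweep follows, where `evaluate` is a hypothetical function that queries your model at a given temperature and returns a task score:

```python
# Compare vanilla vs. EmotionPrompt across temperatures. `evaluate` is a
# hypothetical hook (prompt, temperature) -> score; the stimulus is EP02
# from the paper.

def temperature_sweep(task_prompt, evaluate, temps=(0.0, 0.25, 0.5, 0.75, 1.0)):
    stimulus = "This is very important to my career."
    for t in temps:
        base = evaluate(task_prompt, temperature=t)
        emo = evaluate(f"{task_prompt} {stimulus}", temperature=t)
        print(f"T={t:.2f}  vanilla={base:.3f}  emotion={emo:.3f}  gain={emo - base:+.3f}")
```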

Conclusion

The research reaffirms that LLMs not only understand emotional stimuli but are also enhanced by them. EmotionPrompt offers a simple method for improving LLM performance across diverse tasks by leveraging psychological insights. Future work may explore the divergence between human and machine emotional intelligence and optimize pre-training strategies to incorporate emotion psychology more effectively into LLMs, potentially bridging AI and the social sciences.
