Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation (2405.06424v2)

Published 10 May 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Assessing response quality to instructions in LLMs is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into LLM training. Our method boosts the instruction following capability of LLMs by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances LLM training and paves a new way of harnessing uncertainty within LLMs.

Authors (14)

Jae Oh Woo (13 papers)
Juree Seok (2 papers)
Parisa Hassanzadeh (19 papers)
Wooseok Jang (12 papers)
JuYoun Son (2 papers)
Sima Didari (6 papers)
Baruch Gutow (1 paper)
Heng Hao (14 papers)
Hankyu Moon (6 papers)
Wenjun Hu (14 papers)
Yeong-Dae Kwon (11 papers)
Taehee Lee (6 papers)
Seungjai Min (7 papers)
Joonho Lee (104 papers)

Citations (1)
