Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 165 tok/s
Gemini 2.5 Pro 47 tok/s Pro
GPT-5 Medium 25 tok/s Pro
GPT-5 High 26 tok/s Pro
GPT-4o 81 tok/s Pro
Kimi K2 189 tok/s Pro
GPT OSS 120B 445 tok/s Pro
Claude Sonnet 4.5 35 tok/s Pro
2000 character limit reached

Inconsistencies in Masked Language Models (2301.00068v3)

Published 30 Dec 2022 in cs.CL and cs.AI

Abstract: Learning to predict masked tokens in a sequence has been shown to be a helpful pretraining objective for powerful LLMs such as PaLM2. After training, such masked LLMs (MLMs) can provide distributions of tokens in the masked positions in a sequence. However, this paper shows that distributions corresponding to different masking patterns can demonstrate considerable inconsistencies, i.e., they cannot be derived from a coherent joint distribution when considered together. This fundamental flaw in MLMs can lead to self-contradictory behaviors during inference. On various benchmark datasets including MMLU, MLMs can give different predictions to the same input question. From BERT-base to UL2-20B, we show that such inconsistencies exist ubiquitously in MLMs of diverse sizes and configurations. In light of our observations, we further propose an inference-time strategy for MLMs called Ensemble of Conditionals. It jointly considers a selected range of inconsistent conditionals directly produced by the MLM for the final prediction, which often leads to considerable accuracy improvement.

Citations (2)

Summary

We haven't generated a summary for this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets

This paper has been mentioned in 2 tweets and received 10 likes.

Upgrade to Pro to view all of the tweets about this paper: