Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation (2103.10918v2)
Abstract: The goal of a summary is to concisely state the most important information in a document. With this principle in mind, we introduce new reference-free summary evaluation metrics that use a pretrained LLM to estimate the information content shared between a document and its summary. These metrics are a modern take on the Shannon Game, a method for summary quality scoring proposed decades ago, in which we replace human annotators with LLMs. We also view these metrics as an extension of BLANC, a recently proposed approach to summary quality measurement based on the performance of an LLM with and without the help of a summary. Using transformer-based LLMs, we empirically verify that our metrics achieve state-of-the-art correlation with human judgement on the summary quality dimensions of coherence and relevance, as well as competitive correlation with human judgement on consistency and fluency.
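A minimal sketch of the idea described above (not the paper's exact Shannon Score formula): score a summary by how much it lowers the document's negative log-likelihood under a pretrained language model when the summary is supplied as context. The model choice ("gpt2"), the bare concatenation of summary and document, and the simple difference score are illustrative assumptions.

```python
# Sketch: information the summary provides about the document, measured as the
# drop in the document's negative log-likelihood when the summary is prepended.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def document_nll(document: str, context: str = "") -> float:
    """Total negative log-likelihood (nats) of the document tokens,
    optionally conditioned on a context string prepended to the document."""
    doc_ids = tokenizer(document, return_tensors="pt").input_ids
    if context:
        ctx_ids = tokenizer(context, return_tensors="pt").input_ids
        input_ids = torch.cat([ctx_ids, doc_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : ctx_ids.shape[1]] = -100  # exclude context tokens from the loss
    else:
        input_ids = doc_ids
        labels = doc_ids.clone()
    with torch.no_grad():
        loss = model(input_ids, labels=labels).loss  # mean NLL over scored tokens
    n_scored = (labels[:, 1:] != -100).sum().item()  # targets left after the causal shift
    return loss.item() * n_scored

def information_gain(document: str, summary: str) -> float:
    """How much the summary reduces the document's NLL: a rough proxy for the
    information content shared between summary and document."""
    return document_nll(document) - document_nll(document, context=summary)

print(information_gain("The cat sat on the mat because it was warm.",
                       "A cat rests on a warm mat."))
```

A fuller version would normalize this gain (for example, against the gain obtained when the document itself is the context) so that scores are comparable across documents of different lengths.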