FFCI: A Framework for Interpretable Automatic Evaluation of Summarization (2011.13662v3)
Abstract: In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences). We construct a novel dataset for focus, coverage, and inter-sentential coherence, and develop automatic methods for evaluating each of the four dimensions of FFCI based on cross-comparison of evaluation metrics and model-based evaluation methods, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores derived from 19 pre-trained LLMs. We then apply the developed metrics in evaluating a broad range of summarization models across two datasets, with some surprising findings.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.