Emergent Mind


Large language models (LLMs) have made remarkable strides in various tasks. Whether LLMs are competitive few-shot solvers for information extraction (IE) tasks, however, remains an open problem. In this work, we aim to provide a thorough answer to this question. Through extensive experiments on nine datasets across four IE tasks, we demonstrate that current advanced LLMs consistently exhibit inferior performance, higher latency, and increased budget requirements compared to fine-tuned small language models (SLMs) under most settings. Therefore, we conclude that LLMs are not effective few-shot information extractors in general. Nonetheless, we illustrate that with appropriate prompting strategies, LLMs can effectively complement SLMs and tackle challenging samples that SLMs struggle with. Moreover, we propose an adaptive filter-then-rerank paradigm to combine the strengths of LLMs and SLMs. In this paradigm, SLMs serve as filters and LLMs serve as rerankers. By prompting LLMs to rerank a small portion of difficult samples identified by SLMs, our preliminary system consistently achieves promising improvements (2.4% F1-gain on average) on various IE tasks, with an acceptable time and cost investment.


  • This study compares LLMs with fine-tuned Small Language Models (SLMs) on information extraction tasks and questions whether LLMs are competitive few-shot learners in this domain.

  • Across multiple datasets and tasks, SLMs generally outperform LLMs, except in extremely low-resource settings, while also offering lower latency and cost.

  • LLMs display a specialized ability to handle difficult samples that SLMs struggle with, suggesting an opportunity for strategic application.

  • The paper proposes an adaptive filter-then-rerank paradigm, which combines the strengths of SLMs with the niche capabilities of LLMs, to improve accuracy in IE tasks.


Recent discussion of LLMs has focused largely on their merits as few-shot learners. In information extraction (IE), however, their efficacy in few-shot settings is still very much in question. This study examines whether LLMs hold any advantage over Small Language Models (SLMs) across a range of popular IE tasks.

Performance Analysis of LLMs vs. SLMs

The comprehensive empirical evaluation conducted in this study spans nine datasets across four canonical IE tasks: Named Entity Recognition (NER), Relation Extraction (RE), Event Detection (ED), and Event Argument Extraction (EAE). The researchers systematically compared the performance of in-context learning via LLMs against fine-tuned SLMs, adopting multiple configurations simulating typical real-world low-resource settings. Surprisingly, the overarching finding is that, except for extremely low-resource situations, LLMs fall short against their SLM counterparts. SLMs not only demonstrated superior results but also exhibited lower latency and reduced operational costs.

Probing the Efficacy of LLMs in Sample Difficulty Stratification

A core part of the research was a fine-grained analysis of how LLMs and SLMs handle samples of differing difficulty. LLMs were observed to handle complex or 'hard' samples well, where SLMs would typically falter. This split suggests that while SLMs dominate overall, LLMs have a niche capability that can be exploited strategically on the difficult cases where SLMs struggle.

Adaptive Filter-then-rerank Paradigm

Building on these insights, the authors propose an adaptive filter-then-rerank paradigm. In this framework, SLMs first filter samples by confidence, splitting them into 'easy' and 'hard' categories. LLMs are then prompted to rerank only the small subset of hard samples. This hybrid system consistently achieved an average F1 improvement of 2.4% across IE tasks while keeping time and cost at an acceptable level, which indicates that the strengths of LLMs can be combined with the efficiency of SLMs.
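
The control flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name filter_then_rerank, the confidence threshold of 0.9, the top_k value of 3, and the llm_rerank callback (a stand-in for whatever prompt-based LLM call is used) are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# One candidate label paired with the SLM's confidence (e.g. a softmax probability).
Scored = Tuple[str, float]


@dataclass
class Prediction:
    label: str
    reranked: bool  # True if the sample was routed to the LLM


def filter_then_rerank(
    sample: str,
    slm_scores: Sequence[Scored],
    llm_rerank: Callable[[str, List[str]], str],  # hypothetical LLM wrapper
    threshold: float = 0.9,  # assumed cut-off, not taken from the paper
    top_k: int = 3,          # assumed number of candidates shown to the LLM
) -> Prediction:
    """Keep the SLM answer for easy samples; ask the LLM to rerank hard ones."""
    if not slm_scores:
        raise ValueError("slm_scores must contain at least one candidate")

    # Sort candidates from most to least confident.
    ranked = sorted(slm_scores, key=lambda pair: pair[1], reverse=True)
    best_label, best_conf = ranked[0]

    # Easy sample: the SLM is confident, so no LLM call is made.
    if best_conf >= threshold:
        return Prediction(best_label, reranked=False)

    # Hard sample: pass only the SLM's top-k candidates to the LLM.
    candidates = [label for label, _ in ranked[:top_k]]
    choice = llm_rerank(sample, candidates)

    # Fall back to the SLM's top label if the LLM answers outside the candidates.
    if choice not in candidates:
        choice = best_label
    return Prediction(choice, reranked=True)
```

Because the LLM is called only when the SLM's top confidence falls below the threshold, the share of samples that incur LLM latency and cost is controlled by that single parameter; in practice it would need to be tuned per task.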
