Prototypical Verbalizer for Prompt-based Few-shot Tuning

Published 18 Mar 2022 in cs.CL and cs.LG | (2203.09770v1)

Abstract: Prompt-based tuning for pre-trained language models (PLMs) has shown its effectiveness in few-shot learning. Typically, prompt-based tuning wraps the input text into a cloze question. To make predictions, the model maps the output words to labels via a verbalizer, which is either manually designed or automatically built. However, manual verbalizers heavily depend on domain-specific prior knowledge and human efforts, while finding appropriate label words automatically still remains challenging. In this work, we propose the prototypical verbalizer (ProtoVerb), which is built directly from training data. Specifically, ProtoVerb learns prototype vectors as verbalizers by contrastive learning. In this way, the prototypes summarize training instances and are able to enclose rich class-level semantics. We conduct experiments on both topic classification and entity typing tasks, and the results demonstrate that ProtoVerb significantly outperforms current automatic verbalizers, especially when training data is extremely scarce. More surprisingly, ProtoVerb consistently boosts prompt-based tuning even on untuned PLMs, indicating an elegant non-tuning way to utilize PLMs. Our code is available at https://github.com/thunlp/OpenPrompt.

Authors (5)

Citations (89)

Summary

The paper introduces ProtoVerb, a prototypical verbalizer that learns class-level prototypes via contrastive learning to improve few-shot tuning.
It applies prototype learning with the InfoNCE estimator to create semantic vectors from minimal data for tasks like topic classification and entity typing.
Experimental results show that ProtoVerb outperforms traditional verbalizers, boosting performance even with untuned pre-trained language models.

Prototypical Verbalizer for Prompt-based Few-shot Tuning

The paper "Prototypical Verbalizer for Prompt-based Few-shot Tuning" presents a method for improving prompt-based tuning of pre-trained language models (PLMs) in few-shot learning scenarios. The core contribution is a prototypical verbalizer (ProtoVerb) that uses prototype vectors learned directly from training data through contrastive learning.

Technical Background

Prompt-based tuning has emerged as an effective technique for few-shot learning, a setting in which conventional fine-tuning is limited by the gap between the pre-training objective and the downstream task. This gap is especially pronounced when task-specific data is scarce. Prompt-based methods address it by reframing a task as a cloze-style problem: a template wraps the input text around a masked slot, and a verbalizer maps the words the model predicts for that slot to task-specific labels.
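The template-plus-verbalizer idea can be illustrated with a minimal sketch. The template wording, the label words, and the scores below are all made up for illustration and are not taken from the paper; in a real setup the scores would come from a masked language model's output at the [MASK] position.

```python
from typing import Dict, List

def wrap_with_template(text: str) -> str:
    """Wrap an input sentence into a cloze question with a [MASK] slot."""
    # Hypothetical template; real templates are chosen per task.
    return f"A [MASK] news: {text}"

# Manual verbalizer: each label is tied to hand-picked label words (assumed).
VERBALIZER: Dict[str, List[str]] = {
    "sports": ["sports", "football"],
    "business": ["business", "finance"],
}

def predict_label(mask_word_scores: Dict[str, float]) -> str:
    """Average the [MASK] scores of each label's words and pick the best label."""
    label_scores = {
        label: sum(mask_word_scores.get(w, 0.0) for w in words) / len(words)
        for label, words in VERBALIZER.items()
    }
    return max(label_scores, key=label_scores.get)

prompt = wrap_with_template("The striker scored twice in the final.")
# Made-up scores standing in for a model's predictions at [MASK].
fake_scores = {"sports": 0.61, "football": 0.22, "business": 0.05, "finance": 0.02}
label = predict_label(fake_scores)
```

The hand-written VERBALIZER dictionary is the component that requires domain knowledge, which is the dependency an automatically built verbalizer aims to remove.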

Contribution of ProtoVerb

ProtoVerb takes a different approach to constructing verbalizers: instead of mapping labels to words, it learns one prototype vector per class through contrastive learning and uses those vectors as the verbalizer. This removes the need for manual verbalizer design, which typically requires extensive domain knowledge and effort, and avoids the difficulty of searching for suitable label words automatically.

Key Techniques:

Prototype Learning: ProtoVerb constructs prototype vectors that summarize class-level semantics, each acting as a central point for the instances of its class. This is achieved using a contrastive learning framework inspired by the PCL method, optimizing both instance-instance and instance-prototype objectives.
Contrastive Learning: The prototypes are trained using the InfoNCE estimator, which facilitates effective learning of class-level semantic representation with limited data.
Application Scope: ProtoVerb's efficacy is demonstrated in both topic classification and entity typing tasks, showing superior performance especially in scenarios with extremely limited data. Remarkably, it enhances model performance even without additional tuning of the PLMs, illustrating its utility as a plug-and-play component.
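The instance-prototype objective and prototype-based prediction described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses cosine similarity with no temperature, assumes non-zero vectors, omits the instance-instance objective, and only evaluates the loss rather than optimizing it. In practice the prototypes would be trainable parameters updated by gradient descent. All function names are hypothetical.

```python
import math
from typing import List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def instance_prototype_loss(
    instances: List[Sequence[float]],
    labels: List[int],
    prototypes: List[Sequence[float]],
) -> float:
    """InfoNCE-style loss: each instance should be closer to its own
    class prototype than to the prototypes of the other classes."""
    total = 0.0
    for x, y in zip(instances, labels):
        sims = [cosine(x, c) for c in prototypes]
        log_denominator = math.log(sum(math.exp(s) for s in sims))
        total += -(sims[y] - log_denominator)
    return total / len(instances)

def predict(x: Sequence[float], prototypes: List[Sequence[float]]) -> int:
    """Predict the class whose prototype is most similar to the instance."""
    sims = [cosine(x, c) for c in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy 2-D "instance representations" for two classes (made-up numbers).
instances = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = [0, 0, 1, 1]
aligned_prototypes = [[1.0, 0.0], [0.0, 1.0]]
swapped_prototypes = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = instance_prototype_loss(instances, labels, aligned_prototypes)
loss_swapped = instance_prototype_loss(instances, labels, swapped_prototypes)
predicted = predict([0.95, 0.05], aligned_prototypes)
```

Prototypes that sit near their class's instances give a lower loss than mismatched ones, which is the signal training would follow. Because prediction only needs instance representations and prototypes, the same scheme can in principle run on top of a frozen encoder.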

Experimental Evaluation

The paper reports experiments on multiple datasets showing that ProtoVerb significantly outperforms existing automatic verbalizers, such as search-based and soft verbalizers, particularly under few-shot conditions. Notably, ProtoVerb also improves performance when the PLM itself is left untuned, which suggests the prototypes can be learned on top of a frozen model.

Result Highlights:

ProtoVerb performs best in low-resource settings (1-2 shots), where it outperforms other automatic verbalizers and, in several instances, manual verbalizers as well.
In ensemble scenarios, where ProtoVerb is combined with manual or other verbalizer types, classification performance improves further, indicating that the prototypes carry information complementary to label words.
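One plausible way to combine scores from two verbalizers is sketched below. The summary does not say how the ensemble is formed, so the scheme here (z-score standardization of each verbalizer's class scores followed by an unweighted mean) is an assumption made for illustration, and the function names and numbers are hypothetical.

```python
import statistics
from typing import List

def standardize(scores: List[float]) -> List[float]:
    """Rescale one verbalizer's class scores to zero mean and unit variance."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    if std == 0:
        # All classes scored equally: this verbalizer carries no preference.
        return [0.0 for _ in scores]
    return [(s - mean) / std for s in scores]

def ensemble(score_lists: List[List[float]]) -> List[float]:
    """Average the standardized per-class scores across verbalizers."""
    standardized = [standardize(scores) for scores in score_lists]
    return [sum(column) / len(standardized) for column in zip(*standardized)]

# Made-up per-class scores on different scales for three classes.
manual_scores = [2.0, 1.0, 0.0]     # e.g. label-word logits
proto_scores = [0.10, 0.30, 0.20]   # e.g. similarities to prototypes
combined = ensemble([manual_scores, proto_scores])
best_class = max(range(len(combined)), key=combined.__getitem__)
```

Standardizing first keeps a verbalizer with a larger numeric range from dominating the average, which matters when logits and similarity scores are mixed.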

Implications and Future Directions

ProtoVerb addresses two limitations of current prompt-based tuning: the reliance on manually designed verbalizers and the difficulty of finding label words automatically. Its strong performance with very limited data motivates further work on automatically constructing other components of prompt-based learning frameworks.

Theoretical and Practical Implications:

ProtoVerb opens avenues for using prototype-based mechanisms more widely in NLP, particularly for classification problems where labeled data is costly or difficult to obtain.
Future research could explore the integration of ProtoVerb techniques with soft template frameworks or extend its utility to other tasks requiring non-tuning methods for PLMs.

Conclusion

Overall, the paper refines how verbalizers are constructed for prompt-based tuning, replacing hand-picked or searched label words with prototypes learned from a few training examples. By showing that this works even when the PLM is not tuned, ProtoVerb offers a simpler route to adapting PLMs to specific classification tasks.
