Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks

Published 22 Oct 2022 in cs.LG and cs.CL | (2210.13979v2)

Abstract: Large pretrained Transformer-based language models like BERT and GPT have changed the landscape of NLP. However, fine-tuning such models still requires a large number of training examples for each target task, so annotating multiple datasets and training these models on various downstream tasks becomes time-consuming and expensive. In this work, we propose a simple extension of Prototypical Networks for few-shot text classification. Our main idea is to replace the class prototypes with Gaussians and to introduce a regularization term that encourages the examples to cluster near the appropriate class centroids. Experimental results show that our method outperforms various strong baselines on 13 public and 4 internal datasets. Furthermore, we use the class distributions as a tool for detecting potential out-of-distribution (OOD) data points during deployment.
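The idea of Gaussian class prototypes can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the distance function, the form of the regularizer, or the OOD criterion, so everything below is an assumption made for illustration. The sketch assumes diagonal covariances, scores queries by Gaussian log-likelihood, uses the mean squared distance to the own-class centroid as a stand-in regularizer, and treats a low best-class log-likelihood as a sign of a possible OOD input. All function names are hypothetical, and the inputs are assumed to be precomputed text embeddings (for example from a BERT encoder).

```python
import numpy as np


def fit_gaussian_prototypes(support_x, support_y, eps=1e-3):
    """Estimate one diagonal Gaussian per class from support embeddings.

    Assumption: diagonal covariance with a small floor `eps` for stability.
    """
    classes = np.unique(support_y)
    means = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    variances = np.stack(
        [support_x[support_y == c].var(axis=0) + eps for c in classes]
    )
    return classes, means, variances


def log_likelihood(query_x, means, variances):
    """Log-density of every query under every class Gaussian.

    Returns an array of shape (n_query, n_class).
    """
    diff = query_x[:, None, :] - means[None, :, :]
    var = variances[None, :, :]
    return -0.5 * np.sum(diff ** 2 / var + np.log(2.0 * np.pi * var), axis=-1)


def predict_with_ood_score(query_x, classes, means, variances):
    """Predict the most likely class and return the best log-likelihood.

    Assumption: a low best log-likelihood suggests a possible OOD input.
    """
    ll = log_likelihood(query_x, means, variances)
    return classes[ll.argmax(axis=1)], ll.max(axis=1)


def compactness_penalty(support_x, support_y, classes, means):
    """Hypothetical regularizer: mean squared distance to the own-class centroid."""
    class_index = np.searchsorted(classes, support_y)
    return float(np.mean(np.sum((support_x - means[class_index]) ** 2, axis=1)))
```

In a training loop, a term like `compactness_penalty` would be added to the classification loss with a weighting coefficient, and the threshold on the OOD score would be chosen on held-out data. Both choices are left open here because the abstract does not state them.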