Foundation Model's Embedded Representations May Detect Distribution Shift

(2310.13836)
Published Oct 20, 2023 in cs.LG and cs.CL

Abstract

Sampling biases can cause distribution shifts between train and test datasets for supervised learning tasks, obscuring our ability to understand the generalization capacity of a model. This is especially important given the wide adoption of pre-trained foundation models -- whose behavior remains poorly understood -- for transfer learning (TL) tasks. We present a case study of TL on the Sentiment140 dataset and show that many pre-trained foundation models encode Sentiment140's manually curated test set $M$ differently from its automatically labeled training set $P$, confirming that a distribution shift has occurred. We argue that training on $P$ and measuring performance on $M$ is a biased measure of generalization. Experiments on pre-trained GPT-2 show that the features learnable from $P$ do not improve (and in fact hamper) performance on $M$. Linear probes on pre-trained GPT-2's representations are robust and may even outperform full fine-tuning, underscoring the importance of detecting distribution shift in train/test splits when interpreting models.
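
The probing setup described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, not the paper's exact pipeline: it assumes Hugging Face `transformers` and scikit-learn, mean-pools frozen GPT-2 hidden states into sentence embeddings, fits a logistic-regression linear probe on features from the training set $P$, and scores it on the test set $M$. The tweet texts and labels are placeholders rather than Sentiment140 itself, and the closing domain-classifier check is one common heuristic for spotting a representation-level shift between $P$ and $M$; the abstract does not specify which test the paper uses.

```python
# Minimal sketch, assuming Hugging Face `transformers` and scikit-learn are
# installed; the texts/labels below are placeholders, not Sentiment140 data.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import GPT2Model, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = GPT2Model.from_pretrained("gpt2").eval()

@torch.no_grad()
def embed(texts, batch_size=32):
    """Mean-pool GPT-2's final hidden states over non-padding tokens."""
    feats = []
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], return_tensors="pt",
                        padding=True, truncation=True, max_length=64)
        hidden = model(**enc).last_hidden_state              # (B, T, 768)
        mask = enc["attention_mask"].unsqueeze(-1).to(hidden.dtype)
        feats.append(((hidden * mask).sum(1) / mask.sum(1)).numpy())
    return np.concatenate(feats)

# Placeholder stand-ins for P (auto-labeled train) and M (manually curated test).
p_texts, p_labels = ["great day!", "awful service", "love it", "so bad"], [1, 0, 1, 0]
m_texts, m_labels = ["loving this", "so disappointing"], [1, 0]

# Linear probe: logistic regression on frozen GPT-2 features from P, scored on M.
probe = LogisticRegression(max_iter=1000).fit(embed(p_texts), p_labels)
print("probe accuracy on M:", probe.score(embed(m_texts), m_labels))

# One heuristic shift check (not necessarily the paper's test): if a classifier
# separates P embeddings from M embeddings well above chance on held-out data,
# the two sets occupy distinguishable regions of GPT-2's feature space.
X = np.concatenate([embed(p_texts), embed(m_texts)])
y = np.array([0] * len(p_texts) + [1] * len(m_texts))
shift_clf = LogisticRegression(max_iter=1000).fit(X, y)
print("domain-classifier accuracy (use a held-out split in practice):",
      shift_clf.score(X, y))
```

Mean pooling over the final layer is just one reasonable pooling choice for this sketch; the paper may probe other layers or pool differently.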
