On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification (2402.01274v3)
Abstract: In recent years, self-supervised learning has excelled at learning robust feature representations from unlabelled data. Networks pretrained through self-supervision serve as effective feature extractors for downstream tasks, including few-shot learning. While the evaluation of unsupervised approaches to few-shot learning is well established in imagery, it is notably absent in acoustics. This study addresses that gap by assessing the performance of large-scale self-supervised models in few-shot audio classification. Additionally, we explore the relationship between a model's few-shot learning capability and its performance on other downstream task benchmarks. Our findings reveal state-of-the-art performance on some few-shot problems, such as SpeechCommandsv2, as well as strong correlations between speech-based few-shot problems and various downstream audio tasks.
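The evaluation setup described in the abstract, a pretrained self-supervised network used as a frozen feature extractor for few-shot classification, can be illustrated with a short sketch. This is a minimal sketch, not the paper's exact protocol: the `embed` function is a hypothetical stand-in (a fixed random projection) for a real frozen encoder such as wav2vec 2.0, HuBERT or WavLM, and the 5-way 1-shot episode size, 16 kHz one-second clips and nearest-centroid classifier are illustrative assumptions.

```python
# Minimal sketch of frozen-encoder few-shot audio evaluation (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)
SAMPLE_RATE = 16000                                   # toy clip length: 1 s at 16 kHz
_PROJ = rng.standard_normal((SAMPLE_RATE, 256))       # stands in for a frozen SSL encoder


def embed(waveform: np.ndarray) -> np.ndarray:
    """Map a 1-D waveform to a fixed-size embedding (hypothetical encoder)."""
    return waveform @ _PROJ


def episode_accuracy(support, support_y, query, query_y):
    """Nearest-centroid few-shot classification: build one prototype per class
    from the support embeddings, then assign each query embedding to the class
    of its closest prototype."""
    classes = np.unique(support_y)
    prototypes = np.stack([support[support_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(query[:, None, :] - prototypes[None, :, :], axis=-1)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == query_y).mean())


# One toy 5-way 1-shot episode with 15 queries per class on dummy audio.
n_way, k_shot, n_query = 5, 1, 15
support = np.stack([embed(rng.standard_normal(SAMPLE_RATE)) for _ in range(n_way * k_shot)])
support_y = np.repeat(np.arange(n_way), k_shot)
query = np.stack([embed(rng.standard_normal(SAMPLE_RATE)) for _ in range(n_way * n_query)])
query_y = np.repeat(np.arange(n_way), n_query)
print(f"episode accuracy: {episode_accuracy(support, support_y, query, query_y):.3f}")
```

In practice, few-shot benchmarks of this kind typically average accuracy over many randomly sampled episodes and report a mean with a confidence interval; the single episode above is only for illustration.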