A Baseline for Few-Shot Image Classification (1909.02729v5)

Published 6 Sep 2019 in cs.LG, cs.CV, and stat.ML

Abstract: Fine-tuning a deep network trained with the standard cross-entropy loss is a strong baseline for few-shot learning. When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS and FC-100 with the same hyper-parameters. The simplicity of this approach enables us to demonstrate the first few-shot learning results on the ImageNet-21k dataset. We find that using a large number of meta-training classes results in high few-shot accuracies even for a large number of few-shot classes. We do not advocate our approach as the solution for few-shot learning, but simply use the results to highlight limitations of current benchmarks and few-shot protocols. We perform extensive studies on benchmark datasets to propose a metric that quantifies the "hardness" of a few-shot episode. This metric can be used to report the performance of few-shot algorithms in a more systematic way.

Citations (550)

Summary

  • The paper introduces transductive fine-tuning as a strong baseline that leverages unlabeled test (query) data to adapt both the feature extractor and the classifier.
  • It initializes the classifier from support samples, setting class weights to maximize cosine similarity with support features, and outperforms state-of-the-art methods on datasets like Mini-ImageNet and Tiered-ImageNet.
  • The method scales to large datasets such as ImageNet-21k, challenging the need for complex meta-learning algorithms.

Few-Shot Image Classification Through Transductive Fine-Tuning

The paper "A Baseline for Few-Shot Image Classification" presents a systematic approach to few-shot learning by advocating for transductive fine-tuning as a robust and straightforward baseline. This method challenges the intricate models dominating the few-shot learning space, demonstrating that simplicity paired with careful design choices yields competitive or superior results.

Overview

The authors propose that fine-tuning a deep neural network, initially trained with the standard cross-entropy loss, already provides strong performance in few-shot scenarios. Performance improves further with transductive fine-tuning, in which the unlabeled test (query) samples of an episode are used during inference. With a single set of hyperparameters, the approach outperforms state-of-the-art methods on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS, and FC-100.
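
Concretely (notation ours, following the paper's description), the transductive objective augments the standard cross-entropy loss on the labeled support set with the Shannon entropy of the model's predictions on the unlabeled query set:

```latex
\min_{\theta}\;
\frac{1}{N_s}\sum_{i=1}^{N_s} -\log p_{\theta}(y_i \mid x_i)
\;+\;
\frac{1}{N_q}\sum_{j=1}^{N_q} \mathbb{H}\big(p_{\theta}(\,\cdot \mid x'_j)\big)
```

where $(x_i, y_i)$ are the labeled support samples, $x'_j$ the unlabeled query samples, and $\mathbb{H}$ denotes Shannon entropy. The entropy term encourages confident (low-entropy) predictions on the episode's test data, which is what makes the procedure transductive.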

Key Contributions

  1. Transductive Fine-Tuning: The paper introduces a baseline that leverages unlabeled test data during fine-tuning, adapting a model trained on a separate meta-training dataset. Both the classifier and the feature extractor are updated using information from the specific task at hand (see the sketch after this list).
  2. Support-Based Initialization: Drawing on deep metric learning, the classifier weights are initialized from support-sample features, which maximizes the cosine similarity between class weights and support features.
  3. Benchmark Results: Across popular few-shot datasets, the method surpasses existing benchmarks without specialized training per dataset or per few-shot protocol.
  4. Scalability: The method is tested on the large-scale ImageNet-21k dataset, demonstrating its feasibility and robustness in few-shot scenarios with many classes.
  5. Episode Hardness: Through extensive studies on benchmark datasets, the authors propose a metric that quantifies the "hardness" of a few-shot episode, enabling more systematic reporting of few-shot performance.
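
A minimal PyTorch sketch of the first two ideas follows (our illustration, not the authors' released code); the cosine classifier, the temperature `temp`, and the optimizer settings are assumptions consistent with the paper's description:

```python
import torch
import torch.nn.functional as F

def support_based_init(backbone, support_x, support_y, num_classes):
    """Initialize classifier weights from support features.

    Each class weight is the normalized mean feature of that class's
    support samples, maximizing cosine similarity between weights and
    support features.
    """
    with torch.no_grad():
        feats = F.normalize(backbone(support_x), dim=-1)          # (N_s, D)
    weights = torch.stack(
        [feats[support_y == c].mean(dim=0) for c in range(num_classes)]
    )                                                             # (C, D)
    return F.normalize(weights, dim=-1).clone().requires_grad_()

def transductive_loss(backbone, weights, support_x, support_y, query_x,
                      temp=10.0):
    """Cross-entropy on labeled support + Shannon entropy on unlabeled queries."""
    s_logits = temp * F.normalize(backbone(support_x), dim=-1) @ weights.t()
    q_logits = temp * F.normalize(backbone(query_x), dim=-1) @ weights.t()
    ce = F.cross_entropy(s_logits, support_y)
    q_probs = q_logits.softmax(dim=-1)
    entropy = -(q_probs * q_probs.clamp_min(1e-12).log()).sum(-1).mean()
    return ce + entropy

# Per-episode usage (sketch): jointly fine-tune the backbone and classifier
# for a few steps on one few-shot episode.
#   weights = support_based_init(backbone, sx, sy, num_classes=5)
#   opt = torch.optim.Adam(list(backbone.parameters()) + [weights], lr=5e-5)
#   for _ in range(25):
#       loss = transductive_loss(backbone, weights, sx, sy, qx)
#       opt.zero_grad(); loss.backward(); opt.step()
```

Because the entropy term touches only unlabeled query inputs, dropping it recovers ordinary (inductive) fine-tuning with the same initialization.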

Numerical Results and Claims

The proposed approach achieves notable accuracies, such as 68.11% on the 1-shot, 5-way Mini-ImageNet task, exceeding prior methods. On ImageNet-21k, it reaches 58.04% in the 5-shot, 20-way setting, demonstrating applicability to large-scale tasks.

Theoretical and Practical Implications

Theoretically, this paper challenges the perceived necessity of complex meta-learning algorithms, suggesting that improvements may instead derive from leveraging traditional supervised learning techniques alongside transductive methods. Practically, it opens new avenues for few-shot systems by emphasizing simplicity, robustness, and efficiency, particularly in dealing with vast and heterogeneous datasets.

Future Directions

The implications of transductive fine-tuning suggest potential exploration into hybrid models combining transduction with other semi-supervised learning techniques. Additionally, while the paper deliberately uses a single hyperparameter configuration across datasets, per-dataset tuning could yield further gains, at the cost of some generality.

Conclusion

The paper advocates a reevaluation of the few-shot learning landscape, showing that a simple transductive baseline is both reliable and scalable. The authors do not present it as the solution to few-shot learning; rather, the results highlight limitations of current benchmarks and protocols, and the proposed hardness metric offers a more systematic way to report the performance of few-shot algorithms.
