FOSTER: Feature Boosting and Compression for Class-Incremental Learning (2204.04662v2)

Published 10 Apr 2022 in cs.CV and cs.LG

Abstract: The ability to learn new concepts continually is necessary in this ever-changing world. However, deep neural networks suffer from catastrophic forgetting when learning new categories. Many works have been proposed to alleviate this phenomenon, whereas most of them either fall into the stability-plasticity dilemma or take too much computation or storage overhead. Inspired by the gradient boosting algorithm to gradually fit the residuals between the target model and the previous ensemble model, we propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively. Specifically, we first dynamically expand new modules to fit the residuals between the target and the output of the original model. Next, we remove redundant parameters and feature dimensions through an effective distillation strategy to maintain the single backbone model. We validate our method FOSTER on CIFAR-100 and ImageNet-100/1000 under different settings. Experimental results show that our method achieves state-of-the-art performance. Code is available at: https://github.com/G-U-N/ECCV22-FOSTER.

Citations (191)

Summary

  • The paper introduces a two-stage framework, inspired by gradient boosting, to reduce forgetting in class-incremental learning.
  • The method boosts features by dynamically expanding new modules that fit the residual errors of the previous model, improving accuracy on new classes.
  • A compression stage based on knowledge distillation folds the expanded model back into a single compact backbone, keeping the approach scalable.

An Overview of FOSTER: Feature Boosting and Compression for Class-Incremental Learning

The paper "FOSTER: Feature Boosting and Compression for Class-Incremental Learning" addresses the challenge of catastrophic forgetting in deep neural networks when they are tasked with class-incremental learning. This problem arises as models struggle to retain previously learned knowledge when adapting to new classes, primarily due to limited storage for prior data and the resulting imbalanced presentation of new instances. The authors introduce a novel approach, FOSTER, which utilizes the concept of gradient boosting to systematically enhance and adapt features for incremental tasks, consequently mitigating forgetting while preserving model compactness.

Key Innovations and Approach

  1. Gradient Boosting Paradigm:
    • Inspired by the gradient boosting methodology, the authors propose a two-stage learning framework to address the balance between stability and plasticity in class-incremental learning. This framework allows the model to incrementally fit the residual errors between its current predictions and actual labels, effectively complementing the existing representation with additional modules.
  2. Feature Boosting:
    • The first stage dynamically introduces new modules to fit the residual task, improving the model's ability to learn new categories as the data distribution shifts across incoming tasks. This is realized through modifications to both the architecture and the learning objective, so that the new modules focus on what distinguishes the new classes while the consolidated representation of older classes is preserved (see the boosting sketch after this list).
  3. Feature Compression:
    • To counteract unbounded expansion, which would increase computational overhead and invite overfitting, the authors add a compression stage. Through knowledge distillation, redundant parameters and feature dimensions are folded back into a single backbone network without substantial loss in performance, which is critical for keeping the approach scalable over long sequences of tasks (a distillation sketch also follows this list).
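
As referenced in item 2, here is a minimal PyTorch-style sketch of the boosting stage under our own assumptions: the previous backbone is frozen, a new trainable backbone is added, and a classifier over the concatenated features learns the enlarged label space. The module and variable names (`BoostedNet`, `old_backbone`, `new_backbone`, `feat_dim`) are illustrative, not taken from the released code.

```python
import torch
import torch.nn as nn

class BoostedNet(nn.Module):
    """Stage 1 (feature boosting): frozen old backbone plus a new trainable branch."""

    def __init__(self, old_backbone: nn.Module, new_backbone: nn.Module,
                 feat_dim: int, num_classes: int):
        super().__init__()
        self.old_backbone = old_backbone            # trained on previous tasks
        self.new_backbone = new_backbone            # freshly added module
        for p in self.old_backbone.parameters():    # keep the old representation fixed
            p.requires_grad = False
        # Unified classifier over the concatenated (old + new) feature space.
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            f_old = self.old_backbone(x)            # stable features for old classes
        f_new = self.new_backbone(x)                # residual features for new classes
        return self.classifier(torch.cat([f_old, f_new], dim=1))
```

Only `new_backbone` and the classifier receive gradients, so the old representation stays intact while the new branch learns what the frozen model cannot explain.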

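Item 3's compression stage can be sketched similarly, again as an assumption-laden illustration rather than the authors' implementation: the expanded two-branch model serves as the teacher, and its logits are distilled into a single compact student backbone with a temperature-scaled KL loss.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, x, optimizer, temperature: float = 2.0):
    """Stage 2 (feature compression): distill the boosted teacher into one backbone."""
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(x)                       # soft targets from the expanded model
    s_logits = student(x)                           # compact single-backbone student
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2                            # standard KD temperature scaling
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

After distillation the teacher's extra branch can be discarded, so memory and compute costs return to those of a single backbone.
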
Experimental Validation

The authors validate FOSTER on standard class-incremental learning benchmarks, CIFAR-100 and ImageNet-100/1000, across various settings. The experimental results show that FOSTER consistently achieves state-of-the-art performance, outperforming contemporary methods in both accuracy and stability across tasks. Ablation studies further support the method's robustness by isolating the contributions of its boosting and balancing mechanisms.

Practical and Theoretical Implications

  • Practical: FOSTER provides a feasible solution for real-world applications that require continual updates without sacrificing past knowledge. Its compression step keeps the model size fixed over time, which matters in memory- and compute-constrained environments.
  • Theoretical: The paper offers a new perspective on extending ensemble learning techniques, such as gradient boosting, to a dynamic framework in which neural networks adapt continually. It underscores the potential of integrating classical boosting strategies with neural models, serving as a catalyst for future research on continual learning paradigms.

Future Directions

The integration of feature boosting and compression suggests several avenues for further research: finer-grained knowledge distillation, alternative boosting architectures, and more sophisticated balancing strategies for the stability-plasticity trade-off. Investigating objective functions that complement the current framework could also push incremental learning performance further.

In conclusion, FOSTER emerges as an insightful contribution to the class-incremental learning domain, blending techniques from ensemble learning with contemporary neural network strategies to address the enduring challenges of catastrophic forgetting while maintaining model efficiency.