Adaptive Aggregation Networks for Class-Incremental Learning (2010.05063v3)

Published 10 Oct 2020 in cs.CV and stat.ML

Abstract: Class-Incremental Learning (CIL) aims to learn a classification model with the number of classes increasing phase-by-phase. An inherent problem in CIL is the stability-plasticity dilemma between the learning of old and new classes, i.e., high-plasticity models easily forget old classes, but high-stability models are weak to learn new classes. We alleviate this issue by proposing a novel network architecture called Adaptive Aggregation Networks (AANets), in which we explicitly build two types of residual blocks at each residual level (taking ResNet as the baseline architecture): a stable block and a plastic block. We aggregate the output feature maps from these two blocks and then feed the results to the next-level blocks. We adapt the aggregation weights in order to balance these two types of blocks, i.e., to balance stability and plasticity, dynamically. We conduct extensive experiments on three CIL benchmarks: CIFAR-100, ImageNet-Subset, and ImageNet, and show that many existing CIL methods can be straightforwardly incorporated into the architecture of AANets to boost their performances.

Authors (3)
  1. Yaoyao Liu (19 papers)
  2. Bernt Schiele (210 papers)
  3. Qianru Sun (65 papers)
Citations (193)

Summary

  • The paper presents Adaptive Aggregation Networks that integrate stable and plastic blocks to overcome catastrophic forgetting in class-incremental learning.
  • It employs dynamic weighting with a bilevel optimization framework to aggregate outputs from specialized residual blocks.
  • The approach achieves up to a 6% accuracy boost on CIFAR-100 while operating under strict memory constraints.

Adaptive Aggregation Networks for Class-Incremental Learning

The paper "Adaptive Aggregation Networks for Class-Incremental Learning" investigates a novel approach to the stability-plasticity dilemma inherent in Class-Incremental Learning (CIL) systems. CIL tasks require a model to incrementally learn new classes while retaining the knowledge of previously learned classes. The stability-plasticity dilemma arises because high-stability models resist learning new classes, while high-plasticity models tend to forget previously learned classes, a phenomenon known as catastrophic forgetting.

In tackling this challenge, the authors propose an innovative network architecture called Adaptive Aggregation Networks (AANets). This architecture builds upon the fundamental ResNet architecture by incorporating two distinct types of residual blocks at each network level: a stable block and a plastic block. The stable block maintains previously acquired knowledge with fewer learnable parameters, whereas the plastic block adapts to new information with a greater number of learnable parameters. The outputs of these blocks are aggregated using dynamic weights that balance stability and plasticity, with the weights adapted through end-to-end optimization.
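
To make the aggregation concrete, below is a minimal PyTorch sketch of one AANets-style residual level: a plastic block and a stable block whose outputs are mixed by learnable aggregation weights. The block definitions and the softmax parameterization of the weights are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class AdaptiveAggregationLevel(nn.Module):
    """One residual level with a plastic and a stable branch whose outputs
    are mixed by learnable aggregation weights (a sketch under assumed
    parameterization, not the paper's exact code)."""

    def __init__(self, plastic_block: nn.Module, stable_block: nn.Module):
        super().__init__()
        self.plastic = plastic_block   # more learnable parameters, adapts to new classes
        self.stable = stable_block     # largely frozen, preserves old-class knowledge
        # One aggregation weight per branch, adapted dynamically during training.
        self.alpha = nn.Parameter(torch.tensor([0.5, 0.5]))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Softmax keeps the two weights positive and summing to one (an assumption
        # made here for stability of the illustration).
        w = torch.softmax(self.alpha, dim=0)
        return w[0] * self.plastic(x) + w[1] * self.stable(x)
```

In this sketch, the learned weights shift the level's output toward the plastic branch when new classes demand adaptation and toward the stable branch when retention matters more, which is the balancing behavior the paper describes.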

The authors conducted extensive experiments on three popular benchmarks: CIFAR-100, ImageNet-Subset, and the full ImageNet dataset. The experimental results demonstrate that AANets enhance the performance of several existing state-of-the-art CIL methods, such as iCaRL, LUCIR, Mnemonics Training, and PODNet. Notably, AANets achieved improved accuracy while incurring only minor memory overheads, even when confined to a strict memory budget.

Key Numerical Results

The incorporation of AANets led to substantial performance improvements across all evaluated benchmarks. For instance, AANets combined with LUCIR achieved a notable increase in classification accuracy, particularly over longer class sequences, with improvements of up to 6% observed on the CIFAR-100 dataset under a 25-phase setting. Similar gains were recorded on the ImageNet-based benchmarks. Remarkably, even in scenarios requiring strict memory management, AANets maintained their superior performance.

Implications and Future Directions

The introduction of AANets provides both theoretical and practical advancements in the field of continual learning. By dynamically balancing the stability and plasticity within neural networks, AANets offer an effective solution to mitigate catastrophic forgetting in CIL systems. Moreover, the proposed bilevel optimization framework for adapting aggregation weights presents a promising avenue for future research in enhancing adaptability and efficiency in machine learning models.
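
As a rough illustration of how such a bilevel scheme can be organized, the sketch below alternates between updating the residual-block parameters on new-phase data and updating the aggregation weights on stored exemplars. The method names `model.block_parameters()` and `model.aggregation_weights()` are hypothetical placeholders, and the schedule is a simplification rather than the paper's exact procedure.

```python
import torch

def train_one_phase(model, train_loader, exemplar_loader, loss_fn,
                    epochs=1, inner_lr=0.1, outer_lr=0.01):
    """Alternating (bilevel-style) updates: block parameters are fit on
    new-phase data, aggregation weights are tuned on stored exemplars.
    A hedged sketch; `block_parameters` and `aggregation_weights` are
    hypothetical accessors, not part of the authors' API."""
    inner_opt = torch.optim.SGD(model.block_parameters(), lr=inner_lr)
    outer_opt = torch.optim.SGD(model.aggregation_weights(), lr=outer_lr)

    for _ in range(epochs):
        # Inner problem: adapt the residual blocks to the new classes.
        for x, y in train_loader:
            inner_opt.zero_grad()
            loss_fn(model(x), y).backward()
            inner_opt.step()

        # Outer problem: rebalance stability vs. plasticity on old-class exemplars.
        for x, y in exemplar_loader:
            outer_opt.zero_grad()
            loss_fn(model(x), y).backward()
            outer_opt.step()
```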

Given their adaptable and integrative design, AANets have the potential to be seamlessly incorporated into various architectures and learning paradigms beyond class-incremental learning. Future research may explore extending these principles to task-incremental and domain-incremental learning settings, as well as hybrid models that cater to heterogeneous data streams. Additionally, investigating the theoretical underpinnings of the stability-plasticity trade-off in deep networks could yield further insights into developing more robust learning models.

Overall, the paper contributes significantly to addressing one of the most enduring challenges in AI—ensuring that learning models can efficiently adapt to new information without sacrificing previously acquired knowledge. Through the novel architecture of Adaptive Aggregation Networks, this research enriches the ongoing discourse around scalable and sustainable learning systems.