- The paper presents CurricularFace, which integrates curriculum learning into the loss function, adaptively emphasizing easy samples early in training and hard samples later.
- The key modulation parameter t is estimated automatically via an exponential moving average of positive cosine similarities, improving convergence stability and robustness.
- The method outperforms state-of-the-art models on benchmarks like LFW and MegaFace, particularly excelling with smaller architectures and challenging conditions.
CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition
The paper introduces CurricularFace, an adaptive curriculum learning loss for deep face recognition. The work is motivated by the limitations of conventional margin-based and mining-based strategies for training convolutional neural networks (CNNs) on face recognition tasks.
Methodology
CurricularFace integrates the principles of curriculum learning directly into the loss function, modulating the training objective according to sample difficulty: easy samples are emphasized during the initial stages and hard samples in later phases, with the emphasis adjusting dynamically as training progresses.
This adaptive approach contrasts with previous methods, in which either sample importance was ignored entirely (underutilizing hard samples) or hard samples were emphasized from the very start of training, risking convergence problems. CurricularFace strikes a balance by adjusting the modulation coefficients of the negative cosine similarities with an automatically estimated parameter t, derived from a moving average of the positive cosine similarities.
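A minimal PyTorch-style sketch of this modulation is shown below. The function name, tensor shapes, and the exact comparison used to flag hard samples are illustrative assumptions, not the paper's reference implementation.

```python
import torch

def modulate_negative_cosines(cos_pos_m, cos_neg, t):
    """Hedged sketch of CurricularFace-style modulation of negative logits.

    cos_pos_m : margin-penalized target cosine, cos(theta_y + m), shape (B, 1)
    cos_neg   : cosine similarities to non-target classes, shape (B, C-1)
    t         : adaptively estimated scalar (see the EMA sketch further below)
    """
    # A sample/class pair counts as "hard" when the negative similarity exceeds
    # the margin-penalized positive one, i.e. the sample is (nearly) misclassified.
    hard = cos_neg > cos_pos_m
    # Easy pairs keep cos(theta_j); hard pairs become cos(theta_j) * (t + cos(theta_j)).
    # With t near 0 early in training this down-weights hard pairs; as t grows,
    # the same pairs receive increasing emphasis.
    return torch.where(hard, cos_neg * (t + cos_neg), cos_neg)
```

The modulated negative logits, together with the margin-penalized positive logit, are then scaled and fed into a standard softmax cross-entropy, as in other margin-based losses.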
Key Technical Details
- Softmax-based Classification Loss: Most traditional methods use it, but the paper argues it lacks discriminative power because it treats every sample identically, regardless of difficulty.
- Margin-based and Mining-based Loss Functions: Methods like ArcFace apply a fixed margin that ignores sample importance, while MV-Arc-Softmax emphasizes hard samples irrespective of the training stage; CurricularFace proposes an intermediate, adaptive strategy.
- Adaptive Curriculum Design: Samples are not ordered by a predefined difficulty; instead, difficulty is assessed dynamically in each mini-batch based on the angle θ between a sample's feature and the class weight vectors, with a sample treated as hard when its margin-penalized positive similarity falls below a negative one.
- Adaptive Estimation of t: This parameter, central to the strategy, is not manually tuned but estimated via an exponential moving average of the positive cosine similarities across mini-batches, providing stability while adapting to the training stage (see the sketch after this list).
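A possible sketch of that estimation follows. The initial value of t and the momentum constant are assumptions chosen to match common practice (a momentum close to 1), not values taken verbatim from the paper.

```python
import torch

class AdaptiveT:
    """Hedged sketch: exponential moving average of the batch-mean positive cosine."""

    def __init__(self, momentum: float = 0.99):
        self.momentum = momentum  # assumed value; a momentum near 1 keeps t smooth
        self.t = 0.0              # starting near 0 de-emphasizes hard samples early on

    @torch.no_grad()
    def update(self, cos_pos: torch.Tensor) -> float:
        # cos_pos: target-class cosine similarities for the current mini-batch, shape (B,)
        r = cos_pos.mean().item()
        self.t = self.momentum * self.t + (1.0 - self.momentum) * r
        return self.t
```

Because the batch-mean positive cosine tracks how confident the model already is on its targets, t grows as training matures, which is exactly the signal used to shift emphasis from easy to hard samples.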
Results and Evaluation
The paper reports extensive experiments on popular benchmarks such as LFW, CFP-FP, CPLFW, and MegaFace. CurricularFace consistently outperforms state-of-the-art methods like ArcFace and MV-Arc-Softmax. Notably, it achieves superior results on pose- and age-variation datasets and demonstrates more robust convergence, particularly with smaller models like MobileFaceNet, where ArcFace might struggle.
Implications and Future Directions
The introduction of adaptive curriculum learning into deep face recognition represents a strategic evolution, potentially influencing the design of future loss functions in AI. This work could spur further exploration into adaptive systems where training dynamics are tailored not only by the current model state but also by historical performance, expanding beyond face recognition to other domains demanding high discriminability.
Future investigations could refine the modulation function N(⋅) and explore adaptive handling of noisy samples, which may currently skew the difficulty assessment. Additionally, integrating similar adaptive strategies into different AI models and tasks may unveil further possibilities for improvement in model robustness and accuracy.