
Overcoming Domain Drift in Online Continual Learning (2405.09133v1)

Published 15 May 2024 in cs.LG

Abstract: Online Continual Learning (OCL) empowers machine learning models to acquire new knowledge online across a sequence of tasks. However, OCL faces a significant challenge: catastrophic forgetting, wherein the knowledge learned in previous tasks is substantially overwritten upon encountering new tasks, leading to a biased forgetting of prior knowledge. Moreover, continual domain drift in sequential learning tasks may gradually displace the decision boundaries in the learned feature space, rendering the learned knowledge susceptible to forgetting. To address these problems, in this paper we propose a novel rehearsal strategy, termed Drift-Reducing Rehearsal (DRR), to anchor the domains of old tasks and reduce negative transfer effects. First, we propose to select more representative samples for the rehearsal memory, guided by centroids constructed from the data stream. Then, to keep the model from domain chaos during drift, a two-level angular cross-task Contrastive Margin Loss (CML) is proposed to encourage intra-class and intra-task compactness and to increase inter-class and inter-task discrepancy. Finally, to further suppress the continual domain drift, we present an optional Centroid Distillation Loss (CDL) on the rehearsal memory to anchor the knowledge in feature space for each previous task. Extensive experimental results on four benchmark datasets validate that the proposed DRR can effectively mitigate continual domain drift and achieve state-of-the-art (SOTA) performance in OCL.
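The centroid-guided memory selection described in the abstract can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the paper's actual DRR procedure: it operates on a single batch of features rather than a data stream, uses class means as the constructed centroids, and ranks samples by Euclidean distance; the function name and parameters are hypothetical.

```python
import numpy as np

def select_memory_by_centroids(features, labels, per_class=5):
    """Pick the samples nearest to each class centroid as rehearsal memory.

    Assumption-laden sketch: centroids are plain class means over one batch,
    and "representative" is approximated by Euclidean distance to the mean.
    """
    memory_indices = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        # distance of each sample of class c to its centroid
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        # keep the per_class samples closest to the centroid
        keep = idx[np.argsort(dists)[:per_class]]
        memory_indices.extend(keep.tolist())
    return memory_indices
```

In a streaming setting the centroids would instead be updated incrementally as samples arrive, which is the scenario the paper targets.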

Authors (8)
  1. Fan Lyu (34 papers)
  2. Daofeng Liu (2 papers)
  3. Linglan Zhao (6 papers)
  4. Zhang Zhang (77 papers)
  5. Fanhua Shang (47 papers)
  6. Fuyuan Hu (20 papers)
  7. Wei Feng (208 papers)
  8. Liang Wang (512 papers)
