When Model Meets New Normals: Test-time Adaptation for Unsupervised Time-series Anomaly Detection (2312.11976v2)

Published 19 Dec 2023 in cs.LG and cs.AI

Abstract: Time-series anomaly detection deals with the problem of detecting anomalous timesteps by learning normality from the sequence of observations. However, the concept of normality evolves over time, leading to a "new normal problem", where the distribution of normality can be changed due to the distribution shifts between training and test data. This paper highlights the prevalence of the new normal problem in unsupervised time-series anomaly detection studies. To tackle this issue, we propose a simple yet effective test-time adaptation strategy based on trend estimation and a self-supervised approach to learning new normalities during inference. Extensive experiments on real-world benchmarks demonstrate that incorporating the proposed strategy into the anomaly detector consistently improves the model's performance compared to the baselines, leading to robustness to the distribution shifts.


Summary

  • The paper introduces a test-time adaptation technique that uses exponential moving average trend estimation and self-supervised learning to adjust to evolving normal patterns.
  • The approach significantly boosts performance, improving metrics like F1 score, AUROC, and AUPRC across various datasets and outperforming state-of-the-art detectors.
  • The method offers scalable robustness by eliminating the need for retraining or extra labeled data to counter distribution shifts in time-series data.

Introduction

Monitoring systems that rely on time-series data, from industrial control to cybersecurity, depend on the ability to pinpoint anomalous events. Time-series anomaly detection is the cornerstone of these efforts: the task of identifying incongruous timesteps within sequential data. Among its challenges, the most pressing is that the concept of normality itself evolves over time, a phenomenon the paper calls the "new normal problem." Models trained on historical data struggle when the data distribution changes at test time, a serious concern that leads to false alarms and reduced reliability of the detection system.

New Normal Problem

Addressing the new normal problem requires accounting for the shift in data distribution between the training and testing phases. Existing unsupervised anomaly detectors, such as MLP-based autoencoders and LSTM architectures, have shown clear limitations under distribution shifts: they adhere rigidly to the normality learned from historical data and fail to recognize contemporary normal patterns, which considerably diminishes their efficacy when the distribution moves.
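A minimal NumPy illustration (my own sketch, not the paper's code) makes the failure mode concrete: a detector whose notion of "normal" is frozen at training time flags nearly every point once the normal level drifts, drowning the one genuine anomaly in false alarms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: normal signal fluctuating around 0.
train = rng.normal(0.0, 1.0, 500)

# A "model" frozen at training time: it predicts the training mean,
# and flags points whose deviation exceeds a fixed threshold.
train_mean = train.mean()
threshold = 3.0 * train.std()

# Test data: the normal level has shifted to 4 (a "new normal"),
# with one genuine anomaly spike injected at index 100.
test = rng.normal(4.0, 1.0, 200)
test[100] += 10.0

scores = np.abs(test - train_mean)   # reconstruction-style error
flags = scores > threshold

# Because the frozen model never adapts, most shifted-but-normal
# points are flagged as anomalous.
false_alarm_rate = flags[np.arange(len(test)) != 100].mean()
print(f"false alarm rate under shift: {false_alarm_rate:.2f}")
```

The single true anomaly is still caught, but the false-alarm rate is so high that the detector is effectively useless after the shift.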

Model Adaptation Strategies

To combat these challenges, the paper introduces a novel test-time adaptation strategy with two key components: first, trend estimation, which uses an exponential moving average to track the current data level; and second, a self-supervised learning process during inference that updates the model to recognize new normal patterns. This self-updating mechanism keeps the anomaly detector attuned to the current data distribution without explicit retraining or additional labeled data.
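The two components can be sketched with a hypothetical streaming detector (function name, hyperparameters, and update rule are illustrative assumptions, not the paper's implementation): an exponential moving average tracks the current trend, and the running statistics are updated at test time only from observations the detector itself judges normal, so anomalies do not corrupt them.

```python
import numpy as np

def detect_with_trend_adaptation(stream, alpha=0.1, threshold=3.0):
    """Score each point against an exponentially-weighted running trend.

    alpha controls how fast the trend follows the data; threshold is in
    units of a running deviation estimate. Both are illustrative choices.
    """
    trend = stream[0]   # EMA estimate of the current "normal" level
    spread = 1.0        # EMA estimate of typical deviation from the trend
    flags = []
    for x in stream:
        score = abs(x - trend) / max(spread, 1e-8)
        flags.append(score > threshold)
        # Self-update at test time: fold the observation into the running
        # trend and spread, but only when it looks normal, so anomalies
        # do not contaminate the statistics.
        if score <= threshold:
            trend = (1 - alpha) * trend + alpha * x
            spread = (1 - alpha) * spread + alpha * abs(x - trend)
    return np.array(flags)

# Usage: a slowly drifting "normal" level with one injected anomaly.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 400)
stream = 5.0 * t + rng.normal(0.0, 0.2, 400)
stream[300] += 5.0
flags = detect_with_trend_adaptation(stream)
```

Unlike the frozen detector, the adapting one follows the drift, so the injected spike stands out while the shifted-but-normal points stay under the threshold.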

Experimental Validation

The methodology was subjected to rigorous testing across various datasets that encapsulate real-world distribution shifts. The experimental results present a compelling case for the proposed approach. It enhances the performance of baseline models, delivers robustness against distribution shifts, and demonstrates discernible improvements in key performance metrics such as F1 score, AUROC, and AUPRC. Notably, the approach shines in scenarios with marked distribution shifts, exhibiting substantial gains over state-of-the-art anomaly detectors.
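The three metrics reported can all be computed from anomaly scores and binary labels. The NumPy-only helpers below are a self-contained sketch (function names are my own; AUPRC is estimated here as average precision, a common approximation):

```python
import numpy as np

def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def auprc(scores, labels):
    """AUPRC estimated as average precision over score-sorted points."""
    order = np.argsort(-scores)
    labels = labels[order]
    cum_tp = np.cumsum(labels)
    precision = cum_tp / np.arange(1, len(labels) + 1)
    return precision[labels == 1].mean()

def f1_at_threshold(scores, labels, thr):
    """F1 of the binary decision 'score > thr'."""
    pred = scores > thr
    tp = np.sum(pred & (labels == 1))
    prec = tp / max(pred.sum(), 1)
    rec = tp / max((labels == 1).sum(), 1)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

# Example: scores that perfectly separate anomalies (label 1) from normals.
scores = np.array([0.1, 0.2, 0.8, 0.9])
labels = np.array([0, 0, 1, 1])
print(auroc(scores, labels), auprc(scores, labels),
      f1_at_threshold(scores, labels, 0.5))
```

F1 depends on a chosen threshold, while AUROC and AUPRC summarize ranking quality across all thresholds, which is why papers in this area typically report all three.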

Final Remarks

In summary, this work takes on the non-trivial new normal problem in unsupervised time-series anomaly detection, offering an elegant solution that combines trend estimation with a self-adaptive learning method. The approach's scalability and effectiveness, confirmed by extensive empirical evidence, position it as a compelling advance in the adaptability of anomaly detectors, helping them remain reliable as the distribution of time-series data evolves.
