
Beyond Individual Input for Deep Anomaly Detection on Tabular Data (2305.15121v6)

Published 24 May 2023 in cs.LG

Abstract: Anomaly detection is vital in many domains, such as finance, healthcare, and cybersecurity. In this paper, we propose a novel deep anomaly detection method for tabular data that leverages Non-Parametric Transformers (NPTs), a model initially proposed for supervised tasks, to capture both feature-feature and sample-sample dependencies. In a reconstruction-based framework, we train an NPT to reconstruct masked features of normal samples. In a non-parametric fashion, we leverage the whole training set during inference and use the model's ability to reconstruct the masked features to generate an anomaly score. To the best of our knowledge, this is the first work to successfully combine feature-feature and sample-sample dependencies for anomaly detection on tabular datasets. Through extensive experiments on 31 benchmark tabular datasets, we demonstrate that our method achieves state-of-the-art performance, outperforming existing methods by 2.4% and 1.2% in terms of F1-score and AUROC, respectively. Our ablation study further proves that modeling both types of dependencies is crucial for anomaly detection on tabular data.
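To make the scoring step described in the abstract concrete, below is a minimal PyTorch sketch of how a masked-feature reconstruction score could be computed with an NPT-style model: each test feature is masked in turn, the training set is concatenated with the test batch so the model can attend across samples (the non-parametric inference step), and the per-feature reconstruction error is averaged into an anomaly score. The model interface, `mask_value`, and the one-feature-at-a-time loop are illustrative assumptions, not the authors' exact implementation.

```python
import torch

@torch.no_grad()
def anomaly_scores(model, train_x, test_x, mask_value=0.0):
    """Score test rows by masked-feature reconstruction error.

    `model` is assumed to be a trained NPT-style network mapping a
    (rows x features) tensor to reconstructed features, attending over
    both axes; concatenating the training set realizes the
    non-parametric inference step. All names here are illustrative.
    """
    n_test, n_feat = test_x.shape
    scores = torch.zeros(n_test)
    for j in range(n_feat):
        masked = test_x.clone()
        masked[:, j] = mask_value                    # hide feature j
        batch = torch.cat([train_x, masked], dim=0)  # sample-sample context
        recon = model(batch)[-n_test:]               # test-row reconstructions
        scores += (recon[:, j] - test_x[:, j]).pow(2)
    return scores / n_feat  # higher score = more anomalous
```

Under this reading, a sample is flagged as anomalous when its masked features cannot be reconstructed well from the remaining features and from similar training samples, which is why modeling both feature-feature and sample-sample dependencies matters.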
