Between-Sample Relationship in Learning Tabular Data Using Graph and Attention Networks (2306.06772v1)
Abstract: Traditional machine learning assumes samples in tabular data to be independent and identically distributed (i.i.d.). This assumption may overlook useful information in the relationships within and between samples during representation learning. This paper relaxes the i.i.d. assumption to learn tabular data representations by incorporating between-sample relationships, for the first time, using graph neural networks (GNNs). We test this hypothesis by comparing several GNNs and state-of-the-art (SOTA) deep attention models, which learn between-sample relationships, against traditional machine learning methods on ten tabular data sets. GNN methods perform best on tabular data with large feature-to-sample ratios. Our results show that attention-based GNN methods outperform traditional machine learning on five data sets and SOTA deep tabular learning methods on three data sets. Between-sample learning via GNN and deep attention methods yields the best classification accuracy on seven of the ten data sets, suggesting that the i.i.d. assumption may not hold for many tabular data sets.
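The abstract does not specify how the sample graph is constructed, so the sketch below is only a minimal illustration of between-sample learning under one common assumption: rows of the table become graph nodes, edges come from a hypothetical k-nearest-neighbor graph over standardized features, and a two-layer GCN (via PyTorch Geometric) classifies each row by aggregating information from neighboring samples.

```python
# Minimal sketch of between-sample learning on tabular data.
# Assumption: a k-NN similarity graph links related rows; this is an
# illustrative choice, not necessarily the paper's graph construction.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import kneighbors_graph
from sklearn.preprocessing import StandardScaler
from torch_geometric.nn import GCNConv

# Load a small tabular data set and standardize its features.
raw = load_breast_cancer()
X = StandardScaler().fit_transform(raw.data)
x = torch.tensor(X, dtype=torch.float)
y = torch.tensor(raw.target)

# Relax the i.i.d. assumption: connect each sample (row) to its k
# nearest neighbors so the model can share information between rows.
adj = kneighbors_graph(X, n_neighbors=5, mode="connectivity")
edge_index = torch.tensor(np.vstack(adj.nonzero()), dtype=torch.long)

class RowGCN(torch.nn.Module):
    """Two-layer GCN that predicts a class label for every table row."""
    def __init__(self, in_dim: int, hidden: int, n_classes: int):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, n_classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)

model = RowGCN(x.size(1), hidden=64, n_classes=2)
logits = model(x, edge_index)          # one prediction per table row
loss = F.cross_entropy(logits, y)      # standard node-classification loss
```

Swapping `GCNConv` for `GATConv` would give an attention-based variant in the spirit of the graph attention networks the paper evaluates.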
- Shourav B. Rabbani
- Manar D. Samad