
Fine-tuned In-Context Learning Transformers are Excellent Tabular Data Classifiers (2405.13396v2)

Published 22 May 2024 in cs.LG and stat.ML

Abstract: The recently introduced TabPFN pretrains an In-Context Learning (ICL) transformer on synthetic data to perform tabular data classification. In this work, we extend TabPFN to the fine-tuning setting, resulting in a significant performance boost. We also discover that fine-tuning enables ICL-transformers to create complex decision boundaries, a property regular neural networks do not have. Based on this observation, we propose to pretrain ICL-transformers on a new forest dataset generator which creates datasets that are unrealistic, but have complex decision boundaries. TabForest, the ICL-transformer pretrained on this dataset generator, shows better fine-tuning performance when pretrained on more complex datasets. Additionally, TabForest outperforms TabPFN on some real-world datasets when fine-tuning, despite having lower zero-shot performance due to the unrealistic nature of the pretraining datasets. By combining both dataset generators, we create TabForestPFN, an ICL-transformer that achieves excellent fine-tuning performance and good zero-shot performance.
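To make the setting concrete, below is a minimal PyTorch sketch of the fine-tuning setup the abstract describes: an in-context learning (ICL) transformer that attends jointly over a labeled support set and an unlabeled query set, whose pretrained weights are then updated on the downstream table rather than kept frozen for zero-shot prediction. The `ICLTransformer` class, its layer sizes, and the `fine_tune` helper are illustrative assumptions, not the authors' TabPFN/TabForestPFN implementation.

```python
# A minimal sketch (not the authors' released code) of an ICL transformer for
# tabular classification and of fine-tuning it on a downstream table.
# Class names, layer sizes, and the training loop are illustrative assumptions.

import torch
import torch.nn as nn


class ICLTransformer(nn.Module):
    """Hypothetical TabPFN-style model: support rows are embedded together
    with their labels, query rows without labels, and a transformer encoder
    lets the query rows attend to the labeled support rows."""

    def __init__(self, n_features: int, n_classes: int, d_model: int = 128):
        super().__init__()
        self.x_embed = nn.Linear(n_features, d_model)
        self.y_embed = nn.Embedding(n_classes, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x_support, y_support, x_query):
        # x_support: (B, S, F), y_support: (B, S) int64, x_query: (B, Q, F)
        support = self.x_embed(x_support) + self.y_embed(y_support)
        query = self.x_embed(x_query)
        hidden = self.encoder(torch.cat([support, query], dim=1))
        # Classify only the query positions.
        return self.head(hidden[:, x_support.shape[1]:, :])


def fine_tune(model, x_support, y_support, x_query, y_query, steps=100):
    """Zero-shot use is a single forward pass with frozen pretrained weights;
    fine-tuning instead updates those weights on the downstream task."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x_support, y_support, x_query)
        loss = loss_fn(logits.reshape(-1, logits.shape[-1]), y_query.reshape(-1))
        loss.backward()
        optimizer.step()
    return model


if __name__ == "__main__":
    # Toy usage: 64 labeled support rows, 16 query rows, 10 features, 3 classes.
    model = ICLTransformer(n_features=10, n_classes=3)
    xs, ys = torch.randn(1, 64, 10), torch.randint(3, (1, 64))
    xq, yq = torch.randn(1, 16, 10), torch.randint(3, (1, 16))
    fine_tune(model, xs, ys, xq, yq, steps=10)
    print(model(xs, ys, xq).argmax(dim=-1).shape)  # (1, 16) predicted classes
```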
