
Automated Model Selection for Tabular Data (2401.00961v2)

Published 1 Jan 2024 in cs.LG and cs.AI

Abstract: Structured data in the form of tabular datasets contain features that are distinct and discrete, with varying individual and relative importances to the target. Combinations of one or more features may be more predictive and meaningful than simple individual feature contributions. R's mixed effect linear models library allows users to provide such interactive feature combinations in the model design. However, given many features and possible interactions to select from, model selection becomes an exponentially difficult task. We aim to automate the model selection process for predictions on tabular datasets incorporating feature interactions while keeping computational costs small. The framework includes two distinct approaches for feature selection: a Priority-based Random Grid Search and a Greedy Search method. The Priority-based approach efficiently explores feature combinations using prior probabilities to guide the search. The Greedy method builds the solution iteratively by adding or removing features based on their impact. Experiments on synthetic data demonstrate the framework's ability to effectively capture predictive feature combinations.
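To make the Greedy Search idea concrete, here is a minimal sketch of forward selection over single features and pairwise interactions. It is not the paper's implementation: the abstract targets R's mixed effect linear models, while this sketch swaps in ordinary least squares scored by BIC, and it only adds terms (the abstract's method may also remove them). The function names `design_matrix`, `bic`, and `greedy_forward_search`, the `max_order` parameter, and the synthetic data are all assumptions made for illustration.

```python
import itertools
import numpy as np


def design_matrix(X, terms):
    """Intercept column plus one column per term.

    Each term is a tuple of column indices; a tuple with several indices
    is the product (interaction) of those columns.
    """
    cols = [np.ones(X.shape[0])]
    for term in terms:
        cols.append(np.prod(X[:, list(term)], axis=1))
    return np.column_stack(cols)


def bic(X, y, terms):
    """BIC of an ordinary least-squares fit (assumed scoring rule)."""
    A = design_matrix(X, terms)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    n, k = A.shape
    # Small constant guards against log(0) on a perfect fit.
    return n * np.log(rss / n + 1e-12) + k * np.log(n)


def greedy_forward_search(X, y, max_order=2):
    """Repeatedly add the term that lowers BIC most; stop when none helps."""
    n_features = X.shape[1]
    candidates = [
        combo
        for order in range(1, max_order + 1)
        for combo in itertools.combinations(range(n_features), order)
    ]
    selected = []
    best = bic(X, y, selected)
    while True:
        scored = [
            (bic(X, y, selected + [t]), t)
            for t in candidates
            if t not in selected
        ]
        if not scored:
            break
        score, term = min(scored)
        if score >= best:
            break
        selected.append(term)
        best = score
    return selected, best


# Synthetic data (hypothetical): one main effect and one interaction.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=500)

terms, score = greedy_forward_search(X, y)
print(sorted(terms))
```

Each step refits one model per remaining candidate, so the cost grows with the number of candidate terms rather than with the number of possible term subsets, which is the trade-off a greedy search makes against exhaustive model selection.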
