A Mixed-Integer Programming Approach to Training Dense Neural Networks (2201.00723v2)
Abstract: Artificial Neural Networks (ANNs) are prevalent machine learning models that are applied across various real-world classification tasks. However, training ANNs is time-consuming and the resulting models take a lot of memory to deploy. In order to train more parsimonious ANNs, we propose a novel mixed-integer programming (MIP) formulation for training fully-connected ANNs. Our formulations can account for both binary and rectified linear unit (ReLU) activations, and for the use of a log-likelihood loss. We present numerical experiments comparing our MIP-based methods against existing approaches and show that we are able to achieve competitive out-of-sample performance with more parsimonious models.