Emergent Mind

Learning Sparse Additive Models with Interactions in High Dimensions

(1604.05307)
Published Apr 18, 2016 in cs.LG, cs.IT, math.IT, and stat.ML

Abstract

A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}\phi_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $\phi_l$'s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM, allowing for second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, the function $f$ is assumed to be of the form: $$f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}\phi_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}\phi_{(l,l^{\prime})} (x_{l},x_{l^{\prime}}).$$ Assuming $\phi_{p},\phi_{(l,l^{\prime})}$, $\mathcal{S}_1$ and, $\mathcal{S}_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $\mathcal{S}_1,\mathcal{S}_2$. Consequently, this also enables us to estimate the underlying $\phi_p, \phi_{(l,l^{\prime})}$. We derive sample complexity bounds for our scheme and also extend our analysis to include the situation where the queries are corrupted with noise -- either stochastic, or arbitrary but bounded. Lastly, we provide simulation results on synthetic data, that validate our theoretical findings.
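
To make the model form concrete, the following is a minimal sketch of a generalized SPAM with second order interaction terms. It only illustrates the structure of $f$ and is not the recovery algorithm from the paper. The dimension `d`, the supports `S1` and `S2`, and all component functions are hypothetical choices made for illustration, and coordinates are indexed from 0 rather than 1.

```python
import math

# Hypothetical problem size and supports (illustrative, not from the paper).
d = 10

# S1 maps each active coordinate p to its univariate component phi_p.
S1 = {0: math.sin, 3: lambda t: t ** 2}

# S2 maps each active pair (l, l') to its bivariate component phi_(l,l').
S2 = {(1, 5): lambda s, t: s * t}


def f(x):
    """Evaluate the generalized SPAM at a point x of length d."""
    univariate = sum(phi(x[p]) for p, phi in S1.items())
    bivariate = sum(phi(x[l], x[lp]) for (l, lp), phi in S2.items())
    return univariate + bivariate


x = [0.5] * d

# Perturb coordinate 7, which appears in neither S1 nor S2.
y = list(x)
y[7] = 123.0
```

Because coordinate 7 belongs to neither support, `f(x)` and `f(y)` agree even though the two points differ. This is the sparsity the abstract refers to: only the few coordinates in $\mathcal{S}_1$ and the pairs in $\mathcal{S}_2$ influence the value of $f$.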
