Emergent Mind

Geometric compression of invariant manifolds in neural nets

(2007.11471)
Published Jul 22, 2020 in cs.LG and stat.ML

Abstract

We study how neural networks compress uninformative input space in models where data lie in $d$ dimensions, but whose labels vary only within a linear manifold of dimension $d_\parallel < d$. We show that for a one-hidden-layer network initialized with infinitesimal weights (i.e. in the feature learning regime) and trained with gradient descent, the first layer of weights evolves to become nearly insensitive to the $d_\perp = d - d_\parallel$ uninformative directions. These directions are effectively compressed by a factor $\lambda \sim \sqrt{p}$, where $p$ is the size of the training set. We quantify the benefit of such a compression on the test error $\epsilon$. For large initialization of the weights (the lazy training regime), no compression occurs, and for regular boundaries separating labels we find that $\epsilon \sim p^{-\beta}$, with $\beta_\text{Lazy} = d / (3d-2)$. Compression improves the learning curves so that $\beta_\text{Feature} = (2d-1)/(3d-2)$ if $d_\parallel = 1$ and $\beta_\text{Feature} = (d + d_\perp/2)/(3d-2)$ if $d_\parallel > 1$. We test these predictions for a stripe model where boundaries are parallel interfaces ($d_\parallel = 1$) as well as for a cylindrical boundary ($d_\parallel = 2$). Next we show that compression shapes the evolution of the Neural Tangent Kernel (NTK) in time, so that its top eigenvectors become more informative and display a larger projection on the labels. Consequently, kernel learning with the frozen NTK at the end of training outperforms the initial NTK. We confirm these predictions both for a one-hidden-layer FC network trained on the stripe model and for a 16-layer CNN trained on MNIST, for which we also find $\beta_\text{Feature} > \beta_\text{Lazy}$.
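
To make the predicted learning-curve exponents concrete, the formulas stated in the abstract can be evaluated for specific dimensions. The following is a minimal sketch, not code from the paper: the function names `beta_lazy` and `beta_feature` are hypothetical, and the use of exact fractions is a presentational choice. It only encodes the exponents in $\epsilon \sim p^{-\beta}$ exactly as quoted above.

```python
from fractions import Fraction


def beta_lazy(d):
    """Lazy-regime exponent quoted in the abstract: d / (3d - 2)."""
    return Fraction(d, 3 * d - 2)


def beta_feature(d, d_par):
    """Feature-regime exponent quoted in the abstract.

    d     : input dimension
    d_par : dimension of the informative linear manifold (1 <= d_par < d)
    """
    if not 1 <= d_par < d:
        raise ValueError("expected 1 <= d_par < d")
    if d_par == 1:
        # Stripe-like case: (2d - 1) / (3d - 2)
        return Fraction(2 * d - 1, 3 * d - 2)
    # General case: (d + d_perp / 2) / (3d - 2), with d_perp = d - d_par
    d_perp = d - d_par
    return (d + Fraction(d_perp, 2)) / (3 * d - 2)


if __name__ == "__main__":
    # Illustrative dimensions only; these values are not taken from the paper.
    for d, d_par in [(5, 1), (5, 2), (10, 2)]:
        print(d, d_par, beta_lazy(d), beta_feature(d, d_par))
```

For the dimensions tried here, the feature-regime exponent is larger than the lazy one, which is the direction of improvement the abstract describes. The abstract gives the compression factor $\lambda \sim \sqrt{p}$ only as a scaling, so no prefactor is assumed in this sketch.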
