Understanding training and generalization in deep learning by Fourier analysis

(1808.04295)
Published Aug 13, 2018 in cs.LG, cs.AI, math.OC, math.ST, stat.ML, and stat.TH

Abstract

Background: It remains an open research question to theoretically understand why Deep Neural Networks (DNNs) equipped with many more parameters than training data and trained by (stochastic) gradient-based methods often achieve remarkably low generalization error. Contribution: We study DNN training by Fourier analysis. Our theoretical framework explains: i) training a DNN with (stochastic) gradient-based methods often gives higher priority to the low-frequency components of the target function; ii) small initialization leads to good generalization ability of the DNN while preserving its ability to fit any function. These results are further confirmed by experiments on DNNs fitting natural images, one-dimensional functions, and the MNIST dataset.
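The first claim, that low-frequency components of the target are fitted earlier during gradient-based training, can be checked empirically on a one-dimensional function in the spirit of the paper's 1-D experiments. The sketch below is not the authors' code: the target function, network architecture, initialization scale, optimizer, and learning rate are all illustrative assumptions. It trains a small fully connected network with deliberately shrunken initial weights and prints the magnitude of the Fourier residual at one low and one high frequency; the low-frequency residual is expected to shrink first.

```python
# Minimal sketch (assumptions, not the authors' setup): observe that a DNN
# trained by gradient descent fits the low-frequency part of a 1-D target
# before the high-frequency part.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# 1-D target with one low-frequency and one high-frequency component (assumed example).
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(2 * np.pi * 1 * x) + 0.5 * torch.sin(2 * np.pi * 8 * x)

# Small fully connected network; weights shrunk after default init to mimic
# the paper's "small initialization" setting (the 0.1 factor is an assumption).
net = nn.Sequential(
    nn.Linear(1, 200), nn.Tanh(),
    nn.Linear(200, 200), nn.Tanh(),
    nn.Linear(200, 1),
)
with torch.no_grad():
    for p in net.parameters():
        p.mul_(0.1)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def residual_spectrum(err):
    """Magnitude of the discrete Fourier transform of the fitting residual."""
    return np.abs(np.fft.rfft(err.detach().numpy().ravel()))

for step in range(5001):
    opt.zero_grad()
    pred = net(x)
    loss = loss_fn(pred, y)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        spec = residual_spectrum(pred - y)
        # On this grid (256 points over a window of length 2), FFT bins 2 and 16
        # correspond to the target's 1-cycle and 8-cycle components.
        print(f"step {step:5d}  loss {loss.item():.4f}  "
              f"residual @ low freq {spec[2]:.2f}  @ high freq {spec[16]:.2f}")
```

Under these assumptions, the printed low-frequency residual typically decays within the first few hundred steps while the high-frequency residual persists much longer, which is the qualitative behavior the abstract describes.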
