Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Understanding training and generalization in deep learning by Fourier analysis (1808.04295v4)

Published 13 Aug 2018 in cs.LG, cs.AI, math.OC, math.ST, stat.ML, and stat.TH

Abstract: Background: It is still an open research area to theoretically understand why Deep Neural Networks (DNNs)---equipped with many more parameters than training data and trained by (stochastic) gradient-based methods---often achieve remarkably low generalization error. Contribution: We study DNN training by Fourier analysis. Our theoretical framework explains: i) DNN with (stochastic) gradient-based methods often endows low-frequency components of the target function with a higher priority during the training; ii) Small initialization leads to good generalization ability of DNN while preserving the DNN's ability to fit any function. These results are further confirmed by experiments of DNNs fitting the following datasets, that is, natural images, one-dimensional functions and MNIST dataset.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (1)
  1. Zhiqin John Xu (1 paper)
Citations (88)

Summary

We haven't generated a summary for this paper yet.