Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 134 tok/s
Gemini 2.5 Pro 41 tok/s Pro
GPT-5 Medium 27 tok/s Pro
GPT-5 High 26 tok/s Pro
GPT-4o 77 tok/s Pro
Kimi K2 200 tok/s Pro
GPT OSS 120B 427 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

A closer look at the approximation capabilities of neural networks (2002.06505v1)

Published 16 Feb 2020 in cs.LG, math.FA, and stat.ML

Abstract: The universal approximation theorem, in one of its most general versions, says that if we consider only continuous activation functions $\sigma$, then a standard feedforward neural network with one hidden layer is able to approximate any continuous multivariate function $f$ to any given approximation threshold $\varepsilon$, if and only if $\sigma$ is non-polynomial. In this paper, we give a direct algebraic proof of the theorem. Furthermore we shall explicitly quantify the number of hidden units required for approximation. Specifically, if $X\subseteq \mathbb{R}n$ is compact, then a neural network with $n$ input units, $m$ output units, and a single hidden layer with $\binom{n+d}{d}$ hidden units (independent of $m$ and $\varepsilon$), can uniformly approximate any polynomial function $f:X \to \mathbb{R}m$ whose total degree is at most $d$ for each of its $m$ coordinate functions. In the general case that $f$ is any continuous function, we show there exists some $N\in \mathcal{O}(\varepsilon{-n})$ (independent of $m$), such that $N$ hidden units would suffice to approximate $f$. We also show that this uniform approximation property (UAP) still holds even under seemingly strong conditions imposed on the weights. We highlight several consequences: (i) For any $\delta > 0$, the UAP still holds if we restrict all non-bias weights $w$ in the last layer to satisfy $|w| < \delta$. (ii) There exists some $\lambda>0$ (depending only on $f$ and $\sigma$), such that the UAP still holds if we restrict all non-bias weights $w$ in the first layer to satisfy $|w|>\lambda$. (iii) If the non-bias weights in the first layer are \emph{fixed} and randomly chosen from a suitable range, then the UAP holds with probability $1$.

Citations (15)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.