Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation (2305.18460v3)

Published 29 May 2023 in cs.LG, cs.NA, and math.NA

Abstract: The study of universal approximation properties (UAP) for neural networks (NN) has a long history. When the network width is unlimited, a single hidden layer is sufficient for UAP. In contrast, when the depth is unlimited, the width needed for UAP must be at least the critical width $w^*_{\min}=\max(d_x,d_y)$, where $d_x$ and $d_y$ are the dimensions of the input and output, respectively. Recently, \cite{cai2022achieve} showed that a leaky-ReLU NN with this critical width can achieve UAP for $L^p$ functions on a compact domain $K$, \emph{i.e.,} the UAP for $L^p(K,\mathbb{R}^{d_y})$. This paper examines the uniform UAP for the function class $C(K,\mathbb{R}^{d_y})$ and gives the exact minimum width of the leaky-ReLU NN as $w_{\min}=\max(d_x,d_y)+\Delta(d_x,d_y)$, where $\Delta(d_x,d_y)$ is the number of additional dimensions needed to approximate continuous functions by diffeomorphisms via embedding. To obtain this result, we propose a novel lift-flow-discretization approach, which shows that the uniform UAP has a deep connection with topological theory.
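
The formula above fixes the architecture in question: every hidden layer has the same width, and depth is the only resource that grows. Below is a minimal NumPy sketch of such a deep, narrow leaky-ReLU network at width $\max(d_x,d_y)+\Delta(d_x,d_y)$. It is illustrative only: `delta = 1` is a hypothetical placeholder for $\Delta(d_x,d_y)$ (the paper characterizes its exact value), and the random weights do not implement the paper's lift-flow-discretization construction.

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # Leaky-ReLU: identity for x >= 0, slope alpha for x < 0.
    return np.where(x >= 0, x, alpha * x)

def narrow_mlp(x, weights, biases, alpha=0.1):
    # Evaluate a deep, narrow MLP: leaky-ReLU on every hidden layer,
    # affine read-out on the last layer.
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = leaky_relu(W @ h + b, alpha)
    return weights[-1] @ h + biases[-1]

# Hypothetical sizes for illustration: d_x = 2, d_y = 3, and delta = 1
# standing in for Delta(d_x, d_y).
d_x, d_y, delta = 2, 3, 1
width = max(d_x, d_y) + delta           # w_min = max(d_x, d_y) + Delta(d_x, d_y)
depth = 5                               # number of hidden layers (arbitrary here)
dims = [d_x] + [width] * depth + [d_y]  # layer widths, input to output

rng = np.random.default_rng(0)
weights = [rng.standard_normal((dims[i + 1], dims[i])) for i in range(len(dims) - 1)]
biases = [rng.standard_normal(dims[i + 1]) for i in range(len(dims) - 1)]

y = narrow_mlp(np.ones(d_x), weights, biases)
print(y.shape)  # (3,)
```

In this regime only `depth` grows as more accuracy is demanded; the hidden width stays pinned at the critical value, which is exactly where the topological considerations in the paper come into play.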

References (34)
  1. Barron, A. R. Approximation and estimation bounds for artificial neural networks. Machine Learning, 14(1):115–133, 1994.
  2. Beise, H.-P. and Da Cruz, S. D. Expressiveness of Neural Networks Having Width Equal or Below the Input Dimension. arXiv preprint arXiv:2011.04923, 2020.
  3. Beise, H.-P., Da Cruz, S. D., and Schröder, U. On decision regions of narrow deep neural networks. Neural Networks, 140:121–129, 2021.
  4. Brenier, Y. and Gangbo, W. $L^p$ approximation of maps by diffeomorphisms. Calculus of Variations and Partial Differential Equations, 16(2):147–164, 2003.
  5. Cai, Y. Achieve the minimum width of neural networks for universal approximation. arXiv preprint arXiv:2209.11395, 2022.
  6. Caponigro, M. Orientation preserving diffeomorphisms and flows of control-affine systems. IFAC Proceedings Volumes, 44(1):8016–8021, 2011.
  7. Chong, K. F. E. A closer look at the approximation capabilities of neural networks. In International Conference on Learning Representations, 2020.
  8. Cybenko, G. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2(4):303–314, 1989.
  9. Daniely, A. Depth separation for neural networks. In Conference on Learning Theory, 2017.
  10. Vanilla feedforward neural networks as a discretization of dynamic systems. arXiv preprint arXiv:2209.10909, 2022.
  11. Hanin, B. and Sellke, M. Approximating Continuous Functions by ReLU Nets of Minimal Width. arXiv preprint arXiv:1710.11278, 2018.
  12. Hirsch, M. W. Differential Topology. Springer New York, 1976.
  13. Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251–257, 1991.
  14. Hornik, K., Stinchcombe, M., and White, H. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359–366, 1989.
  15. Huang, C.-W., Krueger, D., Lacoste, A., and Courville, A. Neural Autoregressive Flows. In International Conference on Machine Learning, 2018.
  16. Hwang, G. Minimum width for deep, narrow MLP: a diffeomorphism and the Whitney embedding theorem approach. arXiv preprint arXiv:2308.15873, 2023.
  17. Johnson, J. Deep, Skinny Neural Networks are not Universal Approximators. In International Conference on Learning Representations, 2019.
  18. Kim, N., Min, C., and Park, S. Minimum width for universal approximation using ReLU networks on compact domain. arXiv preprint arXiv:2309.10402, 2023.
  19. Kong, Z. and Chaudhuri, K. Universal Approximation of Residual Flows in Maximum Mean Discrepancy. In International Conference on Machine Learning Workshop, 2021.
  20. Le Roux, N. and Bengio, Y. Representational Power of Restricted Boltzmann Machines and Deep Belief Networks. Neural Computation, 20(6):1631–1649, 2008.
  21. Leshno, M., Lin, V. Y., Pinkus, A., and Schocken, S. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks, 6(6):861–867, 1993.
  22. Lu, Z., Pu, H., Wang, F., Hu, Z., and Wang, L. The Expressive Power of Neural Networks: A View from the Width. In Neural Information Processing Systems, 2017.
  23. Montufar, G. F. Universal approximation depth and errors of narrow belief networks with discrete units. Neural Computation, 26(7):1386–1407, 2014.
  24. Nguyen, Q., Mukkamala, M. C., and Hein, M. Neural Networks Should Be Wide Enough to Learn Disconnected Decision Regions. In International Conference on Machine Learning, 2018.
  25. Park, S., Yun, C., Lee, J., and Shin, J. Minimum Width for Universal Approximation. In International Conference on Learning Representations, 2021.
  26. Residual networks as geodesic flows of diffeomorphisms. arXiv preprint arXiv:1805.09585, 2018.
  27. Ruiz-Balet, D. and Zuazua, E. Neural ODE control for classification, approximation and transport. arXiv preprint arXiv:2104.05278, 2021.
  28. Sutskever, I. and Hinton, G. E. Deep, Narrow Sigmoid Belief Networks Are Universal Approximators. Neural Computation, 20(11):2629–2636, 2008.
  29. Tabuada, P. and Gharesifard, B. Universal approximation power of deep residual neural networks through the lens of control. IEEE Transactions on Automatic Control, 68, 2023.
  30. Telgarsky, M. Benefits of depth in neural networks. In Conference on Learning Theory, 2016.
  31. Teshima, T., Ishikawa, I., Tojo, K., Oono, K., Ikeda, M., and Sugiyama, M. Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators. In Neural Information Processing Systems, 2020a.
  32. Teshima, T., Tojo, K., Ikeda, M., Ishikawa, I., and Oono, K. Universal Approximation Property of Neural Ordinary Differential Equations. In Neural Information Processing Systems 2020 Workshop on Differential Geometry meets Deep Learning, 2020b.
  33. Whitney, H. The Self-Intersections of a Smooth n-Manifold in 2n-Space. Annals of Mathematics, 45(2):220–246, 1944.
  34. Zhang, H., Gao, X., Unterman, J., and Arodz, T. Approximation Capabilities of Neural ODEs and Invertible Residual Networks. In International Conference on Machine Learning, 2019.