Shallow ReLU neural networks and finite elements (2403.05809v1)
Abstract: We point out that (continuous or discontinuous) piecewise linear functions on a convex polytope mesh can be represented by two-hidden-layer ReLU neural networks in a weak sense. In addition, the numbers of neurons in the two hidden layers required for this weak representation are given exactly in terms of the numbers of polytopes and hyperplanes involved in the mesh. The results naturally hold for constant and linear finite element functions. Such weak representation establishes a bridge between shallow ReLU neural networks and finite element functions, and leads to a perspective for analyzing the approximation capability of ReLU neural networks in the $L^p$ norm via finite element functions. Moreover, we discuss the strict representation of tensor finite element functions via the recently proposed tensor neural networks.
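For orientation, a minimal sketch of the architecture discussed in the abstract: a two-hidden-layer ReLU network with hidden widths $N_1$ and $N_2$ (this notation and parametrization are ours, not taken from the paper) is a function of the form

$$ u_\theta(x) \;=\; \sum_{j=1}^{N_2} c_j\, \sigma\!\Big( \sum_{i=1}^{N_1} w_{ji}\, \sigma\big(a_i^\top x + b_i\big) + d_j \Big) + e, \qquad \sigma(t) = \max(t, 0), $$

where $a_i \in \mathbb{R}^d$ and $b_i, w_{ji}, d_j, c_j, e \in \mathbb{R}$ are the network parameters. According to the abstract, values of $N_1$ and $N_2$ sufficient for the weak representation of a piecewise linear function can be read off from the numbers of hyperplanes and polytopes of the mesh; the exact counts and the precise sense of "weak representation" are given in the paper itself.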