Deep Network with Approximation Error Being Reciprocal of Width to Power of Square Root of Depth (2006.12231v6)
Abstract: A new network with super approximation power is introduced. This network is built with Floor ($\lfloor x\rfloor$) or ReLU ($\max\{0,x\}$) activation function in each neuron and hence we call such networks Floor-ReLU networks. For any hyper-parameters $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, it is shown that Floor-ReLU networks with width $\max\{d,\,5N+13\}$ and depth $64dL+3$ can uniformly approximate a H\"older function $f$ on $[0,1]^d$ with an approximation error $3\lambda d^{\alpha/2}N^{-\alpha\sqrt{L}}$, where $\alpha\in(0,1]$ and $\lambda$ are the H\"older order and constant, respectively. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the constructive approximation rate is $\omega_f(\sqrt{d}\,N^{-\sqrt{L}})+2\omega_f(\sqrt{d})N^{-\sqrt{L}}$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $\omega_f(r)$ as $r\to 0$ is moderate (e.g., $\omega_f(r)\lesssim r^{\alpha}$ for H\"older continuous functions), since the major term to be considered in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ and $L$ independent of $d$ within the modulus of continuity.
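The theorem is constructive, but its explicit weights are involved; the Python sketch below only illustrates the two quantitative ingredients stated above: the width/depth prescription $\max\{d,\,5N+13\}$ and $64dL+3$, and the H\"older error bound $3\lambda d^{\alpha/2}N^{-\alpha\sqrt{L}}$. The forward pass uses random placeholder weights and an arbitrary floor/ReLU assignment per neuron (helper names such as `floor_relu_layer` are illustrative, not from the paper), so it shows the architecture's shape, not its approximation power.

```python
import math
import numpy as np

def holder_error_bound(d, N, L, alpha=1.0, lam=1.0):
    """Stated bound 3 * lam * d^(alpha/2) * N^(-alpha*sqrt(L)) for a
    Holder(alpha, lam) function on [0,1]^d."""
    return 3.0 * lam * d ** (alpha / 2.0) * N ** (-alpha * math.sqrt(L))

def floor_relu_layer(x, W, b, use_floor):
    """One hidden layer in which each neuron applies either floor or ReLU,
    as in the Floor-ReLU architecture."""
    z = W @ x + b
    return np.where(use_floor, np.floor(z), np.maximum(z, 0.0))

# Width/depth prescription from the theorem; the weights and the
# per-neuron activation split below are random placeholders, NOT the
# paper's explicit construction.
d, N, L = 2, 4, 3
width = max(d, 5 * N + 13)   # max{d, 5N+13}
depth = 64 * d * L + 3       # 64dL + 3

rng = np.random.default_rng(0)
h, in_dim = rng.random(d), d              # input drawn from [0,1]^d
for _ in range(depth):
    W = 0.05 * rng.standard_normal((width, in_dim))
    b = 0.05 * rng.standard_normal(width)
    use_floor = rng.random(width) < 0.5   # arbitrary activation assignment
    h, in_dim = floor_relu_layer(h, W, b, use_floor), width

print(f"width={width}, depth={depth}")
print(f"error bound (alpha=1, lambda=1): {holder_error_bound(d, N, L):.4f}")
```

Note how the error bound decays in $N^{-\sqrt{L}}$ rather than the $N^{-1}$- or $L^{-1}$-type rates typical of plain ReLU networks: for fixed width, adding depth improves the rate super-polynomially, which is the sense in which the abstract calls the approximation power "super".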