The universal approximation theorem for complex-valued neural networks (2012.03351v2)

Published 6 Dec 2020 in math.FA, cs.LG, and stat.ML

Abstract: We generalize the classical universal approximation theorem for neural networks to the case of complex-valued neural networks. Precisely, we consider feedforward networks with a complex activation function $\sigma: \mathbb{C} \to \mathbb{C}$ in which each neuron performs the operation $\mathbb{C}^N \to \mathbb{C},\ z \mapsto \sigma(b + w^T z)$ with weights $w \in \mathbb{C}^N$ and a bias $b \in \mathbb{C}$, and with $\sigma$ applied componentwise. We completely characterize those activation functions $\sigma$ for which the associated complex networks have the universal approximation property, meaning that they can uniformly approximate any continuous function on any compact subset of $\mathbb{C}^d$ arbitrarily well. Unlike the classical case of real networks, the set of "good activation functions" which give rise to networks with the universal approximation property differs significantly depending on whether one considers deep networks or shallow networks: For deep networks with at least two hidden layers, the universal approximation property holds as long as $\sigma$ is not a polynomial, a holomorphic function, or an antiholomorphic function. Shallow networks, on the other hand, are universal if and only if the real part or the imaginary part of $\sigma$ is not a polyharmonic function.

Summary

  • The paper demonstrates that deep complex-valued neural networks universally approximate continuous functions when the activation function is neither a polynomial (in $z$ and $\overline{z}$), nor holomorphic, nor antiholomorphic.
  • It establishes distinct criteria for shallow and deep architectures, showing that the set of admissible activation functions depends on depth and differs markedly from the real-valued setting.
  • The findings provide a theoretical framework that supports practical CVNN implementations in applications such as signal processing and MRI.

Universal Approximation Theorem for Complex-Valued Neural Networks

Complex-valued neural networks (CVNNs) offer a promising direction for expanding the applicability of neural architectures, particularly in domains where the inputs are naturally complex-valued, such as signal processing and MRI. The paper "The universal approximation theorem for complex-valued neural networks" (2012.03351) generalizes the universal approximation theorem to complex-valued networks, detailing the conditions under which a complex neural network can approximate any continuous function arbitrarily well, uniformly on compact sets.
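To make the setup concrete: each neuron computes $z \mapsto \sigma(b + w^T z)$ with complex weights $w$, a complex bias $b$, and $\sigma$ applied componentwise. The following NumPy sketch (an illustration of this architecture, not code from the paper) builds a shallow CVNN of this form with a split-ReLU activation, one of the "good" activations discussed later in this summary.

```python
import numpy as np

def split_relu(z):
    """Split activation: ReLU applied separately to real and imaginary parts.
    Per this summary, it is neither holomorphic, antiholomorphic, nor a
    polynomial in (z, conj z), and it is not almost polyharmonic."""
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def cvnn_layer(z, W, b, sigma=split_relu):
    """One complex-valued layer: z in C^d, W in C^{m x d}, b in C^m."""
    return sigma(W @ z + b)

# Shallow network C^3 -> C: one hidden layer followed by a complex-affine readout.
rng = np.random.default_rng(0)
d, width = 3, 16
z  = rng.standard_normal(d)          + 1j * rng.standard_normal(d)
W1 = rng.standard_normal((width, d)) + 1j * rng.standard_normal((width, d))
b1 = rng.standard_normal(width)      + 1j * rng.standard_normal(width)
w2 = rng.standard_normal(width)      + 1j * rng.standard_normal(width)
b2 = complex(rng.standard_normal())

print(w2 @ cvnn_layer(z, W1, b1) + b2)
```

A deep network in the sense of the paper simply stacks two or more such hidden layers before the affine readout.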

Characterization of Activation Functions

The distinguishing feature of CVNNs, namely that inputs, weights, biases, and activation functions all live in the complex plane, introduces unique challenges and opportunities compared to their real-valued counterparts. This work rigorously characterizes which complex activation functions enable networks to achieve universal approximation, i.e., to approximate any continuous function uniformly on compact subsets of $\mathbb{C}^d$:

  • Deep Networks: For networks with at least two hidden layers, the universal approximation property holds as long as the activation function is neither a polynomial in $z$ and $\overline{z}$, nor holomorphic, nor antiholomorphic. Activation functions outside these three classes make the networks dense in the space of continuous functions on compact sets.
  • Shallow Networks: A network with a single hidden layer is universal if and only if the real part or the imaginary part of the activation function is not polyharmonic. This is a stricter constraint than in the deep case, so fewer activation functions qualify at depth one (both criteria are illustrated in the sketch below).
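To make the two criteria concrete, the short sketch below (my own illustration; the classifications restate the criteria above) lists a holomorphic activation, a polynomial in $z$ and $\overline{z}$, and a bounded non-constant activation, with the universality each permits.

```python
import numpy as np

# Illustrative classification of candidate activations under the criteria above
# (the function names are mine, not the paper's).

def sigma_sin(z):
    # Entire, hence holomorphic: excluded for deep networks. Its real and
    # imaginary parts are harmonic, hence polyharmonic, so shallow networks
    # with this activation are not universal either.
    return np.sin(z)

def sigma_poly(z):
    # A polynomial in z and conj(z): excluded for deep networks, and
    # polyharmonic, so shallow networks are not universal either.
    return z * np.conj(z) + z**2

def sigma_bounded(z):
    # Bounded and non-constant (cited in this summary as a safe choice):
    # neither holomorphic, antiholomorphic, nor a polynomial in (z, conj z),
    # and not almost polyharmonic, so shallow and deep networks using it
    # have the universal approximation property.
    return z / (1.0 + np.abs(z))
```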

Insights and Implications

The implications of these results are profound for theoretical and practical advancements in CVNNs:

  • Variation in Activation Function Properties: The characteristics required vary distinctly between shallow and deep networks, underscoring the increased flexibility of deeper architectures. This insight informs architectural decisions when developing complex-valued models for specific applications.
  • Complex vs. Real-Valued Domains: The findings highlight how the complex setting differs from the real one. Whereas real networks are universal as soon as the activation function is not a polynomial, complex networks additionally exclude holomorphic and antiholomorphic activations (and, for shallow networks, all almost polyharmonic ones), so activation design requires more care in the complex domain.
  • Applications in Complex Domains: Effectiveness in domains such as quantum computing and advanced imaging techniques could be significantly influenced by the appropriate selection and design of complex activation functions, enhancing computational efficiency and accuracy.

Future Directions

This theoretical foundation invites several avenues for further exploration:

  • Practical Implementation: Developing efficient training algorithms for CVNNs leveraging universally approximating activation functions.
  • Exploring Holomorphic Functions: Investigating borderline cases in which nearly holomorphic activations might offer approximation advantages without full universality, possibly bridging gaps for specific applications.
  • Cross-Domain Network Structures: Incorporating cross-domain architectures where complex and real-valued components interact seamlessly, enhancing model versatility and performance.

Conclusion

By extending the universal approximation theorem to complex-valued networks, this research deepens our understanding of neural approximators in multidimensional complex domains. The results offer practical guidelines for designing networks that leverage the richness of the complex plane, broadening the horizon for future innovations in neural network design and application.

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a single, focused list of what remains missing, uncertain, or unexplored in the paper, stated concretely to support follow-up research. (The Wirtinger operators and the polyharmonicity/polyanalyticity notions referenced below are recalled after the list.)

  • Minimal regularity for deep-network necessity: The necessity part of the deep theorem requires continuity of the activation function. Determine natural, weaker regularity assumptions (ideally within the class $M$ of locally bounded functions whose set of discontinuities has null closure) under which the same necessary conditions (polynomial in $(z,\overline{z})$, holomorphic, or antiholomorphic) still exclude universality.
  • Non–locally-bounded activations: The theory does not cover activation functions that are not locally bounded (e.g., the principal branches of $\arctan$ and $\operatorname{arctanh}$ on $\mathbb{C}$). Develop a universality classification for CVNNs using such activations or provide explicit counterexamples.
  • Quantitative approximation theory: The results are qualitative (density). Establish approximation rates, network size (width/depth) bounds, and sample complexity for CVNNs with specific “good” activations (e.g., branch-cut functions like $\arcsin$, $\arccos$, $\operatorname{arcsinh}$) and for target function classes (e.g., Hölder/Sobolev classes).
  • Constructive approximation schemes: Provide explicit constructive procedures (network architectures and parameter choices) to achieve prescribed approximation error with CVNNs under the paper’s conditions, along with complexity guarantees.
  • Other topologies and domains: Extend the characterization to approximation in $L^p_{\mathrm{loc}}(\mathbb{C}^d)$, weighted sup norms on unbounded domains, and function spaces with additional regularity (e.g., Sobolev), including a full sufficiency/necessity theory beyond locally uniform convergence.
  • Domains of empty interior in $\mathbb{C}^d$: The paper notes that approximation on real cubes $[0,1]^n \subset \mathbb{C}^n$ (which have empty interior in $\mathbb{C}^n$) behaves differently (holomorphic activations may then be viable). Develop a complete universality characterization for such lower-dimensional real submanifolds.
  • Holomorphic activations with singularities in deep networks: Proposition 1 addresses shallow networks and holomorphic activations with isolated singularities. Provide a definitive universality/non-universality result for deep networks under holomorphic (non-entire) activations with isolated or non-isolated singularities, and clarify how admissibility constraints on weights interact with depth.
  • Full classification for discontinuous activations: The paper exhibits a discontinuous $\sigma \in M$ that equals a polynomial $p(z,\overline{z})$ almost everywhere, yet yields universality for $L \ge 2$. Develop a necessary-and-sufficient characterization of universality for discontinuous activations in $M$ (including conditions that exclude such pathological exceptions).
  • Activation functions used in practice: Systematically classify common CVNN activations (e.g., modReLU, separate real/imaginary ReLU, amplitude-phase nonlinearities) within the paper’s framework (are they in $M$, almost polyharmonic, polynomial in $(z,\overline{z})$, etc.?), and infer universality outcomes from the theorems.
  • Polyanalytic vs. polyharmonic criteria: The shallow-network characterization uses “almost polyharmonic” (via the Laplacian). Investigate whether an equivalent or sharper criterion in terms of polyanalyticity ($\partial_{\overline{z}}^m f \equiv 0$) can be established, and compare its scope to the current polyharmonic condition.
  • Universality for holomorphic targets: While holomorphic activations are not universal for all continuous targets, study whether CVNNs with holomorphic activations are universal within the class of holomorphic target functions on open subsets of $\mathbb{C}^d$, and under what network constraints.
  • Architectural generalizations: Extend the universality characterization to complex-valued architectures beyond fully connected feedforward networks (e.g., convolutional CVNNs, residual CVNNs, bounded-width deep CVNNs), paralleling known real-valued results.
  • Weight and bias constraints: Analyze how restrictions on parameter sets (e.g., real-only weights/biases, unitary/orthogonal constraints, quantization) affect universality in the complex setting for shallow and deep networks.
  • Stability to activation perturbations: Quantify robustness: if $\sigma$ is close (in $L^p_{\mathrm{loc}}$ or uniformly on compacts) to an almost polyharmonic or holomorphic activation, does universality persist or fail? Provide thresholds and counterexamples.
  • Multi-valued branches and branch-cut design: For “mostly holomorphic” activations with branch cuts (e.g., $z \cdot \mathrm{Log}(z)$, principal branches of inverse trigonometric/hyperbolic functions), characterize how branch choice and cut geometry affect universality, and what minimal topological/analytic conditions on the discontinuity set suffice.
  • Depth-sensitive boundary cases: The paper shows that more “good” activations exist for deep than for shallow networks. Investigate whether there exist borderline activations for which shallow networks fail but depth $L = 2$ suffices (beyond the pathological discontinuous case), and identify mechanisms by which additional layers overcome shallow obstructions.
  • Approximation under compositional or structural priors: Explore universality when target functions possess known structure (e.g., separability, sparsity, radial symmetry), and whether weaker conditions on $\sigma$ suffice in these restricted settings.
  • Extension beyond complex numbers: Examine whether the proof techniques (e.g., Wirtinger calculus–based arguments) extend to quaternionic or Clifford-algebra–valued networks, and formulate universality criteria in those algebras.
  • Numerical training implications: Although theoretical, assess whether the “good” activation functions admitted by the theorems are trainable in practice (e.g., gradient stability, initialization), and whether the pathological/discontinuous cases can be avoided or regularized without losing universality.
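For reference, the Wirtinger operators and the polyharmonicity/polyanalyticity notions used in the list above are the standard ones (for sufficiently smooth $f: \mathbb{C} \to \mathbb{C}$):

$$\partial_z = \tfrac{1}{2}\big(\partial_x - i\,\partial_y\big), \qquad \partial_{\overline{z}} = \tfrac{1}{2}\big(\partial_x + i\,\partial_y\big), \qquad \Delta = \partial_x^2 + \partial_y^2 = 4\,\partial_z \partial_{\overline{z}},$$

$$f \text{ holomorphic} \iff \partial_{\overline{z}} f \equiv 0, \qquad f \text{ polyanalytic of order } m \iff \partial_{\overline{z}}^{\,m} f \equiv 0, \qquad f \text{ polyharmonic of order } m \iff \Delta^{m} f \equiv 0.$$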

Practical Applications

Practical Applications of “The universal approximation theorem for complex-valued neural networks”

The paper characterizes which complex activation functions guarantee universal approximation for shallow and deep complex-valued neural networks (CVNNs). This enables principled activation design, auditing of existing models, and grounded adoption of CVNNs across domains with naturally complex-valued data.

Immediate Applications

Below are actionable use cases and workflows that can be deployed now, including sectors, tools, and feasibility notes.

  • Activation selection guidelines for CVNNs across sectors (software and AI tooling)
    • Use activations that guarantee universality:
    • For shallow CVNNs: choose any activation that is not almost polyharmonic (e.g., split activations such as σ(z) = ReLU(Re z) + i·ReLU(Im z), bounded non-constant functions like σ(z) = z/(1+|z|), or branch-cut-based functions like principal arcsin, arcsinh, or z·Log(z) with safe handling on branch cuts).
    • For deep CVNNs (L ≥ 2): avoid activations that are holomorphic, antiholomorphic, or polynomial in z and z̄ almost everywhere. Most non-smooth/non-holomorphic functions will be safe.
    • Tools/workflows that can be implemented:
    • A “CVNN Activation Registry” in PyTorch/JAX/TensorFlow with certified-universal activations and flags for non-universal ones (e.g., sin, sinh, tan, tanh, pure holomorphic functions).
    • An “Activation Audit” utility that numerically checks whether σ is holomorphic/antiholomorphic or polynomial in (z, z̄), and screens for almost polyharmonic behavior via discrete Laplacian tests (a rough numerical sketch of such checks appears after this list).
    • Assumptions/Dependencies: σ must be locally bounded, with the closure of its set of discontinuities having measure zero; universality concerns expressivity only, not guaranteed learnability or training stability.
  • MRI fingerprinting and complex medical imaging pipelines (healthcare)
    • Replace holomorphic activations (e.g., sin/sinh) with non-holomorphic/non-polyharmonic ones in CVNN architectures used for reconstruction, parameter mapping, or denoising.
    • Expected outcome: expressivity guarantees and reduced risk of hidden representational blind spots.
    • Assumptions/Dependencies: fixed-depth feedforward architectures; compatibility with clinical ML frameworks; careful handling of branch cuts (e.g., for Log, arcsin) in autodiff.
  • Complex baseband signal processing in wireless communications (energy and telecom)
    • Use deep CVNNs with safe activations for channel equalization, beamforming, OFDM symbol detection, and direction-of-arrival estimation.
    • Workflow: implement split activations or branch-cut-aware activations; verify via the audit tool; benchmark against traditional complex-linear models.
    • Assumptions/Dependencies: hardware support for complex tensors; training datasets representative of complex-valued signals; ensure activation is not holomorphic or polynomial in (z, z̄).
  • Radar, sonar, and synthetic aperture imaging (robotics and defense)
    • Deploy CVNNs for inverse problems and target recognition with guaranteed expressivity by auditing activations and replacing holomorphic ones.
    • Assumptions/Dependencies: feedforward CVNNs; consistent gradient handling around branch cut discontinuities; adherence to non-holomorphic activation policy.
  • Power grid phasor analysis and state estimation (energy)
    • Use deep CVNNs with safe activations for complex phasor regression/classification in grid monitoring and anomaly detection.
    • Assumptions/Dependencies: existing complex-valued data pipelines; performance validation; activation universality does not replace domain constraints or stability requirements.
  • Audio and speech processing in the Fourier domain (software, consumer tech)
    • Apply shallow CVNNs with bounded non-constant activations or split non-smooth activations to spectral denoising, source separation, and equalization.
    • Assumptions/Dependencies: complex spectral inputs; optimization considerations for non-smooth activations; verification via compact-domain approximation metrics.
  • Academic course materials and research tooling (education and academia)
    • Integrate Wirtinger calculus and polyharmonicity checks into ML curricula; release open-source notebooks demonstrating how activation properties affect universality.
    • Tools: sample scripts to test Laplacian iterates; examples that contrast holomorphic vs. safe activations.
    • Assumptions/Dependencies: undergraduate-level complex analysis prerequisites; reproducible environments.
  • Corrections for “Extreme Learning Machines” (ELM) with complex activations (academia and research QA)
    • Update ELM implementations to use non-holomorphic/non-polyharmonic activations to align with universal approximation guarantees, replacing incorrect choices like sin/sinh/tanh.
    • Assumptions/Dependencies: reliance on universality results under fixed-depth feedforward settings; random-weight layers must be coupled with proper output-layer training.
  • Model governance and documentation standards for complex-valued ML in regulated contexts (policy and compliance)
    • Add an “Activation Universality Checklist” to model cards: specify activation properties, justify selection against holomorphy/polyharmonicity criteria, and provide audit results.
    • Assumptions/Dependencies: regulatory emphasis on model transparency; compatibility with existing documentation standards.
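The “Activation Audit” idea above can be prototyped with plain finite differences: estimate the Wirtinger derivatives of σ on a grid to flag holomorphic or antiholomorphic behavior, and apply a discrete Laplacian a few times to screen for polyharmonic behavior. The sketch below is such a rough numerical screen (grid sizes, tolerances, and helper names are illustrative choices of mine, not part of the paper); it is a heuristic filter, not a proof about universality.

```python
import numpy as np

def wirtinger(sigma, z, h=1e-5):
    """Central-difference estimates of the Wirtinger derivatives of sigma at z."""
    dx = (sigma(z + h) - sigma(z - h)) / (2 * h)
    dy = (sigma(z + 1j * h) - sigma(z - 1j * h)) / (2 * h)
    return 0.5 * (dx - 1j * dy), 0.5 * (dx + 1j * dy)   # d/dz, d/d(conj z)

def discrete_laplacian(f, h):
    """Five-point-stencil Laplacian on the interior of a sampled grid."""
    return (f[2:, 1:-1] + f[:-2, 1:-1] + f[1:-1, 2:] + f[1:-1, :-2]
            - 4.0 * f[1:-1, 1:-1]) / h**2

def audit(sigma, n=201, lim=2.0, max_order=3, tol=1e-3):
    """Heuristic flags for (anti)holomorphic-like and polyharmonic-like behavior."""
    xs = np.linspace(-lim, lim, n)
    h = xs[1] - xs[0]
    X, Y = np.meshgrid(xs, xs, indexing="ij")
    Z = X + 1j * Y
    d_z, d_zbar = wirtinger(sigma, Z)
    report = {"holomorphic_like": bool(np.max(np.abs(d_zbar)) < tol),
              "antiholomorphic_like": bool(np.max(np.abs(d_z)) < tol),
              "polyharmonic_like": False}
    f = sigma(Z)
    for _ in range(max_order):
        f = discrete_laplacian(f, h)
        if np.max(np.abs(f)) < tol:
            report["polyharmonic_like"] = True
            break
    return report

print(audit(np.sin))                                  # holomorphic (and harmonic): flagged
print(audit(lambda z: z.real**2 + 1j * z.imag))       # polyharmonic: flagged, unsafe for shallow nets
print(audit(lambda z: np.maximum(z.real, 0)
                      + 1j * np.maximum(z.imag, 0)))  # no flags: a "good" activation
```

A clean screen is only suggestive; symbolic analysis or a proof is needed before relying on an activation in a certified setting.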

Long-Term Applications

These applications require further research, scaling, tooling, or development before widespread deployment.

  • Branch-cut–aware autodiff and training kernels for CVNNs (software infrastructure)
    • Develop robust autodiff that handles edge cases of discontinuities at measure-zero sets (e.g., principal branches of Log, arcsin, arcsinh) without gradient pathologies.
    • Dependencies: numerical stability and consistent gradient definitions around branch points; framework-level support for complex-valued differentiation.
  • Formal verification and certification of universality within ML frameworks (software and standards)
    • Create static analyzers that certify whether a given CVNN architecture and activation set is universal under the paper’s criteria; integrate with CI/CD pipelines.
    • Dependencies: implementation of symbolic/numeric tests for holomorphy/antiholomorphy and polynomial-in-(z, z̄) detection; standardized APIs.
  • Hardware accelerators for complex-valued deep learning (semiconductors and HPC)
    • Design complex arithmetic units and activation function blocks optimized for non-smooth/non-holomorphic functions, enabling efficient CVNN inference and training.
    • Dependencies: co-design with ML frameworks; energy efficiency targets; verification for edge devices in telecom and medical imaging.
  • Universality results extended to other complex architectures (academia and R&D)
    • Generalize the theorem to convolutional CVNNs, residual CVNNs, bounded-width deep CVNNs, and continuous/infinite-dimensional settings.
    • Dependencies: new mathematical proofs (e.g., adapting Stone–Weierstrass–type results to structured kernels), empirical validation.
  • Approximation rates, sample complexity, and learnability with safe activations (academia and industry research)
    • Investigate quantitative bounds (rates of approximation, required width/depth) for the recommended activations; link to optimization landscape and generalization.
    • Dependencies: cross-disciplinary work bridging approximation theory and non-convex optimization; large-scale benchmarks.
  • Domain-specific libraries and benchmark suites for complex-valued ML (industry consortia)
    • Curate datasets and tasks (MRI, OFDM, radar, power grid phasors) with standardized CVNN baselines that use universally-approximating activations; define shared metrics.
    • Dependencies: community buy-in; licensing; long-term maintenance; alignment with academic findings.
  • Policy and standards for complex-valued ML in safety-critical systems (policy)
    • Develop guidelines ensuring activation choices do not limit expressivity in medical or infrastructure applications; include formal audit requirements for activation universality.
    • Dependencies: collaboration with standards bodies; empirical evidence; impact assessments.
  • Education: advanced coursework and textbooks integrating complex analysis and deep learning (academia)
    • Produce materials that systematically teach Wirtinger calculus, polyharmonic functions, and their role in ML, with hands-on coding modules.
    • Dependencies: faculty expertise; sustained curriculum development funding; adoption across programs.
  • Robust CVNN deployments on edge devices (energy, telecom, consumer)
    • End-to-end pipelines that leverage universal activations with efficient kernels, enabling real-time inference for complex signals (e.g., beamforming on base stations, audio on smartphones).
    • Dependencies: hardware support; optimized libraries; rigorous field testing.

Notes on assumptions and feasibility across applications:

  • Universality holds for feedforward CVNNs of fixed depth with componentwise activations and locally bounded σ whose discontinuities have measure-zero closure; it guarantees approximation capacity on compact sets, not training success.
  • Activation choices like pure holomorphic/antiholomorphic functions or polynomials in (z, z̄) should be avoided for deep networks; almost polyharmonic functions should be avoided for shallow networks.
  • Practical deployment requires careful numerical handling of branch cuts and non-smooth activations, framework-level complex autodiff support, and domain-specific validation.
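As an example of the branch-cut handling mentioned in the last point, a "mostly holomorphic" activation such as $z \cdot \mathrm{Log}(z)$ needs a guard at the origin and awareness that the principal branch of $\mathrm{Log}$ is discontinuous across the negative real axis (a measure-zero set, as the theory permits). A minimal sketch, with the epsilon guard and the function name being my own choices:

```python
import numpy as np

def z_log_z(z, eps=1e-12):
    """sigma(z) = z * Log(z) using the principal branch of Log.
    Log is singular at z = 0 and discontinuous across the negative real axis;
    the leading factor z tames the singularity, and the eps shift avoids
    evaluating log(0) in floating point."""
    z = np.asarray(z, dtype=complex)
    safe = np.where(np.abs(z) < eps, eps, z)   # keep the argument away from 0
    return z * np.log(safe)

print(z_log_z(np.array([0.0, -1.0, 1.0j, 2.0 + 3.0j])))
```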
