
Real-Time fJ/MAC PDE Solvers via Tensorized, Back-Propagation-Free Optical PINN Training (2401.00413v2)

Published 31 Dec 2023 in cs.LG, cs.ET, and eess.SP

Abstract: Solving partial differential equations (PDEs) numerically often requires huge computing time, energy cost, and hardware resources in practical applications. This has limited their applications in many scenarios (e.g., autonomous systems, supersonic flows) that have a limited energy budget and require near real-time response. Leveraging optical computing, this paper develops an on-chip training framework for physics-informed neural networks (PINNs), aiming to solve high-dimensional PDEs with fJ/MAC photonic power consumption and ultra-low latency. Despite the ultra-high speed of optical neural networks, training a PINN on an optical chip is hard due to (1) the large size of photonic devices, and (2) the lack of scalable optical memory devices to store the intermediate results of back-propagation (BP). To enable realistic optical PINN training, this paper presents a scalable method to avoid the BP process. We also employ a tensor-compressed approach to improve the convergence and scalability of our optical PINN training. This training framework is designed with tensorized optical neural networks (TONN) for scalable inference acceleration and MZI phase-domain tuning for in-situ optimization. Our simulation results of a 20-dim HJB PDE show that our photonic accelerator can reduce the number of MZIs by a factor of $1.17\times 10^3$, with only $1.36$ J and $1.15$ s to solve this equation. This is the first real-size optical PINN training framework that can be applied to solve high-dimensional PDEs.


Summary

  • The paper proposes a BP-free training method using tensorized optical networks to enable real-time solutions for complex PDEs.
  • It leverages Mach-Zehnder interferometers (MZIs) and simultaneous perturbation stochastic approximation (SPSA) for efficient on-chip gradient estimation and enhanced convergence.
  • Simulation results on a 20-dimensional HJB PDE demonstrate reduced device footprint, lower energy consumption, and robust performance.

Introduction

Partial differential equations (PDEs) are mathematical models that describe phenomena across engineering and science disciplines. Solving PDEs, especially high-dimensional ones, usually requires substantial computational resources and time, which makes real-time applications, such as autonomous systems or medical imaging, challenging. Physics-informed neural networks (PINNs) have been explored to mitigate these challenges, but they remain limited by the high computational cost and energy consumption of training.
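For context, conventional PINN training builds the PDE residual loss via automatic differentiation (back-propagation), which is exactly the machinery the paper seeks to avoid on chip. Below is a minimal sketch of such a loss for a toy 1-D Poisson problem; the network width, source term, and collocation points are illustrative assumptions, not the paper's 20-dimensional HJB setup.

```python
import torch

# Hedged sketch of a conventional (BP-based) PINN loss for the toy 1-D
# Poisson problem u''(x) = f(x). All choices here are illustrative.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def pinn_residual_loss(x):
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]   # u'(x)
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0] # u''(x)
    f = -torch.sin(x)               # chosen so the exact solution is u = sin(x)
    return ((d2u - f) ** 2).mean()  # mean squared PDE residual

x = torch.rand(64, 1) * 2 * torch.pi  # random collocation points on [0, 2*pi]
print(pinn_residual_loss(x).item())
```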

Optical Computing for PDEs

To address these challenges, optical neural network (ONN) accelerators have emerged as a potential solution due to their high-speed, low-energy processing. However, training PINNs on photonic chips introduces several obstacles, such as the large footprint of photonic devices and the difficulty of executing back-propagation (BP) on chip. The paper addresses these obstacles with a BP-free approach that estimates the gradients and derivatives needed for PINN training, and with tensor-compressed formats that reduce the number of required photonic devices and improve training convergence.
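As a rough illustration of why tensor compression matters here, the sketch below compares the parameter count of a dense weight matrix against a tensor-train (TT) factorization. The 1024x1024 shape, the mode factorizations, and the TT-ranks are assumptions for illustration, not the paper's configuration.

```python
# Hedged illustration: dense layer vs. tensor-train (TT) parameter counts.
def dense_params(rows, cols):
    return rows * cols

def tt_params(row_factors, col_factors, ranks):
    # One 4-way TT core per mode, of shape (r_{k-1}, m_k, n_k, r_k).
    return sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(row_factors, col_factors))
    )

row_factors = [4, 4, 8, 8]   # 4 * 4 * 8 * 8 = 1024 rows
col_factors = [4, 4, 8, 8]   # 4 * 4 * 8 * 8 = 1024 cols
ranks = [1, 4, 4, 4, 1]      # boundary TT-ranks are always 1

print(dense_params(1024, 1024))                    # 1048576 parameters
print(tt_params(row_factors, col_factors, ranks))  # 1600 parameters (~655x fewer)
```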

Scalable Optical PINN Training Framework

The proposed framework integrates tensorized optical neural networks (TONNs) for scalable inference acceleration and relies on Mach-Zehnder interferometers (MZIs) for on-chip optimization. The framework avoids storing the intermediate results of back-propagation, thus addressing a significant scalability challenge. Simulations of a 20-dimensional Hamilton-Jacobi-Bellman (HJB) PDE suggest a reduction in the number of MZI devices by a factor of $1.17\times 10^3$, with only 1.36 J of energy and 1.15 s needed to solve the equation.
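A back-of-the-envelope calculation hints at where a reduction of this order can come from: a Clements-style rectangular mesh realizes an N x N unitary with N(N-1)/2 MZIs, while a tensorized layer only needs meshes for its small core unitaries. The mode counts below are assumptions for illustration, not the paper's exact layout.

```python
# Hedged MZI counting, assuming Clements-style meshes with N*(N-1)/2 MZIs
# per N x N unitary. The 1024-mode monolithic layer and the four 16 x 16
# cores are illustrative stand-ins for the paper's actual design.
def mzi_count(n):
    return n * (n - 1) // 2

monolithic = mzi_count(1024)                       # 523776 MZIs
tensorized = sum(mzi_count(16) for _ in range(4))  # 480 MZIs
print(monolithic, tensorized, monolithic / tensorized)  # ratio ~1091x
```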

System Design and Results

The system comprises two optical neural network designs based on tensor-train (TT) decomposition: TONN-1 and TONN-2. TONN-1 processes in parallel across both the space and wavelength domains, while TONN-2 employs time multiplexing for a smaller footprint at the cost of higher latency. Gradients are estimated with Simultaneous Perturbation Stochastic Approximation (SPSA), and the tensor-compressed model reduces the number of trainable variables and improves convergence. Simulation results validate the BP-free, tensor-compressed PINN training method, showing better robustness and performance than conventional off-chip training, particularly in the presence of hardware imperfections.
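SPSA is attractive precisely because it estimates a full gradient from just two loss evaluations under a shared random perturbation, with no intermediate BP states to store. The following is a minimal sketch of the estimator; the loss function, step sizes, and toy objective are placeholders, not the paper's on-chip PINN loss.

```python
import numpy as np

# Hedged sketch of the SPSA gradient estimator: two loss evaluations under a
# shared random +/- perturbation give a stochastic estimate of the gradient.
def spsa_gradient(loss_fn, theta, eps=1e-2, rng=np.random.default_rng(0)):
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher perturbation
    loss_plus = loss_fn(theta + eps * delta)
    loss_minus = loss_fn(theta - eps * delta)
    # For +/-1 entries, 1/delta_i == delta_i, so multiplying by delta is exact.
    return (loss_plus - loss_minus) / (2.0 * eps) * delta

# Toy usage: minimize ||theta||^2 with plain SPSA descent steps.
theta = np.ones(8)
for _ in range(200):
    theta -= 0.1 * spsa_gradient(lambda t: float(np.sum(t**2)), theta)
print(np.linalg.norm(theta))  # should be close to 0
```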

Conclusion and Outlook

This paper presents an optical training framework that takes a significant step towards real-time solution of high-dimensional PDEs on integrated photonic platforms. It advances AI and optical computing by introducing a scalable, energy-efficient training method for PINNs. Future research is geared towards scaling up the training framework, exploring fast MZI tuning techniques, and demonstrating an integrated electro-photonic system that could revolutionize the way complex PDEs are solved in real-time applications.
