- The paper proposes a novel framework that leverages Gaussian process priors and backward Euler time stepping to learn nonlinear PDEs from small datasets.
- The method accurately identifies parameters in canonical problems like Burgers’, KdV, and Navier-Stokes equations, demonstrating notable data efficiency.
- By balancing model complexity and data fitting via negative log marginal likelihood minimization, the approach offers robust system identification and discovery.
Hidden Physics Models: Machine Learning of Nonlinear Partial Differential Equations
Introduction
The paper "Hidden Physics Models: Machine Learning of Nonlinear Partial Differential Equations" by Maziar Raissi and George Em Karniadakis introduces a novel framework for learning partial differential equations (PDEs) from small data. The emphasis of the work is on leveraging the underlying physical laws, expressed through time-dependent and nonlinear PDEs, to extract meaningful patterns from high-dimensional data typically generated in experiments. This methodology is positioned as useful for system identification and data-driven discovery of PDEs, with a particular focus on the integration of probabilistic machine learning techniques, specifically Gaussian processes, to strike a balance between model complexity and data fitting.
Methodology
The proposed framework is predicated on the use of Gaussian processes, allowing for probabilistic inference over functions. Key components of this framework include:
- Parametrized Nonlinear PDEs: The general form of the PDEs considered is $h_t + \mathcal{N}_x^{\lambda} h = 0$, where $h(t,x)$ denotes the latent solution and $\mathcal{N}_x^{\lambda}$ is a nonlinear operator parametrized by $\lambda$.
- Backward Euler Time Stepping Scheme: Given noisy measurements at two consecutive time steps $t^{n-1}$ and $t^n$, the model applies the backward Euler scheme to discretize the PDE in time, yielding an equation that couples $h^n$ and $h^{n-1}$ and forms the basis for the Gaussian process regression (a worked restatement follows this list).
- Gaussian Process Priors: The methodology places a Gaussian process prior over the latent function at each time step, so that the structure of the (linearized) differential operator is encoded in the resulting covariance functions.
- Multi-output Gaussian Processes: By coupling the outputs at consecutive time steps through the discretized PDE, the resulting multi-output Gaussian process builds the physical constraints, and hence the PDE itself, directly into its covariance structure.
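Concretely, for the Burgers' example treated in the paper, $h_t + \lambda_1 h h_x - \lambda_2 h_{xx} = 0$, the construction reads as follows (a compact restatement of the paper's workflow, not a verbatim excerpt):

```latex
% Backward Euler with step \Delta t, writing h^n(x) := h(t^n, x):
h^n + \Delta t\,\lambda_1 h^n h^n_x - \Delta t\,\lambda_2 h^n_{xx} = h^{n-1}
% The nonlinear term is linearized around the posterior mean \mu^{n-1}
% of the previous time step, giving a linear operator \mathcal{L}_x^{\lambda}:
\mathcal{L}_x^{\lambda} h^n := h^n + \Delta t\,\lambda_1 \mu^{n-1} h^n_x
    - \Delta t\,\lambda_2 h^n_{xx} = h^{n-1}
% A prior h^n \sim \mathcal{GP}(0, k(x, x'; \theta)) then induces a joint
% multi-output GP on (h^n, h^{n-1}) whose covariance blocks come from
% applying the operator to the kernel in each argument:
k_{n,n} = k, \qquad
k_{n-1,n} = \mathcal{L}_x^{\lambda} k, \qquad
k_{n-1,n-1} = \mathcal{L}_x^{\lambda} \mathcal{L}_{x'}^{\lambda} k
```

For smooth kernels such as the squared exponential, these operator-applied covariances are available in closed form, which is what makes the scheme tractable.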
Learning and Inference
The learning process minimizes the negative log marginal likelihood, which automatically balances data fit against model complexity and thus safeguards against overfitting. This single objective jointly estimates the hyper-parameters of the covariance functions and the parameters $\lambda$ of the PDE.
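The following is a minimal, self-contained sketch of this training loop in JAX. To keep it short it assumes the simplest linear case, the heat equation $h_t = \lambda h_{xx}$, so no linearization is needed; the kernel choice, function names, synthetic data, and plain gradient-descent optimizer are all illustrative assumptions, not the authors' implementation.

```python
import jax
import jax.numpy as jnp
from jax.scipy.linalg import cho_solve

jax.config.update("jax_enable_x64", True)

DT, LAM_TRUE = 0.1, 0.5            # backward-Euler step and true diffusivity

def k(x, xp, theta):
    # Squared-exponential kernel; theta = (log amplitude^2, log lengthscale^2).
    return jnp.exp(theta[0]) * jnp.exp(-0.5 * (x - xp) ** 2 / jnp.exp(theta[1]))

def blocks(lam, theta):
    # Backward Euler for h_t = lam*h_xx gives (L h)(x) = h(x) - DT*lam*h''(x),
    # with L h^n = h^{n-1}. Applying L to the kernel in each argument (via
    # autodiff) yields the covariance blocks of the joint GP on (h^n, h^{n-1}).
    d2x = jax.grad(jax.grad(k, 0), 0)       # d^2 k / dx^2
    d2xp = jax.grad(jax.grad(k, 1), 1)      # d^2 k / dx'^2
    d4 = jax.grad(jax.grad(d2x, 1), 1)      # d^4 k / dx^2 dx'^2
    knn = lambda x, xp: k(x, xp, theta)
    kmn = lambda x, xp: k(x, xp, theta) - DT * lam * d2x(x, xp, theta)
    kmm = lambda x, xp: (k(x, xp, theta)
                         - DT * lam * (d2x(x, xp, theta) + d2xp(x, xp, theta))
                         + (DT * lam) ** 2 * d4(x, xp, theta))
    return knn, kmn, kmm

def gram(kf, X1, X2):
    return jax.vmap(lambda x: jax.vmap(lambda xp: kf(x, xp))(X2))(X1)

def nlml(params, Xn, yn, Xm, ym):
    # Negative log marginal likelihood (additive constant omitted).
    theta, lam, noise = params[:2], jnp.exp(params[2]), jnp.exp(params[3])
    knn, kmn, kmm = blocks(lam, theta)
    K = jnp.block([[gram(knn, Xn, Xn), gram(kmn, Xm, Xn).T],
                   [gram(kmn, Xm, Xn), gram(kmm, Xm, Xm)]])
    K = K + noise * jnp.eye(K.shape[0])
    y = jnp.concatenate([yn, ym])
    L = jnp.linalg.cholesky(K)
    alpha = cho_solve((L, True), y)
    return 0.5 * y @ alpha + jnp.sum(jnp.log(jnp.diag(L)))

# Synthetic snapshots: h(t, x) = exp(-lam*t) * sin(x) solves h_t = lam*h_xx.
Xn = Xm = jnp.linspace(0.0, 2 * jnp.pi, 30)
ym = jnp.exp(-LAM_TRUE * 0.1) * jnp.sin(Xm)           # snapshot at t^{n-1}
yn = jnp.exp(-LAM_TRUE * (0.1 + DT)) * jnp.sin(Xn)    # snapshot at t^n

# Parameters: kernel theta, log lam, log noise variance.
params = jnp.array([0.0, 0.0, jnp.log(0.1), jnp.log(1e-2)])
step = jax.jit(jax.grad(nlml))
for _ in range(5000):          # plain gradient descent for brevity; a
    params = params - 0.005 * step(params, Xn, yn, Xm, ym)  # quasi-Newton
print("estimated lam:", float(jnp.exp(params[2])))           # method is usual
```

Up to the O(Δt) bias of the backward Euler discretization and the vagaries of plain gradient descent, the estimate should land near the true $\lambda$; the design point is that the PDE parameter is simply another entry of the vector handed to the marginal-likelihood optimizer.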
Results
The framework was tested across a variety of canonical problems, including Burgers' equation, the Korteweg-de Vries (KdV) equation, the Kuramoto-Sivashinsky equation, the nonlinear Schrödinger equation, the Navier-Stokes equations, and fractional PDEs. Key findings and results include:
- Burgers' Equation: With only 140 data points drawn from two snapshots, the method accurately identified the parameters of Burgers' equation, demonstrating the framework's efficiency in data-scarce scenarios (a data-assembly sketch follows this list).
- KdV Equation: Using only 220 data points, the model identified the parameters of the KdV equation, illustrating its ability to handle higher-order dispersive dynamics.
- Kuramoto-Sivashinsky Equation: Despite the spatiotemporally chaotic nature of the system, the parameters were identified accurately from 600 data points.
- Nonlinear Schrödinger Equation: The algorithm accurately estimated the parameters using only 100 data points, validating its applicability to wave propagation in optics and quantum mechanics.
- Navier-Stokes Equations: For the two-dimensional fluid flow past a cylinder at Reynolds number 100, the framework identified the parameters using 500 data points, highlighting its robustness in fluid dynamics.
- Fractional Equations: The model accurately identified the fractional order of the operators from small datasets, underscoring its adaptability to anomalous diffusion and non-local interactions.
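As a usage illustration tied to the Burgers' result above, the two-snapshot setup amounts to nothing more than gathering scattered measurements at two consecutive times; the file names, the 70/70 split, and the array shapes below are hypothetical, not the paper's actual experiment.

```python
import numpy as np
import jax.numpy as jnp

# Hypothetical precomputed Burgers' solution on a space-time grid:
# sol has shape (n_t, n_x); x has shape (n_x,).
sol = np.load("burgers_solution.npy")
x = np.load("burgers_x.npy")

n = 50                                           # index of snapshot t^n
rng = np.random.default_rng(0)
idx_n = rng.choice(x.size, 70, replace=False)    # 70 + 70 = 140 points,
idx_m = rng.choice(x.size, 70, replace=False)    # matching the count above
Xn, yn = jnp.asarray(x[idx_n]), jnp.asarray(sol[n, idx_n])
Xm, ym = jnp.asarray(x[idx_m]), jnp.asarray(sol[n - 1, idx_m])
# These arrays would feed an NLML minimization like the earlier sketch, with
# the Burgers' operator linearized around the previous posterior mean.
```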
Implications and Future Directions
This work provides a strong foundation for the data-efficient learning of complex physical systems governed by PDEs. Practically, it opens avenues for enhanced system identification and discovery in engineering, physics, and applied mathematics where data scarcity is a significant challenge. Theoretically, it integrates classical methods in applied mathematics with modern machine learning, particularly Bayesian modeling and Gaussian processes.
Directions for future research include addressing the computational limitations associated with inverting dense covariance matrices, optimizing the framework for larger datasets using recursive Kalman updates or variational inference, and expanding the method's applicability to more complex systems and multi-scale phenomena.
Conclusion
The framework presented by Raissi and Karniadakis marks a significant step in the fusion of machine learning and classical physics, demonstrating the potential of Gaussian processes in learning and inferring PDEs from limited data. The results across a range of canonical problems emphasize the model's robustness, efficiency, and flexibility, paving the way for further advancements in both theoretical development and practical applications.