
A Structure-Preserving Kernel Method for Learning Hamiltonian Systems (2403.10070v2)

Published 15 Mar 2024 in stat.ML, cs.LG, and math.DS

Abstract: A structure-preserving kernel ridge regression method is presented that allows the recovery of nonlinear Hamiltonian functions out of datasets made of noisy observations of Hamiltonian vector fields. The method proposes a closed-form solution that yields excellent numerical performances that surpass other techniques proposed in the literature in this setup. From the methodological point of view, the paper extends kernel regression methods to problems in which loss functions involving linear functions of gradients are required and, in particular, a differential reproducing property and a Representer Theorem are proved in this context. The relation between the structure-preserving kernel estimator and the Gaussian posterior mean estimator is analyzed. A full error analysis is conducted that provides convergence rates using fixed and adaptive regularization parameters. The good performance of the proposed estimator together with the convergence rate is illustrated with various numerical experiments.

Citations (1)

Summary

  • The paper presents a novel structure-preserving kernel ridge regression that reliably recovers Hamiltonian functions from noisy observations.
  • It extends the Representer Theorem to a differential setting, yielding a closed-form, convex estimator that preserves the symplectic structure of the dynamics.
  • The method achieves superior performance and lower computational cost compared to neural approaches, with rigorous error bounds and convergence guarantees.

Structure-Preserving Kernel Methods for Learning Hamiltonian Systems

Introduction and Problem Setting

This work introduces a structure-preserving kernel ridge regression approach for learning Hamiltonian systems from noisy vector field observations. The primary goal is to recover potentially high-dimensional and nonlinear Hamiltonian functions from finite, noise-corrupted samples of Hamiltonian vector fields. The method is designed to guarantee that the learned vector field is genuinely Hamiltonian, as opposed to approaches that may violate the symplectic or gradient structure required by Hamiltonian dynamics.

The inverse problem targeted is: given samples $\{\mathbf{z}^{(n)}, \mathbf{x}_{\sigma^2}^{(n)}\}_{n=1}^N$, where each $\mathbf{z}^{(n)}$ is a state in phase space and $\mathbf{x}_{\sigma^2}^{(n)}$ is a noisy observation of the Hamiltonian vector field at $\mathbf{z}^{(n)}$, infer the underlying scalar-valued Hamiltonian $H$. The observed data obeys

$$\mathbf{x}_{\sigma^2}^{(n)} = J \nabla H(\mathbf{z}^{(n)}) + \bm{\varepsilon}^{(n)},$$

where $J$ is the canonical symplectic matrix and $\bm{\varepsilon}^{(n)}$ is noise with variance $\sigma^2$.
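As an illustration of this data model, the minimal NumPy sketch below generates noisy vector-field samples for a single pendulum Hamiltonian; the specific system, sample region, noise level, and variable names are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def H(z):
    # Pendulum Hamiltonian H(q, p) = p^2/2 - cos(q); z = (q, p)
    q, p = z
    return 0.5 * p**2 - np.cos(q)

def grad_H(z):
    # Analytic gradient of H, used only to simulate the observations
    q, p = z
    return np.array([np.sin(q), p])

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])          # canonical symplectic matrix for d = 2

N, sigma = 200, 0.1                  # sample size and noise standard deviation
Z = rng.uniform(-2.0, 2.0, size=(N, 2))            # phase-space points z^(n)
X = np.array([J @ grad_H(z) for z in Z]) \
    + sigma * rng.standard_normal((N, 2))          # noisy vector-field observations
```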

Structure-Preserving Kernel Ridge Regression Framework

The proposed learning strategy constrains the hypothesis class to Hamiltonian vector fields $X_h = J\nabla h$ with $h$ restricted to a Reproducing Kernel Hilbert Space (RKHS) $\mathcal{H}_K$. The estimation problem is formulated as

$$\widehat{h}_{\lambda,N} = \arg\min_{h \in \mathcal{H}_K} \frac{1}{N}\sum_{n=1}^{N} \|X_h(\mathbf{z}^{(n)}) - \mathbf{x}_{\sigma^2}^{(n)}\|^2 + \lambda \|h\|_{\mathcal{H}_K}^2,$$

with Tikhonov regularization parameter $\lambda$.

A central technical contribution is the extension of the Representer Theorem to this structure-preserving context, leading to what the authors denote as the "Differential Representer Theorem". This results in a closed-form solution for the estimator,

$$\widehat{h}_{\lambda,N} = \sum_{i=1}^N \langle \widehat{\mathbf{c}}_i, \nabla_1 K(\mathbf{z}^{(i)}, \cdot) \rangle,$$

where $\widehat{\mathbf{c}}$ is computed using a differential Gram matrix involving derivatives of the kernel. The optimization problem is convex and its solution unique, owing to the positive semidefiniteness of the differential Gram matrix.
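The following is a minimal sketch of a structure-preserving estimator of this type for the Gaussian kernel $k(\mathbf{z},\mathbf{z}') = \exp(-\|\mathbf{z}-\mathbf{z}'\|^2/2\ell^2)$, written in the equivalent vector-field (Gaussian-process) form discussed in the next section; the block layout and the hyperparameter names `ell` and `lam` are our own choices and may differ from the paper's exact parameterization.

```python
import numpy as np

def gaussian_kernel(z1, z2, ell):
    d = z1 - z2
    return np.exp(-(d @ d) / (2.0 * ell**2))

def grad2_k(z1, z2, ell):
    # Gradient of k(z1, z2) with respect to the second argument z2
    return gaussian_kernel(z1, z2, ell) * (z1 - z2) / ell**2

def grad1_grad2_k(z1, z2, ell):
    # d x d matrix of mixed second derivatives  d^2 k / (dz1_a dz2_b)
    d = z1 - z2
    k = gaussian_kernel(z1, z2, ell)
    return k * (np.eye(len(z1)) / ell**2 - np.outer(d, d) / ell**4)

def fit_hamiltonian(Z, X, ell, lam):
    """Kernel estimate of H from samples X[n] ~ J grad H(Z[n]) + noise.
    Z, X: (N, d) arrays with d = 2m.  Returns a callable H_hat, defined only
    up to an additive constant since just grad H is observed."""
    N, d = Z.shape
    m = d // 2
    J = np.block([[np.zeros((m, m)), np.eye(m)],
                  [-np.eye(m),       np.zeros((m, m))]])
    # Differential Gram matrix of the observed vector-field values:
    # block (i, j) is  J (d1 d2 k)(z_i, z_j) J^T  of size d x d
    G = np.zeros((N * d, N * d))
    for i in range(N):
        for j in range(N):
            G[i*d:(i+1)*d, j*d:(j+1)*d] = J @ grad1_grad2_k(Z[i], Z[j], ell) @ J.T
    # Closed-form coefficients of the regularized least-squares problem
    coeff = np.linalg.solve(G + N * lam * np.eye(N * d), X.reshape(-1))

    def H_hat(z):
        # Cross terms between h(z) and each observed vector-field value
        kx = np.concatenate([J @ grad2_k(np.asarray(z, float), Z[j], ell)
                             for j in range(N)])
        return kx @ coeff

    return H_hat
```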

Connection to Gaussian Process Regression

The authors establish conditions under which the posterior mean of a Gaussian process regression (GPR) with a suitable kernel and noise model coincides with the structure-preserving kernel estimator. Specifically, when $\lambda = \frac{\sigma^2}{N}$, the GPR posterior mean and the structure-preserving kernel estimate coincide, even though the regression loss involves gradients due to the Hamiltonian structure:

$$\overline{\phi}_N = \widehat{h}_{\lambda,N} = \frac{1}{\sqrt{N}} (A_N^* A_N + \lambda I)^{-1} A_N^* \mathbf{x}_{\sigma^2,N},$$

where $A_N$ encodes the action of the symplectic gradient at each data point.

This equivalence is only valid when the regularization/noise relationship is precisely matched, which the authors highlight as a subtlety not addressed in earlier works claiming broader GPR equivalence.
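Continuing the illustrative sketches above, this equivalence suggests matching the regularization to the noise level as follows (assuming the noise variance `sigma**2` is known or estimated; `ell=1.0` is a hypothetical length-scale choice).

```python
lam = sigma**2 / N                      # regularization matched to the noise model
H_hat = fit_hamiltonian(Z, X, ell=1.0, lam=lam)

# The Hamiltonian is identifiable only up to an additive constant, so compare
# after subtracting the value at a reference point.
z_ref = np.zeros(2)
err = [abs((H_hat(z) - H_hat(z_ref)) - (H(z) - H(z_ref))) for z in Z]
print("max shifted error on the training points:", max(err))
```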

Error Analysis and Convergence Rates

A detailed error and convergence analysis is provided using both $\Gamma$-convergence and operator techniques:

1. PAC bounds for a fixed $\lambda$:

  • The estimator converges in RKHS norm to the ground truth as the number of samples increases, under mild smoothness assumptions.

2. Rates with adaptive regularization ($\lambda \sim N^{-\alpha}$):

  • The estimator converges at rate

    $$\|\widehat{h}_{\lambda,N} - H\|_{\mathcal{H}_K} \lesssim N^{-\min\left\{\alpha \gamma,\, \frac{1}{2}(1-3\alpha)\right\}},$$

    for $\alpha \in (0,\tfrac{1}{3})$ and $\gamma$ determined by a source smoothness condition; for example, with $\gamma = 1$ the two exponents balance at $\alpha = 1/5$, giving an overall rate of $N^{-1/5}$.

  • Under an additional coercivity assumption linking $L^2$ and RKHS norms, the rates improve to allow larger $\alpha < 1/2$.

3. Flow Approximation Guarantees:

  • Theoretical guarantees are given that the learned Hamiltonian induces flows close to those generated by the true Hamiltonian, in the sup norm over initial conditions and time, with explicit dependence on the RKHS error (see the numerical sketch after this list).
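To illustrate this point numerically (this is not the theorem itself), one can integrate the vector fields induced by the true and the learned Hamiltonian from the same initial condition and compare the trajectories; the integrator, step size, and finite-difference gradient below are illustrative choices that continue the pendulum sketch above.

```python
def numerical_grad(f, z, eps=1e-5):
    # Central finite-difference gradient of a scalar function f at z
    g = np.zeros(len(z))
    for a in range(len(z)):
        e = np.zeros(len(z)); e[a] = eps
        g[a] = (f(z + e) - f(z - e)) / (2.0 * eps)
    return g

def integrate(grad_func, z0, dt=0.01, steps=500):
    # Classical RK4 integration of z' = J grad H(z)
    J = np.array([[0.0, 1.0], [-1.0, 0.0]])
    f = lambda z: J @ grad_func(z)
    traj = [np.asarray(z0, dtype=float)]
    for _ in range(steps):
        z = traj[-1]
        k1 = f(z); k2 = f(z + 0.5*dt*k1); k3 = f(z + 0.5*dt*k2); k4 = f(z + dt*k3)
        traj.append(z + dt/6.0 * (k1 + 2*k2 + 2*k3 + k4))
    return np.array(traj)

true_traj    = integrate(grad_H, z0=[1.0, 0.0])
learned_traj = integrate(lambda z: numerical_grad(H_hat, z), z0=[1.0, 0.0])
print("sup-norm gap between the flows:", np.abs(true_traj - learned_traj).max())
```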

Differential Reproducing Property and Structure

A central analytical component is the formulation and proof of a differential reproducing property for differentiable kernels on unbounded domains. For a sufficiently regular kernel $K$, the partial derivatives $\partial_{1,a} K(\mathbf{z}, \cdot)$ belong to $\mathcal{H}_K$, and derivatives of functions in $\mathcal{H}_K$ satisfy their own reproducing identities. This property is essential for posing the learning problem rigorously and is shown to hold on non-compact domains for Gaussian kernels and certain Sobolev kernels.
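In its commonly stated form (the precise smoothness assumptions are spelled out in the paper), the identity reads: for a sufficiently smooth kernel $K$ and any $h \in \mathcal{H}_K$,

$$\partial_{z_a} h(\mathbf{z}) = \big\langle h,\, \partial_{1,a} K(\mathbf{z}, \cdot) \big\rangle_{\mathcal{H}_K}, \qquad a = 1, \dots, d,$$

so that the symplectic gradient entering the loss, $X_h(\mathbf{z}) = J \nabla h(\mathbf{z})$, is a continuous linear functional of $h$ at every point, which is precisely what the extension of the Representer Theorem to gradient-based losses requires.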

Numerical Experiments

The numerical experiments demonstrate the superior performance of the structure-preserving kernel estimator on a range of classical, non-convex, and singular Hamiltonian systems. For instance:

  • Double pendulum system: The estimated potential closely matches the ground truth, with low error in the region populated by data (Figure 1).

Figure 1: Double pendulum—ground truth, reconstruction, and error after vertical shift.

  • Non-convex potentials and systems with singularities: Even for highly non-convex or singular potentials, the method captures the qualitative features and achieves lower reconstruction error than Hamiltonian neural network (HNN) methods, particularly when training data is limited or the objective is highly non-convex (Figure 2).

Figure 2: Frenkel–Kontorova model—ground truth, kernel reconstruction, and shifted error map.

  • Noise robustness and sample complexity: Experiments across noise levels and sample sizes confirm the quantitative error bounds and convergence behavior predicted theoretically.
  • Comparison with HNNs: The kernel method achieves lower error and dramatically lower computational cost than HNNs, which require iterative gradient-based training.

Implications and Perspectives

The structure-preserving kernel regression paradigm offers several theoretical and practical advantages:

  • Guaranteed Structure: By construction, all learned vector fields are Hamiltonian, preserving invariants and symplectic geometry.
  • Closed-form Solutions: The convexity and closed-form estimator distinguish this approach from neural network methods, which may require heuristics to enforce structure and are sensitive to initialization.
  • Theoretical Guarantees: The analysis covers both finite and infinite data regimes, providing sharp error and convergence rates depending on kernel and system regularity.
  • Generalizability: The formalism can be extended beyond Hamiltonian systems to any dynamics where the vector field is a linear action on function gradients, potentially handling generalized gradient flows and port-Hamiltonian systems.

Further theoretical advances may extend this framework to systems on manifolds, controlled systems, and statistical learning from trajectory (as opposed to vector field) data, which would require interfacing with symplectic and variational integrator theory.

Conclusion

This paper formulates and solves the structure-preserving learning of Hamiltonian systems via a kernel ridge regression approach, with strong theoretical guarantees and practical advantages for high-dimensional nonlinear systems. Its closed-form solution, rigorous error bounds, and superior empirical performance with small datasets position it as a robust alternative to neural net-based approaches for learning physical dynamical systems (2403.10070).
