- The paper proposes a novel forecasting approach combining an LSTM-based RNN with a low-rank Gaussian copula process to model high-dimensional time series.
- It employs marginal transformations and a low-rank covariance parametrization to efficiently capture time-varying correlations and complex non-Gaussian distributions.
- Empirical results show improved performance, achieving up to 10% and 40% gains in CRPS and CRPS-Sum compared to state-of-the-art baselines.
High-Dimensional Time Series Forecasting with Gaussian Copula Processes
This paper introduces a novel approach for high-dimensional multivariate time series forecasting by combining an RNN-based model with a low-rank Gaussian copula process. The method addresses the computational challenges associated with estimating high-dimensional covariance matrices and modeling non-Gaussian marginal distributions, enabling the modeling of time-varying correlations across thousands of time series. The authors demonstrate significant accuracy improvements over state-of-the-art baselines on several real-world datasets.
Model Architecture
The proposed model leverages an autoregressive RNN to capture temporal dependencies in individual time series, combined with a Gaussian copula process to model the joint emission distribution (Figure 1).







Figure 1: Illustration of the model parametrization, highlighting the random sampling of dimensions during training and the shared parameters across time series.
Specifically, the model consists of the following components:
- RNN-based State Space Model: An LSTM-RNN is used to model the temporal evolution of each individual time series. The LSTM is unrolled independently for each series, but the parameters are tied across all series. The hidden state hi,t for each time series i at time t is updated based on the previous state and the previous observation zi,t−1.
hi,t=φθh(hi,t−1,zi,t−1)
- Marginal Transformation with Empirical CDF: To handle different scales and non-Gaussian data, each time series' marginal distribution is modeled separately using a non-parametric estimate of its CDF, F^i. This CDF estimate is then used as the marginal transformation in a Gaussian copula, effectively decoupling the estimation of marginal distributions from the temporal dynamics and dependency structure. The transformation fi:D→R is defined as fi=Φ−1∘F^i, where Φ is the CDF of the standard normal distribution.
- Gaussian Copula Process: A Gaussian copula is used to model the joint distribution of the transformed observations. The copula is parametrized by a mean vector (ht) and a covariance matrix Σ(ht), both of which are functions of the LSTM states ht.
p(zt∣ht)=N([f1(z1,t),f2(z2,t),…,fN(zN,t)]T∣ (ht),Σ(ht)).
- Low-Rank Covariance Structure: To reduce the computational complexity and the number of parameters, the covariance matrix Σ(ht) is parametrized as the sum of a diagonal matrix Dt and a low-rank matrix VtVtT, where Vt∈RN×r and r≪N. This parametrization allows the likelihood to be evaluated using O(Nr2+r3) operations instead of O(N3).
Σ(ht)=Dt+VtVtT
Implementation Details
The model is trained by minimizing the negative log-likelihood of the observed data using stochastic gradient descent. To handle long time series, a data augmentation strategy is employed where fixed-size slices of the time series are randomly sampled during training. The parameters of the LSTM, the mean and covariance mappings are learned jointly from the data.
The authors also propose a Gaussian Process (GP) interpretation of the model, where the distribution of the transformed data xt is viewed as a GP evaluated at points yi,t=[hi,t;ei]T. This GP perspective enables training the model on random subsets of the time series in each iteration, further alleviating memory constraints and allowing the model to be applied to very high-dimensional datasets.
Experimental Results
The proposed method was evaluated on several real-world datasets, including exchange rates, solar power production, electricity consumption, traffic occupancy rates, taxi demand, and Wikipedia page views. The results demonstrate that the proposed model achieves significant accuracy improvements over state-of-the-art baselines, including VAR, GARCH, and various LSTM architectures.
The authors also present a synthetic experiment to demonstrate that the model can recover complex time-varying low-rank covariance patterns from multi-dimensional observations (Figure 2).







Figure 2: Comparison of true and predicted covariance matrices after training on synthetic data, showing the model's ability to capture complex covariance patterns.
Specifically, the GP-Copula model improves CRPS and CRPS-Sum by 10% and 40% respectively, compared to the second best models. Furthermore, the proposed model achieves these improvements with a significantly smaller number of parameters compared to other deep learning-based approaches.
Conclusion
The paper presents a method for probabilistic forecasting of high-dimensional multivariate time series using a low-rank Gaussian copula process. The proposed approach addresses the computational challenges associated with high-dimensional covariance estimation and non-Gaussian data, and achieves significant accuracy improvements over state-of-the-art baselines. The authors suggest that the ability to estimate high-dimensional time-varying covariance matrices could enable several applications in anomaly detection, imputation, and graph analysis for time series data.