Time Series Forecasting With Deep Learning: A Survey
The paper "Time Series Forecasting With Deep Learning: A Survey" by Bryan Lim and Stefan Zohren offers a comprehensive overview of modern deep learning architectures designed for time series forecasting. This survey outlines the prevalent encoder and decoder designs applied to one-step-ahead and multi-horizon forecasting scenarios. It further explores the recent trends in hybrid models, which integrate traditional statistical approaches with deep learning components to enhance predictive performance. Additionally, the authors discuss the application of deep learning techniques in decision support systems, particularly through methods in interpretability and counterfactual prediction.
Deep Learning Architectures for Time Series Forecasting
The core objective of time series forecasting is to predict future values of a target variable based on historical data. The paper categorizes deep learning models into three primary types: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Attention Mechanisms.
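For concreteness, the core task can be written (in notation paraphrased from surveys of this kind, not necessarily the paper's exact symbols) as

    \hat{y}_{i,t+1} = f\left( y_{i,t-k:t},\; x_{i,t-k:t},\; s_i \right)

where y denotes past observations of the target, x exogenous inputs over a look-back window of length k, s_i static metadata for entity i, and f the learned model.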
Convolutional Neural Networks (CNNs) leverage causal convolutions to ensure that only past information is used for predictions. The authors emphasize the resemblance of CNNs to finite impulse response (FIR) filters in digital signal processing. A key design choice is the use of dilated convolutions, which let the model capture long-term dependencies efficiently by aggregating information over exponentially widening time intervals.
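The following minimal PyTorch sketch (our illustration, not code from the paper; all names are ours) shows how causality can be enforced with left-padding and how dilation widens the receptive field:

    import torch
    import torch.nn as nn

    class CausalConv1d(nn.Module):
        """1-D convolution that only sees past values (the FIR-filter analogy)."""
        def __init__(self, channels, kernel_size, dilation=1):
            super().__init__()
            # Left-pad so the output at time t depends only on inputs <= t.
            self.pad = (kernel_size - 1) * dilation
            self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

        def forward(self, x):                        # x: (batch, channels, time)
            x = nn.functional.pad(x, (self.pad, 0))  # pad the past side only
            return self.conv(x)

    # Stacking layers with dilations 1, 2, 4, 8 grows the receptive field
    # exponentially, as in WaveNet-style architectures.
    net = nn.Sequential(*[CausalConv1d(8, kernel_size=2, dilation=2**i) for i in range(4)])
    out = net(torch.randn(1, 8, 32))  # receptive field: 1 + 1 + 2 + 4 + 8 = 16 steps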
Recurrent Neural Networks (RNNs) maintain an internal memory state that is recursively updated with new observations; Long Short-Term Memory (LSTM) networks add gating to this recursion, mitigating the exploding and vanishing gradients that afflict vanilla RNNs. The paper draws an analogy between RNNs and infinite impulse response (IIR) filters. The efficacy of RNNs in sequence modeling, particularly in natural language processing, motivates their application to time series forecasting.
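A minimal sketch of an LSTM-based one-step-ahead forecaster (illustrative only; the class name and dimensions are our assumptions):

    import torch
    import torch.nn as nn

    class LSTMForecaster(nn.Module):
        """The hidden state acts as an internal memory, updated recursively
        at each step (the IIR-filter analogy)."""
        def __init__(self, n_features, hidden_size=64):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, 1)

        def forward(self, x):             # x: (batch, time, features)
            out, _ = self.lstm(x)         # memory state updated step by step
            return self.head(out[:, -1])  # predict y_{t+1} from the last state

    model = LSTMForecaster(n_features=3)
    y_hat = model(torch.randn(16, 48, 3))  # output shape: (batch=16, 1)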
Attention Mechanisms address the challenge of learning long-term dependencies by dynamically weighting the importance of past information. The paper highlights the success of Transformer architectures in natural language processing and their subsequent adaptation for time series forecasting. Attention also offers interpretability benefits, since the attention weights indicate which past time points most influence a given prediction.
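Scaled dot-product attention with a causal mask, the building block of Transformer-style forecasters, can be sketched as follows (a generic illustration, not the specific variant of any model in the paper):

    import torch

    def causal_attention(q, k, v):
        """q, k, v: (batch, time, dim). Returns the weighted values and the
        attention weights, which show how much each past step matters."""
        scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
        t = scores.shape[-1]
        mask = torch.triu(torch.ones(t, t), diagonal=1).bool()
        scores = scores.masked_fill(mask, float("-inf"))  # no peeking at the future
        weights = scores.softmax(dim=-1)                  # interpretable importances
        return weights @ v, weights

    q = k = v = torch.randn(2, 10, 16)
    out, w = causal_attention(q, k, v)  # w[:, t] sums to 1 over past steps <= t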
Multi-Horizon Forecasting Models
The need for multi-horizon forecasts in various applications necessitates models that can predict a series of future values. The authors describe two main approaches: iterative and direct methods.
Iterative Methods produce forecasts recursively, feeding each step's prediction back in as an input for the next. This allows a one-step-ahead model to generalize to any horizon, but prediction errors can accumulate as the horizon grows.
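A minimal sketch of the recursive loop (assuming some one-step model such as the LSTMForecaster above; univariate case for simplicity):

    import torch

    def iterative_forecast(model, history, horizon):
        """history: (batch, time, 1) past target values; returns (batch, horizon).
        Each prediction is appended to the input window, so errors can compound."""
        window = history
        preds = []
        for _ in range(horizon):
            y_next = model(window)                                 # (batch, 1)
            preds.append(y_next)
            window = torch.cat([window, y_next.unsqueeze(-1)], dim=1)
        return torch.cat(preds, dim=1)

    # e.g. preds = iterative_forecast(LSTMForecaster(n_features=1),
    #                                 torch.randn(8, 24, 1), horizon=12)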
Direct Methods, in contrast, use sequence-to-sequence architectures in which an encoder summarizes the past and a decoder produces forecasts for all future time steps at once. This mitigates the error propagation inherent in iterative approaches and accommodates known future inputs, such as upcoming holidays or promotions.
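A minimal sketch of the direct approach, with our own names (real sequence-to-sequence decoders are richer and can also consume the known future inputs; a plain linear decoder is used here for brevity):

    import torch
    import torch.nn as nn

    class Seq2SeqForecaster(nn.Module):
        """An encoder summarises the past; a decoder maps the summary to all
        future steps in one pass, avoiding recursive error feedback."""
        def __init__(self, n_features, horizon, hidden_size=64):
            super().__init__()
            self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
            self.decoder = nn.Linear(hidden_size, horizon)

        def forward(self, x):            # x: (batch, time, features)
            _, (h, _) = self.encoder(x)  # final hidden state as context
            return self.decoder(h[-1])   # (batch, horizon), all steps at once

    model = Seq2SeqForecaster(n_features=3, horizon=12)
    y_hat = model(torch.randn(8, 48, 3))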
Incorporating Domain Knowledge with Hybrid Models
A distinct contribution of the paper is the detailed analysis of hybrid models, which blend statistical models with deep learning components.
Non-Probabilistic Hybrid Models, such as the ES-RNN that won the M4 forecasting competition, use exponential smoothing to capture non-stationary trends and seasonality, while a neural network models the remaining effects. This integration leverages domain expertise and reduces the risk of overfitting in small-data regimes.
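A toy illustration of the idea (heavily simplified; the additive decomposition, fixed smoothing coefficient, and all names are our assumptions, not the ES-RNN implementation): exponential smoothing tracks the level, and a network models what remains.

    import torch
    import torch.nn as nn

    def exponential_smoothing_level(y, alpha=0.3):
        """Simple exponential smoothing: l_t = alpha * y_t + (1 - alpha) * l_{t-1}."""
        level = y[:, 0]
        levels = [level]
        for t in range(1, y.shape[1]):
            level = alpha * y[:, t] + (1 - alpha) * level
            levels.append(level)
        return torch.stack(levels, dim=1)  # (batch, time)

    # Hybrid idea: normalise the series by its smoothed level, let an RNN model
    # the remaining structure, then rescale. In ES-RNN the smoothing coefficients
    # are learned jointly with the network; here alpha is fixed for brevity.
    y = torch.rand(4, 24) + 0.1            # strictly positive toy series
    level = exponential_smoothing_level(y)
    rnn = nn.LSTM(1, 16, batch_first=True)
    head = nn.Linear(16, 1)
    out, _ = rnn((y / level).unsqueeze(-1))  # network sees level-normalised input
    y_next = head(out[:, -1]).squeeze(-1) * level[:, -1]  # rescale by the last level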
Probabilistic Hybrid Models couple deep neural networks with probabilistic generative models such as Gaussian processes and linear state space models. Here the network outputs the parameters of a predictive distribution, so forecasts come with explicit uncertainty estimates.
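In this style of model the network emits distribution parameters rather than point forecasts; a minimal Gaussian sketch, with all names ours:

    import torch
    import torch.nn as nn

    class GaussianHead(nn.Module):
        """Maps a hidden state to the mean and std of a predictive Gaussian."""
        def __init__(self, hidden_size):
            super().__init__()
            self.mu = nn.Linear(hidden_size, 1)
            self.sigma = nn.Linear(hidden_size, 1)

        def forward(self, h):
            mean = self.mu(h)
            std = nn.functional.softplus(self.sigma(h)) + 1e-6  # keep std positive
            return torch.distributions.Normal(mean, std)

    h = torch.randn(32, 64)            # hidden states from some encoder
    dist = GaussianHead(64)(h)
    y = torch.randn(32, 1)
    loss = -dist.log_prob(y).mean()    # train by negative log-likelihood
    sample = dist.sample()             # draw forecasts that reflect uncertainty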
Facilitating Decision Support Using Deep Neural Networks
The authors argue that beyond accuracy, models should assist in decision-making processes. They discuss two key areas:
Interpretability: Post-hoc techniques such as LIME and SHAP, along with gradient-based methods, help explain trained neural networks. In addition, attention weights in attention-based architectures provide a degree of inherent interpretability by quantifying the significance of temporal features.
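As an illustration of the gradient-based route (a generic saliency computation, not a method specific to the paper; the stand-in model is ours), one can measure how sensitive the forecast is to each input time step:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(48 * 3, 1))  # stand-in forecaster
    x = torch.randn(1, 48, 3, requires_grad=True)              # one input window
    model(x).sum().backward()                                  # gradient of forecast w.r.t. inputs
    saliency = x.grad.abs().sum(dim=-1)  # (1, 48): importance of each time step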
Counterfactual Predictions and Causal Inference: To address time-dependent confounding, the paper points to deep learning extensions of statistical frameworks such as inverse probability of treatment weighting (IPTW) and G-computation, which aim to produce unbiased estimates of treatment effects over time.
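For intuition, IPTW reweights each observation by the inverse of its estimated treatment probability; a schematic sketch (the variable names and toy numbers are ours, and real longitudinal versions multiply per-step weights over time):

    import torch

    # treated: binary treatment indicators; propensity: estimated probabilities
    # P(treatment | history), e.g. produced by a recurrent network.
    treated = torch.tensor([1., 0., 1., 1.])
    propensity = torch.tensor([0.8, 0.3, 0.6, 0.9])

    # Weight each sample by 1 / P(observed treatment | history), so the
    # reweighted population mimics a randomised experiment.
    p_observed = torch.where(treated.bool(), propensity, 1 - propensity)
    weights = 1.0 / p_observed
    # Longitudinal variants take the product of weights across time and use
    # stabilised versions to cap extreme values.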
Conclusions and Future Directions
The paper concludes by identifying limitations and future research directions in deep learning for time series forecasting. In particular, it points out the need for models that can handle irregular sampling intervals and hierarchical structure in data. Continuous-time models based on Neural Ordinary Differential Equations (Neural ODEs) and architectures that explicitly account for hierarchical dependencies are promising directions for future research.
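A bare-bones illustration of the continuous-time idea (fixed-step Euler integration of learned dynamics, with names of our choosing; serious Neural ODE work typically uses adaptive solvers such as those in torchdiffeq):

    import torch
    import torch.nn as nn

    class NeuralODECell(nn.Module):
        """Learns dh/dt = f(h); irregular gaps are handled by integrating
        over the actual elapsed time between observations."""
        def __init__(self, dim):
            super().__init__()
            self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

        def evolve(self, h, dt, n_steps=10):
            step = dt / n_steps
            for _ in range(n_steps):
                h = h + step * self.f(h)  # explicit Euler update
            return h

    cell = NeuralODECell(dim=8)
    h = torch.zeros(1, 8)
    h = cell.evolve(h, dt=0.7)  # observation arriving after an irregular gap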
In summary, this survey provides an in-depth examination of the state-of-the-art in deep learning for time series forecasting, highlighting key architectures, hybrid models, and their implications for enhanced decision support.