Online Data Augmentation for Forecasting with Deep Learning (2404.16918v2)
Abstract: Deep learning approaches are increasingly used to tackle forecasting tasks involving datasets with multiple univariate time series. A key factor in the successful application of these methods is a sufficiently large training sample, which is not always available. Synthetic data generation techniques can be applied in these scenarios to augment the dataset. Data augmentation is typically applied offline, before training a model. However, when training with mini-batches, some batches may contain a disproportionate number of synthetic samples that do not align well with the characteristics of the original data. This work introduces an online data augmentation framework that generates synthetic samples during the training of neural networks. By creating synthetic samples for each batch alongside their original counterparts, we maintain a balanced representation of real and synthetic data throughout the training process. This approach fits naturally with the iterative nature of neural network training and eliminates the need to store large augmented datasets. We validated the proposed framework using 3797 time series from 6 benchmark datasets, three neural architectures, and seven synthetic data generation techniques. The experiments suggest that online data augmentation leads to better forecasting performance than either offline data augmentation or no augmentation. The framework and experiments are publicly available.
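The per-batch scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `augment_batch` generator here uses simple jittering (Gaussian noise) as a stand-in for any of the seven synthetic data generation techniques the paper evaluates, and the function and parameter names are assumptions for illustration only.

```python
import numpy as np


def augment_batch(batch, noise_std=0.05, rng=None):
    # Illustrative synthetic-data generator (jittering): perturb each real
    # sample with Gaussian noise. Any generation technique that maps a real
    # sample to a synthetic counterpart could be substituted here.
    rng = rng if rng is not None else np.random.default_rng()
    return batch + rng.normal(0.0, noise_std, size=batch.shape)


def online_augmented_batches(dataset, batch_size, rng=None):
    # Yield mini-batches in which every real sample is paired with a
    # synthetic counterpart generated on the fly, so each batch keeps a
    # balanced 1:1 ratio of real to synthetic data -- the core idea of
    # online (as opposed to offline) augmentation. No augmented dataset
    # is ever materialized or stored.
    rng = rng if rng is not None else np.random.default_rng(0)
    idx = rng.permutation(len(dataset))
    for start in range(0, len(dataset), batch_size):
        real = dataset[idx[start:start + batch_size]]
        synthetic = augment_batch(real, rng=rng)
        yield np.concatenate([real, synthetic], axis=0)
```

In an actual training loop, each yielded batch would feed one gradient step, so the synthetic samples change every epoch instead of being fixed once as in offline augmentation.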