Emergent Mind

Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting

(2404.15772)
Published Apr 24, 2024 in cs.LG

Abstract

Long-term time series forecasting (LTSF) provides longer insights into future trends and patterns. Over the past few years, deep learning models, especially Transformers, have achieved advanced performance in LTSF tasks. However, LTSF faces inherent challenges such as capturing long-term dependencies and sparse semantic characteristics. Recently, a new state space model (SSM) named Mamba has been proposed. With its selective capability on input data and its hardware-aware parallel computing algorithm, Mamba has shown great potential in balancing predictive performance and computational efficiency compared to Transformers. To enhance Mamba's ability to preserve historical information over a longer range, we design a novel Mamba+ block by adding a forget gate inside Mamba to selectively combine the new features with the historical features in a complementary manner. Furthermore, we apply Mamba+ both forward and backward and propose Bi-Mamba+, aiming to promote the model's ability to capture interactions among time series elements. Additionally, multivariate time series data in different scenarios may exhibit varying emphasis on intra- or inter-series dependencies. Therefore, we propose a series-relation-aware decider that controls the utilization of a channel-independent or channel-mixing tokenization strategy for specific datasets. Extensive experiments on 8 real-world datasets show that our model achieves more accurate predictions compared with state-of-the-art methods.

Architecture of Bi-Mamba4TS: processes time series via embedding layers, Bi-Mamba Encoders, and MLP projector.

Overview

  • Bi-Mamba4TS applies bidirectional Mamba encoders to time series forecasting, aiming to improve the handling of long-range dependencies while keeping computation efficient.

  • It features a series-relation-aware (SRA) decider, based on the Pearson correlation coefficient, that selects between channel-independent and channel-mixing strategies according to dataset characteristics.

  • Extensive testing on diverse datasets shows that Bi-Mamba4TS outperforms existing models in long-term multivariate time series forecasting.

Bi-Mamba4TS: Enhancing Long-Term Time Series Forecasting with a Bidirectional Mamba Model

Introduction

Time series forecasting (TSF) plays a pivotal role in domains such as traffic management, energy, and finance, particularly where long-term forecasting is required. Transformer-based models have gained traction for this task because they can model long-range dependencies, but their quadratic computational complexity remains a significant bottleneck. Recently, state space models (SSMs), notably Mamba, have emerged as effective alternatives thanks to their linear computational complexity and robustness on long sequences. Building on this, the authors introduce Bi-Mamba4TS, which integrates bidirectional Mamba models to improve time series forecasting.

Model Architecture

Bi-Mamba4TS supports both "channel-independent" and "channel-mixing" strategies and chooses between them according to dataset characteristics. The choice is made by a "series-relation-aware" (SRA) decider that uses the Pearson correlation coefficient, which gives an objective basis for strategy selection. Input data is divided into patches to enrich local semantic information. This patch-wise tokenization reduces the computational load and also helps the model capture intricate evolutionary patterns in the data.
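The patch-wise tokenization and the two channel strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_patches` and `tokenize`, the `patch_len` and `stride` parameters, and the way channel-mixing groups tokens by patch position are all assumptions made for illustration.

```python
def make_patches(series, patch_len, stride):
    """Split one 1-D series (a list) into possibly overlapping patches.

    patch_len and stride are hypothetical parameter names; trailing values
    that do not fill a whole patch are dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


def tokenize(channels, patch_len, stride, channel_mixing):
    """Turn a multivariate series (list of equal-length channels) into tokens.

    channel_mixing=False: one token sequence per channel, so each channel
    is modeled on its own.
    channel_mixing=True: tokens at the same patch position are grouped
    across channels, so a sequence model can relate the channels
    (an assumed layout, chosen only to show the contrast).
    """
    per_channel = [make_patches(ch, patch_len, stride) for ch in channels]
    if not channel_mixing:
        return per_channel
    n_patches = len(per_channel[0])
    return [[ch[i] for ch in per_channel] for i in range(n_patches)]
```

With two channels of length 8, `patch_len=4`, and `stride=2`, the channel-independent result holds two sequences of three patches each, while the channel-mixing result holds three groups of two patches each. The sequence the model sees is shorter than the raw series in both cases, which is where the reduction in computational load comes from.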

Main Contributions:

  • The authors propose a new SSM-based model, Bi-Mamba4TS, built on bidirectional Mamba encoders, which improves the modeling of long-range dependencies in time series data.
  • The model introduces a decision mechanism (the SRA decider) that uses Pearson correlation coefficients to choose automatically between channel-independent and channel-mixing strategies for a given dataset.
  • Extensive experiments on diverse real-world datasets show that Bi-Mamba4TS achieves superior forecasting accuracy compared to existing state-of-the-art methods.
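The idea behind a correlation-based decider can be sketched as follows. This is a simplified stand-in, not the rule from the paper: the decision criterion (the fraction of channel pairs whose absolute Pearson correlation reaches a threshold) and the parameters `threshold` and `min_fraction` are assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; this sketch
        # treats such a pair as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def choose_strategy(channels, threshold=0.6, min_fraction=0.5):
    """Pick a tokenization strategy from pairwise channel correlations.

    threshold and min_fraction are hypothetical knobs: if at least
    min_fraction of channel pairs have |r| >= threshold, the channels are
    considered related enough to mix.
    """
    pairs = list(combinations(range(len(channels)), 2))
    if not pairs:
        return "channel-independent"  # a single channel has nothing to mix
    strong = sum(
        1 for i, j in pairs
        if abs(pearson(channels[i], channels[j])) >= threshold
    )
    if strong / len(pairs) >= min_fraction:
        return "channel-mixing"
    return "channel-independent"
```

In practice such a decision would be computed once on the training split, so that the strategy is fixed before the model is built and no test data influences the choice.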

Model Evaluation and Results

In experiments on seven real-world datasets, Bi-Mamba4TS consistently outperformed other leading models in long-term multivariate time series forecasting. The model achieved higher prediction accuracy and also used computational resources efficiently. These results support its use in practical applications that require accurate and efficient long-term forecasting.

Model Efficiency and Ablation Study

The efficiency analysis indicates that Bi-Mamba4TS strikes a good balance between accuracy and computational cost. The ablation studies further support the importance of bidirectional encoding and adaptive strategy selection for forecasting performance. The model is also robust across different parameter settings, which adds to its practical utility.
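To show what bidirectional encoding adds, the sketch below runs a toy linear recurrence over a sequence in both directions and sums the results. The recurrence `h_t = decay * h_{t-1} + gain * x_t` is only a stand-in for a Mamba layer, which uses input-dependent (selective) parameters; the names `linear_scan`, `bidirectional_scan`, `decay`, and `gain`, and the choice to combine the two passes by addition, are assumptions.

```python
def linear_scan(xs, decay=0.9, gain=0.1):
    """Toy state recurrence h_t = decay * h_{t-1} + gain * x_t.

    A fixed-parameter stand-in for a state space layer. Cost is linear
    in the sequence length.
    """
    h, out = 0.0, []
    for x in xs:
        h = decay * h + gain * x
        out.append(h)
    return out


def bidirectional_scan(xs, decay=0.9, gain=0.1):
    """Combine a forward pass and a backward pass over the same sequence.

    The backward pass scans the reversed input and is reversed again so
    that position t lines up with the forward output at position t.
    """
    forward = linear_scan(xs, decay, gain)
    backward = linear_scan(xs[::-1], decay, gain)[::-1]
    return [f + b for f, b in zip(forward, backward)]
```

With a forward-only scan, the output at position t depends only on inputs up to t. In the bidirectional version every position also receives information from later inputs, which is the kind of interaction among sequence elements that the ablation on bidirectional encoding isolates.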

Future Directions

The promising results invite further exploration of more complex and dynamic scenarios, such as adaptive forecasting in rapidly changing environments. Future work could also refine the SRA decider to account for more nuanced dataset characteristics and explore integrating Bi-Mamba4TS with other forecasting frameworks to combine complementary strengths.

Conclusion

Bi-Mamba4TS advances long-term time series forecasting by addressing the computational inefficiencies of traditional models and by introducing an adaptive mechanism that aligns the modeling strategy with data characteristics. Its strong performance in the reported experiments makes it a promising tool for a wide range of time series forecasting applications.
