Spatiotemporal Pattern Mining for Nowcasting Extreme Earthquakes in Southern California (2012.14336v3)

Published 20 Dec 2020 in physics.geo-ph, cs.CV, and cs.LG

Abstract: Geoscience and seismology have utilized the most advanced technologies and equipment to monitor seismic events globally from the past few decades. With the enormous amount of data, modern GPU-powered deep learning presents a promising approach to analyze data and discover patterns. In recent years, there are plenty of successful deep learning models for picking seismic waves. However, forecasting extreme earthquakes, which can cause disasters, is still an underdeveloped topic in history. Relevant research in spatiotemporal dynamics mining and forecasting has revealed some successful predictions, a crucial topic in many scientific research fields. Most studies of them have many successful applications of using deep neural networks. In Geology and Earth science studies, earthquake prediction is one of the world's most challenging problems, about which cutting-edge deep learning technologies may help discover some valuable patterns. In this project, we propose a deep learning modeling approach, namely EQPred, to mine spatiotemporal patterns from data to nowcast extreme earthquakes by discovering visual dynamics in regional coarse-grained spatial grids over time. In this modeling approach, we use synthetic deep learning neural networks with domain knowledge in geoscience and seismology to exploit earthquake patterns for prediction using convolutional long short-term memory neural networks. Our experiments show a strong correlation between location prediction and magnitude prediction for earthquakes in Southern California. Ablation studies and visualization validate the effectiveness of the proposed modeling method.

Summary

The paper presents EQPred, a hybrid deep learning model combining convolutional autoencoders and TCNs to nowcast extreme earthquakes by learning spatial and temporal dependencies.
It effectively transforms daily seismic catalogs into 2D energy grids and leverages skip connections and attention modules to improve prediction accuracy on rare major events.
The approach outperforms various baselines with notable metrics (MAE=0.0483, precision=0.9563, recall=0.9016), promising real-time hazard assessment in seismically active regions.

Deep Spatiotemporal Learning for Nowcasting Extreme Earthquakes in Southern California

Introduction

Nowcasting large-magnitude, rare earthquakes remains a central challenge in seismology, because it requires modeling both spatial and temporal dependencies in high-dimensional seismic data. The paper introduces EQPred, a deep learning framework designed to mine spatiotemporal earthquake patterns from regional seismic catalogs. It represents events as daily 2D spatial grids and learns their evolution with a hybrid autoencoder and Temporal Convolutional Network (TCN) architecture. The study focuses on Southern California, leveraging its dense seismicity and rich instrumental record, and aims to forecast the probability of forthcoming major earthquakes, here defined as events of magnitude $\geq 4.5$.

Dataset Construction and Spatiotemporal Representation

The core of EQPred's approach is transforming sparse, event-based earthquake catalogs into structured spatiotemporal inputs amenable to convolutional and recurrent neural architectures. Earthquake events (time, location, magnitude) from 1990 to 2019 are discretized into a 2D spatial grid (longitude-latitude bins, $\sim$11 km per grid cell). Each cell's value for a given day is the accumulated seismic energy released in that cell, computed per event as $E = 10^{1.5 M}$ for magnitude $M$ (a relative energy, defined up to a constant factor).
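The gridding step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' pipeline: the function name `daily_energy_grids`, the bounding box, the 0.1-degree bin width (about 11 km in latitude), and the toy events are all assumptions made for the example.

```python
import numpy as np

def daily_energy_grids(days, lats, lons, mags, n_days, lat_edges, lon_edges):
    """Sum relative seismic energy E = 10**(1.5*M) into (day, lat, lon) cells."""
    days = np.asarray(days, dtype=int)
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    energy = 10.0 ** (1.5 * np.asarray(mags, dtype=float))

    # np.digitize returns 1-based bin numbers; subtract 1 for array indices.
    i = np.digitize(lats, lat_edges) - 1
    j = np.digitize(lons, lon_edges) - 1
    n_lat, n_lon = len(lat_edges) - 1, len(lon_edges) - 1

    # Drop events that fall outside the grid or the time span.
    keep = ((i >= 0) & (i < n_lat) & (j >= 0) & (j < n_lon)
            & (days >= 0) & (days < n_days))

    grids = np.zeros((n_days, n_lat, n_lon))
    # np.add.at accumulates correctly when several events share a cell.
    np.add.at(grids, (days[keep], i[keep], j[keep]), energy[keep])
    return grids

# Hypothetical bounding box with 0.1-degree bins (assumed, not from the paper).
lat_edges = np.linspace(32.0, 37.0, 51)
lon_edges = np.linspace(-120.0, -115.0, 51)

# Toy catalog: two events in the same cell on day 0, one larger event on day 2.
grids = daily_energy_grids(
    days=[0, 0, 2],
    lats=[34.05, 34.05, 35.55],
    lons=[-118.25, -118.25, -117.05],
    mags=[2.0, 3.0, 4.5],
    n_days=3, lat_edges=lat_edges, lon_edges=lon_edges,
)
print(grids.shape)
```

Because energy grows as $10^{1.5 M}$, a single large event dominates its cell; a log transform or normalization before training is a common choice, though the summary does not say which scaling the authors applied.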

Figure 1: Catalog visualization for Southern California: (a) epicenter map; (b) satellite image; (c) active faults; (d) seismic event heatmap.

A pivotal empirical observation is the extreme class imbalance: out of $444,\!589$ recorded events, only $237$ reach $M \geq 4.5$, about 0.05% of the catalog or roughly one event in 1,900. This gap of more than three orders of magnitude makes rare-event prediction both computationally and statistically challenging.

Figure 3: Magnitude-filtered event distributions, highlighting the sparsity of high-magnitude events compared to lower-magnitude occurrences.

EQPred Model Architecture

EQPred is a hybrid model comprising a convolutional autoencoder for spatial representation learning (spatial filtering of daily “maps” of seismicity/energy) and a TCN with auxiliary temporal attention for sequence modeling. The pipeline operates as follows:

A multi-layer 2D convolutional autoencoder maps each daily earthquake energy grid into a compact latent feature vector, using symmetric skip connections to mitigate vanishing gradient effects and preserve spatial detail.
The bottleneck latent vectors are temporally concatenated and supplied to a TCN-based sequence model, which implements causal, dilated convolutions to capture dependencies over long temporal spans. An optional local temporal attention module modulates the representation, allowing the model to differentially weight salient historical inputs—a mechanism inspired by architectural elements in Transformer-based models.
The autoencoder is trained in a self-supervised fashion, minimizing mean squared error (MSE) for map reconstruction using only “normal” days (days without a major event). The TCN predictor is trained with a Nash–Sutcliffe model efficiency (NSE) loss, so that forecasted event probabilities closely track the observed ones.

Figure 2: Schematic of EQPred system: convolutional autoencoder (left) with skip connections for spatial encoding, TCN-based temporal sequence model (right) with attention and prediction head for earthquake nowcasting.
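The NSE objective mentioned in the pipeline above can be illustrated as follows. The coefficient itself is standard: one minus the ratio of squared residuals to the variance of the observations. Turning it into a loss as `1 - NSE` is an assumption of this sketch, since the summary does not give the exact form used for training, and the function names are hypothetical.

```python
import numpy as np

def nse(y_true, y_pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean predictor."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = np.sum((y_true - y_pred) ** 2)
    variance = np.sum((y_true - y_true.mean()) ** 2)
    # Undefined when y_true is constant (variance == 0); callers should guard
    # against windows that contain no positive targets.
    return 1.0 - residual / variance

def nse_loss(y_true, y_pred):
    """Assumed loss form: minimizing 1 - NSE maximizes the efficiency."""
    return 1.0 - nse(y_true, y_pred)

y = np.array([0.0, 0.0, 1.0, 0.0])
perfect = nse(y, y)                             # predictions equal observations
baseline = nse(y, np.full_like(y, y.mean()))    # always predict the mean
print(perfect, baseline)
```

Unlike plain MSE, NSE is normalized by the variance of the targets, so a model that always predicts the base rate scores zero regardless of how rare the positive class is.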

Training and Implementation Details

All models were implemented in TensorFlow 2 and run on both CPUs and NVIDIA K80 GPUs.
Conv2D encoder/decoder filter sizes: [4, 16, 32, 64]
Latent bottleneck dimension: tuned between 16–1024 (trade-off between expressivity and regularization)
TCN dilation rates: exponential scheduling $2^i$, covering both short- and long-range temporal context
Batch size: 16, 64, or 128 for spatial modeling; batch size 1 in temporal modeling to maintain statefulness
Early stopping with checkpointed minimum validation loss
Event definition: target is a binary label indicating whether an $M \geq 4.5$ event occurs in the next prediction window
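The target definition in the last item can be made concrete with a small windowing helper. This is a sketch under stated assumptions: the `history` and `horizon` lengths are hypothetical (the summary does not give them), `day_max_mag` is an assumed per-day array holding the largest magnitude recorded that day (0 when there were no events), and `make_windows` is not a name from the paper.

```python
import numpy as np

def make_windows(grids, day_max_mag, history, horizon, threshold=4.5):
    """Pair each run of `history` daily grids with a binary next-window label."""
    grids = np.asarray(grids, dtype=float)
    day_max_mag = np.asarray(day_max_mag, dtype=float)
    X, y = [], []
    for t in range(history, len(grids) - horizon + 1):
        # Inputs: the `history` days strictly before day t.
        X.append(grids[t - history:t])
        # Label: 1 if any day in [t, t + horizon) has an event >= threshold.
        y.append(float(day_max_mag[t:t + horizon].max() >= threshold))
    return np.stack(X), np.array(y)

# Toy data: 10 days of 2x2 grids, one M5.0 event on day 6.
toy_grids = np.zeros((10, 2, 2))
toy_mags = np.zeros(10)
toy_mags[6] = 5.0

X, y = make_windows(toy_grids, toy_mags, history=3, horizon=2)
print(X.shape, y)
```

Only the windows whose two-day horizon covers day 6 are labeled positive, which shows how a single major event yields a short run of positive targets in an otherwise all-zero label sequence.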

Performance Evaluation and Ablation

EQPred is benchmarked against 11 alternative models, including fully connected MLPs, standard LSTM architectures, Conv2D + MLP/LSTM/Conv1D combinations, and ConvLSTM2D-FC.

Key Results:

EQPred achieves MAE=0.0483, precision=0.9563, recall=0.9016, F1=0.9251, and NSE=0.9323 for $M \geq 4.5$ event prediction—substantially outperforming all baselines.
Conv2D-based autoencoders (even without skip connections) yield large improvements over MLPs, highlighting the value of spatial locality for mining seismicity patterns.
Ablation studies reveal that both skip connections in the spatial encoder and temporal attention in the sequence module independently improve performance, but neither alone accounts for EQPred’s gain.

Figure 4: Representative nowcasting output using EQPred; model predicts likelihood of imminent strong earthquake given a full spatiotemporal window of inputs.
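For readers who want to reproduce this style of evaluation, the sketch below computes MAE, precision, recall, and F1 from predicted probabilities on toy data. The 0.5 decision cutoff and the function name `binary_metrics` are assumptions; the summary does not state how the reported scores were thresholded or averaged.

```python
import numpy as np

def binary_metrics(y_true, y_prob, cutoff=0.5):
    """MAE on probabilities plus precision/recall/F1 at an assumed cutoff."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    y_hat = y_prob >= cutoff          # predicted positives
    pos = y_true == 1                 # actual positives

    tp = int(np.sum(y_hat & pos))
    fp = int(np.sum(y_hat & ~pos))
    fn = int(np.sum(~y_hat & pos))

    # Guard against empty denominators, which are common with rare events.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    mae = float(np.mean(np.abs(y_true - y_prob)))
    return {"mae": mae, "precision": precision, "recall": recall, "f1": f1}

# Toy example: 3 positives, 2 negatives.
metrics = binary_metrics([1, 1, 0, 0, 1], [0.9, 0.4, 0.2, 0.6, 0.8])
print(metrics)
```

With so few positive days in the real catalog, precision and recall are far more informative than accuracy, which a constant "no event" predictor would already push close to 1.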

Real-World and Scientific Implications

EQPred’s main advance is the explicit joint learning of space and time via domain-specific neural architectures, enabling improved rare event detection over both naively supervised baselines and “black-box” dense neural models that ignore spatial constraints. The method exploits the inherent locality of seismic energy release along faults as well as temporal clustering due to aftershock/foreshock sequences, structure that generic LSTM or MLP models operating on flattened inputs are poorly placed to exploit.

Practically, the pipeline provides a basis for early-warning in geophysically active regions, supporting real-time operational objectives in hazard assessment and disaster management. For deployment, the system can be continually retrained with streaming data using the rolling-window input formulation, and its compact model footprint allows efficient inference on both cloud and edge platforms.

Strong numerical performance under extreme class imbalance suggests that the architecture is robust to limited ground truth, though accuracy is likely to degrade for even rarer $M \geq 6$ events or in heterogeneous tectonic settings.

Implementation Considerations and Extensions

The 2D convolutional autoencoder is critical for extracting spatially coherent features; kernel sizes, layer depth, and skip connections should be tuned carefully to maximize information retention under severe class imbalance.
The TCN module’s dilation hyperparameters control the effective receptive field, so a model intended for rapid nowcasting (short-lead prediction) needs a different dilation schedule than one intended for longer-range forecasting.
For large-scale regional modeling, data pipeline parallelization is essential; TensorFlow’s data streaming APIs can be leveraged for continuous batching and online inference.
Extensible design: the current approach models only magnitude and location, but it could be augmented with auxiliary data (e.g., satellite remote sensing, GPS deformation, geoelectricity) as multi-modal channels to further improve predictive power.
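The link between dilation rates and receptive field noted in the list above can be checked numerically. The sketch below implements a causal dilated 1D convolution in NumPy and the receptive-field formula $1 + (k - 1)\sum_i d_i$, assuming one convolution per level with kernel size $k$. TCN residual blocks often stack two convolutions per level, which would enlarge the field, and the kernel size and number of levels here are hypothetical rather than taken from the paper.

```python
import numpy as np

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions, one per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)

def causal_dilated_conv1d(x, w, dilation):
    """y[t] = sum_j w[j] * x[t - (k - 1 - j) * dilation], zero-padded on the left."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    k = len(w)
    pad = (k - 1) * dilation
    # Left-only padding keeps the output causal: y[t] never sees x[t+1:].
    xp = np.concatenate([np.zeros(pad), x])
    out = np.zeros(len(x))
    for j in range(k):
        out += w[j] * xp[j * dilation : j * dilation + len(x)]
    return out

# Exponential schedule 2**i for four levels (the level count is an assumption).
dilations = [2 ** i for i in range(4)]          # [1, 2, 4, 8]
rf = receptive_field(kernel_size=2, dilations=dilations)

# Push a unit impulse through the stack and count how far it spreads.
h = np.zeros(32)
h[0] = 1.0
for d in dilations:
    h = causal_dilated_conv1d(h, [1.0, 1.0], d)
spread = int(np.count_nonzero(h))
print(rf, spread)
```

The impulse spreads over exactly as many time steps as the formula predicts, so with daily inputs the number of levels directly sets how many past days can influence a single prediction.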

Limitations and Future Directions

Current limitations include reliance on a single-region dataset; transferability to other tectonic settings remains untested. Ground-truth bias and catalog incompleteness may affect real-world deployment, and ground motion, not just event occurrence, should be considered for operational early warning.

In the long term, advances in spatiotemporal attention mechanisms, integration of physics-based features, and multimodal sensor fusion (e.g., Earth observation (EO) data) can further improve prediction quality and robustness. EQPred provides a modular reference architecture for such generalizations.

Conclusion

EQPred introduces a methodologically rigorous, empirically validated deep learning approach for spatiotemporal rare event prediction in earthquake data. By combining spatial encoding via convolutional autoencoders with temporal modeling through TCNs and attention mechanisms, the framework demonstrates significant gains over standard baselines in nowcasting major earthquakes under real-world conditions characterized by severe class imbalance and observational noise. The hybrid architecture, pipeline, and deployment strategies provide a template for broader applications in scientific forecasting and spatiotemporal risk assessment.