
Dynamic Multi-Network Mining of Tensor Time Series (2402.11773v2)

Published 19 Feb 2024 in cs.LG, cs.AI, cs.IT, and math.IT

Abstract: Subsequence clustering of time series is an essential task in data mining, and interpreting the resulting clusters is also crucial since we generally do not have prior knowledge of the data. Thus, given a large collection of tensor time series consisting of multiple modes, including timestamps, how can we achieve subsequence clustering for tensor time series and provide interpretable insights? In this paper, we propose a new method, Dynamic Multi-network Mining (DMM), that converts a tensor time series into a set of segment groups of various lengths (i.e., clusters) characterized by a dependency network constrained with ℓ1-norm. Our method has the following properties. (a) Interpretable: it characterizes the cluster with multiple networks, each of which is a sparse dependency network of a corresponding non-temporal mode, and thus provides visible and interpretable insights into the key relationships. (b) Accurate: it discovers the clusters with distinct networks from tensor time series according to the minimum description length (MDL). (c) Scalable: it scales linearly in terms of the input data size when solving a non-convex problem to optimize the number of segments and clusters, and thus it is applicable to long-range and high-dimensional tensors. Extensive experiments with synthetic datasets confirm that our method outperforms the state-of-the-art methods in terms of clustering accuracy. We then use real datasets to demonstrate that DMM is useful for providing interpretable insights from tensor time series.
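
The abstract says that segments and clusters are chosen according to the minimum description length (MDL). The sketch below illustrates only that general idea and is not the DMM algorithm: it assumes a univariate series and a single Gaussian per segment, whereas the paper works with tensor time series and sparse dependency networks. The function names (`gaussian_bits`, `segment_cost`, `best_split`) and the particular cost terms are assumptions made for illustration.

```python
import math

def gaussian_bits(xs):
    """Data cost in bits: negative log2-likelihood of xs under a Gaussian fitted to xs."""
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)  # floor avoids log(0)
    # At the maximum-likelihood fit, the cost in nats is n/2 * (ln(2*pi*var) + 1).
    return 0.5 * n * (math.log(2 * math.pi * var) + 1) / math.log(2)

def segment_cost(xs, total_n):
    """Two-part cost of one segment: model bits (mean and variance) plus data bits."""
    model_bits = 2 * 0.5 * math.log2(total_n)  # assumed cost: 0.5 * log2(n) per parameter
    return model_bits + gaussian_bits(xs)

def best_split(xs, min_len=2):
    """Return (cut, cost) for the cheapest single cut point, or (None, cost) if no cut helps."""
    n = len(xs)
    best_cut, best_cost = None, segment_cost(xs, n)
    for cut in range(min_len, n - min_len + 1):
        # log2(n) extra bits are charged for encoding where the cut is.
        cost = segment_cost(xs[:cut], n) + segment_cost(xs[cut:], n) + math.log2(n)
        if cost < best_cost:
            best_cut, best_cost = cut, cost
    return best_cut, best_cost

# Hypothetical data: 20 points near 0 followed by 20 points near 5.
series = [0.0, 0.1, -0.1, 0.05, -0.05] * 4 + [5.0, 5.1, 4.9, 5.05, 4.95] * 4
print(best_split(series)[0])  # 20
```

A cut is accepted only when the bits saved on the data exceed the extra bits spent on the second model and the cut position, which is what lets a description-length criterion pick the number of segments without a user-set threshold. Costs can be negative here because they come from continuous densities; only their differences matter.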
