CFTM: Continuous time fractional topic model (2402.01734v2)
Abstract: We propose the Continuous Time Fractional Topic Model (cFTM), a new method for dynamic topic modeling. The approach incorporates fractional Brownian motion (fBm) to identify positive or negative correlations in topic and word distributions over time, revealing long-term dependency or roughness. Our theoretical analysis shows that the cFTM can capture this long-term dependency or roughness in both topic and word distributions, mirroring the main characteristics of fBm. Moreover, we prove that the parameter estimation process for the cFTM is on par with that of LDA, a traditional topic model. To demonstrate the cFTM's properties, we conduct an empirical study using economic news articles. The results support the model's ability to identify and track long-term dependency or roughness in topics over time.
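The link between fBm and "positive or negative correlations" can be illustrated with the standard autocovariance of unit-variance fBm increments (fractional Gaussian noise), gamma(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), where H is the Hurst index. The following is a minimal sketch of that textbook formula only, not the cFTM itself; the function name `fgn_autocovariance` and the chosen values of H are illustrative assumptions, not taken from the paper.

```python
def fgn_autocovariance(hurst: float, lag: int) -> float:
    """Autocovariance of unit-variance fBm increments at an integer lag.

    Illustrative helper (hypothetical name); implements the standard
    fractional Gaussian noise formula, not the cFTM model.
    """
    if not 0.0 < hurst < 1.0:
        raise ValueError("Hurst index must lie in (0, 1)")
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


if __name__ == "__main__":
    # H < 0.5: negatively correlated increments (rough paths).
    # H = 0.5: uncorrelated increments (ordinary Brownian motion).
    # H > 0.5: positively correlated increments (long-term dependency).
    for hurst in (0.3, 0.5, 0.7):
        values = [fgn_autocovariance(hurst, lag) for lag in (1, 2, 5)]
        print(f"H={hurst}: {[round(v, 4) for v in values]}")
```

Under this formula the sign of the correlation between increments is determined entirely by whether H lies below or above 0.5, which is the property the abstract refers to as roughness versus long-term dependency.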