The Dynamic Embedded Topic Model (1907.05545v2)

Published 12 Jul 2019 in cs.CL and stat.ML

Abstract: Topic modeling analyzes documents to learn meaningful patterns of words. For documents collected in sequence, dynamic topic models capture how these patterns vary over time. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution parameterized by the inner product between the word embedding and a per-time-step embedding representation of its assigned topic. The D-ETM learns smooth topic trajectories by defining a random walk prior over the embedding representations of the topics. We fit the D-ETM using structured amortized variational inference with a recurrent neural network. On three different corpora (a collection of United Nations debates, a set of ACL abstracts, and a dataset of Science Magazine articles), we found that the D-ETM outperforms D-LDA on a document completion task. We further found that the D-ETM learns more diverse and coherent topics than D-LDA while requiring significantly less time to fit.
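
To make the abstract's description concrete, the following is a minimal NumPy sketch of the two ingredients it names: a random walk over per-time-step topic embeddings, and a categorical distribution over words obtained from the inner product between word embeddings and a topic embedding. It is an illustration only, not the authors' implementation. The function names, the array shapes, the Gaussian form of the random-walk steps, and the `step_std` parameter are all assumptions, and the embeddings here are random rather than learned. Inference (the structured amortized variational inference with a recurrent neural network) is not shown.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def random_walk_topic_embeddings(num_steps, num_topics, dim, step_std, rng):
    """Sample topic embeddings that drift over time.

    Assumption: Gaussian increments with standard deviation `step_std`;
    the abstract only states that the prior is a random walk.
    Returns an array of shape (num_steps, num_topics, dim).
    """
    alpha = np.empty((num_steps, num_topics, dim))
    alpha[0] = rng.normal(size=(num_topics, dim))
    for t in range(1, num_steps):
        noise = rng.normal(size=(num_topics, dim))
        alpha[t] = alpha[t - 1] + step_std * noise
    return alpha


def topic_word_distributions(rho, alpha):
    """Categorical distribution over the vocabulary per topic and time step.

    rho:   word embeddings, shape (vocab_size, dim)
    alpha: topic embeddings, shape (num_steps, num_topics, dim)
    Returns probabilities of shape (num_steps, num_topics, vocab_size).
    """
    logits = alpha @ rho.T  # inner product of each topic with each word
    return softmax(logits, axis=-1)


# Toy sizes, chosen arbitrarily for illustration.
rng = np.random.default_rng(0)
rho = rng.normal(size=(50, 8))  # 50 words, 8-dimensional embeddings
alpha = random_walk_topic_embeddings(
    num_steps=5, num_topics=3, dim=8, step_std=0.1, rng=rng
)
beta = topic_word_distributions(rho, alpha)
```

Each slice `beta[t, k]` is a probability vector over the vocabulary for topic `k` at time step `t`. Because consecutive `alpha[t]` differ only by a small step, consecutive word distributions for the same topic stay close, which is the smoothness the random walk prior is meant to encourage.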

Authors (3)

Adji B. Dieng
Francisco J. R. Ruiz
David M. Blei

Citations (84)

Related Papers

Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings (2022)
Keyword Assisted Embedded Topic Model (2021)
Exclusive Topic Modeling (2021)
Topic Modeling in Embedding Spaces (2019)
Topic Modeling over Short Texts by Incorporating Word Embeddings (2016)