Emergent Mind

Domain-topic models with chained dimensions: charting an emergent domain of a major oncology conference

(1912.13349)
Published Dec 31, 2019 in cs.DL, physics.data-an, and physics.soc-ph

Abstract

This paper presents a contribution to the study of bibliographic corpora in the context of science mapping. Starting from a graph representation of documents and their textual dimension, we observe that stochastic block models (SBMs) can provide a simultaneous clustering of documents and words that we call a domain-topic model. Previous work by Gerlach et al. (2018) investigated the resulting topics, or word clusters, while ours focuses on the study of the document clusters, which we call domains. To enable the synthetic description and interactive navigation of domains, we introduce measures and interfaces relating both types of clusters, which reflect the structure of the graph and the model. We then present a procedure that, starting from the document clusters, extends the block model to also cluster arbitrary metadata attributes of the documents. We call this procedure a domain-chained model, and our previous measures and interfaces can be directly transposed to read the metadata clusters. We provide an example application to a corpus that is relevant to current STS research, and an interesting case for our approach: the 1995-2017 collection of abstracts presented at ASCO, the main annual oncology research conference. Through a sequence of domain-topic and domain-chained models, we identify and describe a particular group of domains in ASCO that have notably grown over the last decades, and which we relate to the establishment of "oncopolicy" as a major concern in oncology.
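The graph representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the abstract, that the graph is bipartite, with documents on one side and words on the other, and that each edge is weighted by the number of times a word occurs in a document. The function name `build_doc_word_graph`, the whitespace tokenizer, and the two toy abstracts are invented for illustration; fitting an SBM to such a graph is not shown.

```python
from collections import Counter


def build_doc_word_graph(docs):
    """Build a bipartite document-word multigraph from raw text.

    `docs` maps a document id to its text. Returns:
      edges       -- Counter mapping (doc_id, word) to occurrence count
      doc_degree  -- Counter mapping doc_id to its total edge weight
      word_degree -- Counter mapping word to its total edge weight
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    edges = Counter()
    for doc_id, text in docs.items():
        for token in text.lower().split():
            edges[(doc_id, token)] += 1

    doc_degree = Counter()
    word_degree = Counter()
    for (doc_id, word), weight in edges.items():
        doc_degree[doc_id] += weight
        word_degree[word] += weight
    return edges, doc_degree, word_degree


# Toy corpus, invented for illustration only.
docs = {
    "a1": "tumor response chemotherapy tumor",
    "a2": "cost policy access cost",
}
edges, doc_degree, word_degree = build_doc_word_graph(docs)
```

In this representation, a block model that partitions both node types at once yields document clusters (domains) and word clusters (topics) from the same fit, which is the simultaneous clustering the abstract describes.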
