Emergent Mind

Graph-Community Detection for Cross-Document Topic Segment Relationship Identification

(1606.04081)
Published Jun 13, 2016 in cs.CL , cs.IR , and cs.SI

Abstract

In this paper we propose a graph-community detection approach to identify cross-document relationships at the topic segment level. Given a set of related documents, we automatically find these relationships by clustering segments with similar content (topics). In this context, we study how different weighting mechanisms influence the discovery of word communities that relate to the different topics found in the documents. Finally, we test different mapping functions to assign topic segments to word communities, determining which topic segments are considered equivalent. By performing this task it is possible to enable efficient multi-document browsing, since when a user finds relevant content in one document we can provide access to similar topics in other documents. We deploy our approach in two different scenarios. One is an educational scenario where equivalence relationships between learning materials need to be found. The other consists of a series of dialogs in a social context where students discuss commonplace topics. Results show that our proposed approach better discovered equivalence relationships in learning material documents and obtained close results in the social speech domain, where the best performing approach was a clustering technique.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.