Emergent Mind

Abstract

Dynamic diversification, i.e., finding a set of data points with maximum diversity from a time-dependent sample pool, is an important task in recommender systems, web search, database search, and notification services, where it prevents users from being shown duplicate or very similar items. The incremental cover tree (ICT), which offers high computational efficiency and flexibility, has been applied to this task and has shown good performance. Specifically, it was empirically observed that ICT typically provides a set whose diversity is only marginally ($\sim 1/1.2$ times) worse than that of the greedy max-min (GMM) algorithm, the state-of-the-art method for static diversification, whose performance bound is optimal for any polynomial-time algorithm. Nevertheless, the known performance bound for ICT is 4 times worse than this optimal bound. In this paper, we aim to close this gap between theory and empirical observations. To this end, we first analyze variants of ICT methods and derive tighter performance bounds. We then investigate the gap between the obtained bound and empirical observations by using specially designed artificial data for which the optimal diversity is known. Finally, we analyze the tightness of the bound and show that it cannot be further improved, i.e., this paper provides the tightest possible bound for ICT methods. In addition, we demonstrate a new use of dynamic diversification for generative image samplers, where prototypes are incrementally collected from a stream of artificial images generated by an image sampler.
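To make the static baseline named in the abstract concrete, the following is a minimal sketch of greedy max-min selection. It is an illustration only, not the paper's implementation: it assumes Euclidean distance, measures diversity as the smallest pairwise distance in the selected set, and seeds the selection with the first point. The function names `greedy_max_min` and `min_pairwise_distance` are hypothetical.

```python
import math
from itertools import combinations


def min_pairwise_distance(points):
    """Diversity of a set, taken here as its smallest pairwise Euclidean distance."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def greedy_max_min(points, k):
    """Return indices of k points chosen by greedy max-min selection.

    Assumption: the selection is seeded with the first point; other
    seeding rules (e.g. the farthest pair) are also common.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [0]
    # nearest[i] is the distance from point i to its closest selected point.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(selected) < k:
        # Pick the point farthest from the current selection.
        idx = max(range(len(points)), key=nearest.__getitem__)
        selected.append(idx)
        # Refresh each point's distance to the enlarged selection.
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[idx]))
    return selected


pts = [(0, 0), (0.1, 0), (10, 0), (5, 0), (10.1, 0)]
chosen = greedy_max_min(pts, 3)
diversity = min_pairwise_distance([pts[i] for i in chosen])
```

Each round costs one pass over the pool, so selecting k items from n points takes O(nk) distance evaluations. This sketch recomputes nothing when the pool changes, which is the limitation that motivates incremental structures such as ICT in the dynamic setting.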
