
Consistent procedures for cluster tree estimation and pruning (1406.1546v1)

Published 5 Jun 2014 in stat.ML

Abstract: For a density $f$ on ${\mathbb R}^d$, a high-density cluster is any connected component of $\{x: f(x) \geq \lambda\}$, for some $\lambda > 0$. The set of all high-density clusters forms a hierarchy called the cluster tree of $f$. We present two procedures for estimating the cluster tree given samples from $f$. The first is a robust variant of the single linkage algorithm for hierarchical clustering. The second is based on the $k$-nearest neighbor graph of the samples. We give finite-sample convergence rates for these algorithms which also imply consistency, and we derive lower bounds on the sample complexity of cluster tree estimation. Finally, we study a tree pruning procedure that guarantees, under milder conditions than usual, to remove clusters that are spurious while recovering those that are salient.
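
To make the idea of a high-density cluster concrete, the following is a minimal sketch of one level of a cluster tree estimated from samples. It is an illustration in the spirit of the abstract, not the paper's exact procedure: it assumes that a point counts as "high density" when its $k$-th nearest neighbor lies within distance `r`, and that two retained points are linked when they are within `alpha * r` of each other. The function names `knn_radius` and `clusters_at_scale`, the parameter names, and the default value of `alpha` are all hypothetical choices made for this example.

```python
import math
from itertools import combinations


def knn_radius(points, k):
    """Distance from each point to its k-th nearest neighbor (itself excluded)."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def clusters_at_scale(points, k, r, alpha=math.sqrt(2)):
    """Connected components of the retained points at scale r.

    Assumption for this sketch: a point is retained if its k-NN radius is
    at most r (a proxy for high density), and two retained points are
    linked if they are at most alpha * r apart.
    """
    radii = knn_radius(points, k)
    kept = [i for i, rk in enumerate(radii) if rk <= r]

    # Union-find over the retained point indices.
    parent = {i: i for i in kept}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(kept, 2):
        if math.dist(points[i], points[j]) <= alpha * r:
            parent[find(i)] = find(j)

    groups = {}
    for i in kept:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two tight groups on the real line plus one isolated point between them.
sample = [(0.0,), (0.1,), (0.2,), (2.5,), (5.0,), (5.1,), (5.2,)]
fine = clusters_at_scale(sample, k=2, r=0.25)   # small r: high density only
coarse = clusters_at_scale(sample, k=2, r=3.0)  # large r: everything retained
```

On this toy sample, the small scale drops the isolated point and separates the two groups, while the large scale merges all points into one component. Sweeping `r` from large to small and recording how components split gives a nested hierarchy, which is the sense in which such a procedure approximates a cluster tree.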

Authors (4)
  1. Kamalika Chaudhuri (122 papers)
  2. Sanjoy Dasgupta (41 papers)
  3. Samory Kpotufe (29 papers)
  4. Ulrike von Luxburg (51 papers)
Citations (63)
