Tree density estimation (2111.11971v5)
Abstract: We study the problem of estimating the density $f(\boldsymbol x)$ of a random vector ${\boldsymbol X}$ in $\mathbb R^d$. For a spanning tree $T$ defined on the vertex set $\{1,\dots ,d\}$, the tree density $f_{T}$ is a product of bivariate conditional densities. An optimal spanning tree minimizes the Kullback-Leibler divergence between $f$ and $f_{T}$. From i.i.d. data we identify an optimal tree $T^*$ and efficiently construct a tree density estimate $f_n$ such that, without any regularity conditions on the density $f$, one has $\lim_{n\to \infty} \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x=0$ a.s. For Lipschitz $f$ with bounded support, $\mathbb E \left\{ \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x\right\}=O\big(n^{-1/4}\big)$, a dimension-free rate.
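The abstract does not spell out how the optimal tree is found. As a rough illustration only, the sketch below follows the classical Chow-Liu observation that minimizing the Kullback-Leibler divergence between $f$ and $f_T$ over spanning trees is equivalent to finding a maximum-weight spanning tree whose edge weights are the pairwise mutual informations $I(X_i;X_j)$. The histogram plug-in estimator, the `bins` parameter, and the function names `mutual_information` and `chow_liu_tree` are assumptions made for this example, not the paper's construction.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys, bins=8):
    """Plug-in histogram estimate of I(X;Y) in nats (an assumed estimator)."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant column
        return [min(int((v - lo) / width), bins - 1) for v in values]

    a, b = discretize(xs), discretize(ys)
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    # sum over occupied cells of p_ab * log(p_ab / (p_a * p_b))
    return sum((c / n) * math.log(c * n / (count_a[i] * count_b[j]))
               for (i, j), c in count_ab.items())


def chow_liu_tree(data, bins=8):
    """Maximum-weight spanning tree over estimated mutual informations.

    data: list of samples, each a sequence of d real coordinates.
    Returns a list of d - 1 edges (i, j) with i < j.
    """
    d = len(data[0])
    cols = [[row[k] for row in data] for k in range(d)]
    # Kruskal's algorithm: scan edges from largest to smallest weight.
    edges = sorted(((mutual_information(cols[i], cols[j], bins), i, j)
                    for i in range(d) for j in range(i + 1, d)),
                   reverse=True)
    parent = list(range(d))

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # keep the edge only if it joins two components
            parent[ri] = rj
            tree.append((i, j))
    return tree


# Synthetic Markov chain X0 -> X1 -> X2 (hypothetical test data).
rng = random.Random(0)
data = []
for _ in range(3000):
    x0 = rng.gauss(0, 1)
    x1 = x0 + 0.5 * rng.gauss(0, 1)
    x2 = x1 + 0.5 * rng.gauss(0, 1)
    data.append((x0, x1, x2))

tree = chow_liu_tree(data)
print(sorted(tree))
```

On this synthetic chain the dependence between $X_0$ and $X_2$ passes entirely through $X_1$, so the recovered tree should consist of the two chain edges. Once a tree is fixed, a density estimate along it only needs univariate and bivariate marginals.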