Learned Nonlinear Predictor for Critically Sampled 3D Point Cloud Attribute Compression (2311.13539v2)
Abstract: We study 3D point cloud attribute compression via a volumetric approach: assuming point cloud geometry is known at both encoder and decoder, parameters $\theta$ of a continuous attribute function $f: \mathbb{R}^3 \mapsto \mathbb{R}$ are quantized to $\hat{\theta}$ and encoded, so that discrete samples $f_{\hat{\theta}}(\mathbf{x}_i)$ can be recovered at known 3D points $\mathbf{x}_i \in \mathbb{R}^3$ at the decoder. Specifically, we consider a nested sequence of function subspaces $\mathcal{F}^{(p)}_{l_0} \subseteq \cdots \subseteq \mathcal{F}^{(p)}_L$, where $\mathcal{F}_l^{(p)}$ is a family of functions spanned by B-spline basis functions of order $p$, $f_l^*$ is the projection of $f$ on $\mathcal{F}_l^{(p)}$ represented as low-pass coefficients $F_l^*$, and $g_l^*$ is the residual function in an orthogonal subspace $\mathcal{G}_l^{(p)}$ (where $\mathcal{G}_l^{(p)} \oplus \mathcal{F}_l^{(p)} = \mathcal{F}_{l+1}^{(p)}$) represented as high-pass coefficients $G_l^*$. In this paper, to improve coding performance over \cite{do2023volumetric}, we study predicting $f_{l+1}^*$ at level $l+1$ given $f_l^*$ at level $l$ and encoding $G_l^*$ for the $p=1$ case (RAHT($1$)). For the prediction, we formalize RAHT(1) linear prediction in MPEG-PCC in a theoretical framework, and propose a new nonlinear predictor using a polynomial of a bilateral filter. We derive equations to efficiently compute the critically sampled high-pass coefficients $G_l^*$ amenable to encoding. We optimize parameters in our resulting feed-forward network on a large training set of point clouds by minimizing a rate-distortion Lagrangian. Experimental results show that our improved framework outperforms the MPEG G-PCC predictor by $11\%$--$12\%$ in bit rate.
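To make the low-pass/high-pass split concrete, the following is a minimal sketch of a single two-point merge step as it is commonly written for the region-adaptive hierarchical transform (RAHT). It is an illustration under assumptions, not the paper's predictor or its derivation of $G_l^*$: the function names `raht_butterfly` and `raht_inverse` are hypothetical, and the inputs are assumed to be weight-scaled coefficients, so that a constant attribute $c$ over $w$ points is stored as $\sqrt{w}\,c$.

```python
import math

def raht_butterfly(a1, a2, w1, w2):
    """Merge two sibling coefficients (weights w1, w2) into low- and high-pass parts.

    Assumed form: an orthonormal 2x2 rotation with entries built from
    sqrt(w1) and sqrt(w2), normalized by sqrt(w1 + w2).
    """
    s = math.sqrt(w1 + w2)
    alpha, beta = math.sqrt(w1) / s, math.sqrt(w2) / s
    low = alpha * a1 + beta * a2      # passed up to the coarser level
    high = -beta * a1 + alpha * a2    # residual to be quantized and encoded
    return low, high, w1 + w2

def raht_inverse(low, high, w1, w2):
    """Invert raht_butterfly by applying the transpose of the rotation."""
    s = math.sqrt(w1 + w2)
    alpha, beta = math.sqrt(w1) / s, math.sqrt(w2) / s
    a1 = alpha * low - beta * high
    a2 = beta * low + alpha * high
    return a1, a2
```

Because the step is an orthonormal rotation, it preserves energy and is exactly invertible, and a constant attribute yields a zero high-pass coefficient. A predictor of the kind the abstract describes would aim to shrink the high-pass residual further before encoding.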