TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes (2405.20283v4)

Published 30 May 2024 in cs.CV and cs.GR

Abstract: We introduce TetSphere Splatting, a Lagrangian geometry representation designed for high-quality 3D shape modeling. TetSphere splatting leverages an underused yet powerful geometric primitive -- volumetric tetrahedral meshes. It represents 3D shapes by deforming a collection of tetrahedral spheres, with geometric regularizations and constraints that effectively resolve common mesh issues such as irregular triangles, non-manifoldness, and floating artifacts. Experimental results on multi-view and single-view reconstruction highlight TetSphere splatting's superior mesh quality while maintaining competitive reconstruction accuracy compared to state-of-the-art methods. Additionally, TetSphere splatting demonstrates versatility by seamlessly integrating into generative modeling tasks, such as image-to-3D and text-to-3D generation.

Summary

The paper introduces TSSplat as an explicit Lagrangian approach that deforms TetSpheres for high-quality, efficient 3D geometry reconstruction.
It employs a two-stage optimization leveraging rendering loss and bi-harmonic regularization to refine deformation while preventing mesh inversion.
TSSplat outperforms neural implicit and Eulerian methods by achieving faster optimization, lower memory usage, and robust handling of complex topologies.

This paper introduces TetSphere Splatting (TSSplat), a novel 3D shape representation designed for high-quality geometry reconstruction with computational efficiency (2405.20283). It addresses limitations of prevalent methods such as neural implicit fields (NeRF, NeuS) and explicit Eulerian grid-based approaches (DMTet). These methods often incur high computational cost and slow optimization, require post-processing steps (such as Marching Cubes) that can degrade mesh quality, and struggle with thin structures or topological changes.

TSSplat proposes an explicit, Lagrangian approach. Unlike Eulerian methods that define geometry on a fixed grid, Lagrangian methods track the deformation of geometric primitives through space. Here, the primitives are tetrahedral spheres (TetSpheres) – tetrahedralized meshes of spheres. The final 3D shape is represented as the union of multiple deformed TetSpheres.

Core Concepts and Implementation:

Representation: The geometry is composed of $M$ TetSpheres. Each TetSphere is a mesh with $N$ vertices and $T$ tetrahedra. The key idea is to deform the initial spheres by optimizing the positions $\mathbf{x} \in \mathbb{R}^{3NM}$ of all vertices across all spheres.
- Advantages over other representations:
  - vs. Neural Implicit/Eulerian Explicit (DMTet): TSSplat avoids neural networks, leading to faster optimization and lower memory usage. It directly produces a mesh, eliminating the need for iso-surface extraction (like Marching Cubes or Marching Tetrahedra) which can introduce artifacts or resolution limitations. Its Lagrangian nature makes it less prone to floating artifacts common in grid-based methods.
  - vs. Surface Meshes: Volumetric tetrahedral meshes are more robust to self-intersections and topological changes during optimization compared to surface meshes, which often require complex remeshing.
  - vs. Gaussian Splatting: Tetrahedralization imposes local connectivity constraints between vertices, providing a clear interior/exterior definition and enabling volumetric regularization, leading to higher-quality, smoother surfaces compared to reconstructing surfaces from Gaussian points.
Initialization: An algorithm called "silhouette coverage" is used to initialize the centers and radii of the TetSpheres. It constructs a coarse voxel grid from input multi-view image silhouettes, identifies candidate centers within the object volume, and uses linear programming to select a minimal subset of candidates whose corresponding spheres cover all candidate voxels.
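The covering step can be illustrated with a small sketch. Note that this is not the paper's linear-programming formulation: it swaps in the classic greedy set-cover heuristic, uses a single fixed radius, and treats every occupied voxel as a candidate center. The function names `candidate_spheres` and `greedy_cover` and the toy voxel bar are hypothetical.

```python
from itertools import product

def candidate_spheres(voxels, radius):
    """Map each candidate center to the set of occupied voxels within `radius`."""
    r2 = radius * radius
    return {
        c: {v for v in voxels if sum((a - b) ** 2 for a, b in zip(c, v)) <= r2}
        for c in voxels
    }

def greedy_cover(voxels, coverage):
    """Greedy set cover (a stand-in for the LP): pick centers until all voxels are covered."""
    uncovered = set(voxels)
    chosen = []
    while uncovered:
        # Take the center whose sphere covers the most still-uncovered voxels.
        best = max(coverage, key=lambda c: len(coverage[c] & uncovered))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen

# Toy occupancy: a 6x2x2 bar of voxels standing in for the silhouette-carved volume.
voxels = list(product(range(6), range(2), range(2)))
coverage = candidate_spheres(voxels, radius=1.5)
centers = greedy_cover(voxels, coverage)
print(len(centers), "spheres cover", len(voxels), "voxels")
```

The greedy choice gives a valid cover but not necessarily the smallest one, which is the reason an optimization-based selection is attractive for keeping the number of TetSpheres low.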
Optimization: A two-stage optimization process refines the shape and appearance:
- Stage 1: Geometry Optimization: Deforms the initial TetSpheres by optimizing vertex positions $\mathbf{x}$. The optimization minimizes an objective function combining:
  - Rendering Loss ( $\mathbf{\Phi}(R(\mathbf{x}))$ ): Matches the rendered appearance of the deformed TetSpheres (using a differentiable rasterizer) with input multi-view images. The loss function $\mathbf{\Phi}$ can adapt to different inputs (e.g., $l_1$ for color, MSE for depth, cosine loss for normals, or SDS loss for generative tasks).
  - Bi-harmonic Energy ( $||\mathbf{L}\mathbf{F}_\mathbf{x}||_2^2$ ): Regularizes the smoothness of the deformation gradient field $\mathbf{F}_\mathbf{x}$ across adjacent tetrahedra. $\mathbf{F}_\mathbf{x}$ measures the local deformation of each tetrahedron, and $\mathbf{L}$ is a Laplacian matrix based on tetrahedra connectivity. This encourages smooth deformation while preserving sharp features, unlike smoothing vertex positions directly.
  - Non-inversion Penalty ( $\sum_{i,j}(\min\{0, \det(\mathbf{F}^{(i, j)}_\mathbf{x})\})^2$ ): Penalizes tetrahedra that invert (turn inside-out) during deformation, ensuring local injectivity ( $\det(\mathbf{F}) > 0$ ).
  The optimization is performed using gradient descent, with weights for the regularization terms ( $w_1, w_2$ ) adjusted via a cosine scheduler.
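To make the deformation gradient and the non-inversion term concrete, here is a minimal pure-Python sketch for individual tetrahedra. It assumes the standard finite-element construction $\mathbf{F} = \mathbf{D}_s \mathbf{D}_m^{-1}$, where the columns of $\mathbf{D}_m$ and $\mathbf{D}_s$ are the rest and deformed edge vectors from vertex 0; the summary does not state which construction the paper uses, and all function names are hypothetical.

```python
def edge_matrix(tet):
    """3x3 matrix whose columns are the edges from vertex 0 to vertices 1, 2, 3."""
    edges = [[tet[k][i] - tet[0][i] for i in range(3)] for k in (1, 2, 3)]
    return [[edges[col][row] for col in range(3)] for row in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse via the adjugate (transpose of the cofactor matrix) divided by det."""
    d = det3(m)
    cof = [[m[(r + 1) % 3][(c + 1) % 3] * m[(r + 2) % 3][(c + 2) % 3]
            - m[(r + 1) % 3][(c + 2) % 3] * m[(r + 2) % 3][(c + 1) % 3]
            for c in range(3)] for r in range(3)]
    return [[cof[c][r] / d for c in range(3)] for r in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def deformation_gradient(rest_tet, deformed_tet):
    """F = D_s * D_m^{-1}: the linear map taking rest edges to deformed edges."""
    return matmul(edge_matrix(deformed_tet), inv3(edge_matrix(rest_tet)))

def non_inversion_penalty(rest_tets, deformed_tets):
    """Sum of min(0, det F)^2; zero unless some tetrahedron has flipped."""
    return sum(min(0.0, det3(deformation_gradient(r, d))) ** 2
               for r, d in zip(rest_tets, deformed_tets))

rest = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
stretched = [(0, 0, 0), (2, 0, 0), (0, 1, 0), (0, 0, 1)]  # scaled along x
flipped = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, -1)]   # mirrored in z

print(det3(deformation_gradient(rest, stretched)))
print(non_inversion_penalty([rest, rest], [stretched, flipped]))
```

Stretching keeps the determinant positive and contributes nothing to the penalty, while the mirrored tetrahedron has a negative determinant and is penalized quadratically.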

# Simplified Optimization Objective (Eq. 2)
minimize L_total = L_rendering + w1 * L_biharmonic + w2 * L_non_inversion
where:
  L_rendering = rendering_loss(render(deformed_tetspheres), target_images)
  L_biharmonic = ||Laplacian(deformation_gradients)||²
  L_non_inversion = sum(max(0, -determinant(deformation_gradient))²)
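
The `L_biharmonic` term can be sketched as follows. This assumes an unweighted graph Laplacian in which two tetrahedra are neighbors when they share a triangular face; the paper's $\mathbf{L}$ may use different weights or normalization. Each deformation gradient is passed as a flat list of 9 numbers, and the names `tet_adjacency` and `biharmonic_energy` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def tet_adjacency(tets):
    """Neighbor sets for tetrahedra that share a face (three common vertex indices)."""
    face_owners = defaultdict(list)
    for t, tet in enumerate(tets):
        for face in combinations(sorted(tet), 3):
            face_owners[face].append(t)
    neighbors = defaultdict(set)
    for owners in face_owners.values():
        for a, b in combinations(owners, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def biharmonic_energy(tets, gradients):
    """Sum over tets i of ||deg_i * F_i - sum_j F_j||^2, with j over face neighbors."""
    neighbors = tet_adjacency(tets)
    energy = 0.0
    for i, f_i in enumerate(gradients):
        nbrs = neighbors.get(i, ())
        for k in range(9):
            lap = len(nbrs) * f_i[k] - sum(gradients[j][k] for j in nbrs)
            energy += lap * lap
    return energy

# Two tetrahedra (vertex-index tuples) sharing the face (1, 2, 3).
tets = [(0, 1, 2, 3), (1, 2, 3, 4)]
identity = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
doubled = [2.0 * v for v in identity]

print(biharmonic_energy(tets, [identity, identity]))  # uniform deformation
print(biharmonic_energy(tets, [identity, doubled]))   # gradient jumps across the face
```

The energy vanishes when neighboring tetrahedra deform identically and grows with the jump in deformation gradient across shared faces, which is the behavior the regularizer is meant to encourage.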

- Stage 2: Texture/Material Optimization: Optimizes surface appearance (color texture or PBR materials) using differentiable rendering. Since the mesh topology remains fixed throughout the deformation (unlike DMTet), texture parameterization only needs to be done once initially, improving efficiency. For sparse views, an MLP mapping vertex positions to material parameters can be used.

Applications and Results:

The paper demonstrates TSSplat on:
- Single-view 3D Reconstruction: Using multi-view images/normals generated by models like Wonder3D as input.
- Image-to-3D Generation: Using Score Distillation Sampling (SDS) loss with multi-view images generated from an initial coarse NeRF.
- Text-to-3D Generation: Using SDS loss driven by text prompts and potentially diffusion models conditioned on geometry (e.g., normal maps).
Evaluation: TSSplat is evaluated quantitatively on the GSO dataset for single-view reconstruction, using standard metrics (Chamfer Distance, Volumetric IoU) and introducing new mesh quality metrics:
- Area-Length Ratio (ALR): Measures triangle quality (higher is better, closer to equilateral).
- Manifoldness Rate (MR): Percentage of generated meshes that are manifold.
- Connected Component Discrepancy (CC Diff.): Difference in the number of connected components compared to the ground truth, indicating floating artifacts.
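Two of these metrics can be illustrated on a triangle list. The sketch below rests on assumptions, since the exact definitions are not given here: the component count uses union-find over vertices joined by shared faces, and the manifoldness test only checks that every edge is shared by exactly two triangles (a necessary condition for a closed manifold surface, not a full vertex-manifold test). All function names are hypothetical.

```python
from collections import Counter

def connected_components(faces):
    """Count connected components of a triangle mesh via union-find over vertices."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b, c in faces:
        parent[find(a)] = find(b)
        parent[find(b)] = find(c)
    return len({find(v) for v in list(parent)})

def is_edge_manifold(faces):
    """True if every undirected edge belongs to exactly two triangles."""
    edges = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges[(min(u, v), max(u, v))] += 1
    return all(count == 2 for count in edges.values())

def cc_diff(pred_faces, gt_faces):
    """Absolute difference in component counts between prediction and ground truth."""
    return abs(connected_components(pred_faces) - connected_components(gt_faces))

# Ground truth: the closed surface of one tetrahedron.
gt = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
# Prediction: the same surface plus a detached "floater" on vertices 4..7.
floater = [(4, 5, 6), (4, 5, 7), (4, 6, 7), (5, 6, 7)]
pred = gt + floater

print(cc_diff(pred, gt), is_edge_manifold(pred), is_edge_manifold(gt[:3]))
```

In this toy case the floater raises the component count by one, so the discrepancy is 1, and dropping a face from the closed surface leaves boundary edges that fail the edge check.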
Performance: TSSplat achieves competitive reconstruction accuracy and significantly outperforms baselines on the mesh quality metrics. It shows superior qualitative results in generative tasks, producing smoother yet detailed meshes, especially for thin structures. It also demonstrates significantly better computational performance (faster optimization speed, lower memory usage) compared to NeRF, NeuS, and DMTet-based methods in SDS settings.

Practical Implications:

Provides a computationally efficient, explicit alternative for 3D reconstruction and generation, yielding high-quality manifold meshes directly without complex post-processing.
Its Lagrangian nature and volumetric regularization make it robust for complex topologies and thin structures.
The fixed topology during optimization simplifies texture/material mapping.
Can be integrated into existing generative pipelines (e.g., using SDS loss) as a replacement geometry representation, potentially accelerating training and reducing resource requirements.

Limitations:

The union of spheres does not guarantee topology preservation from the initial state if spheres merge or separate significantly, although it can represent arbitrary final topologies.
