Spurfies: Sparse Surface Reconstruction using Local Geometry Priors

1Max Planck Institute for Informatics, Saarland Informatics Campus, Germany, 2Saarland University, Saarland Informatics Campus, Germany

Spurfies leverages synthetic data to learn local surface priors for surface reconstruction from few images. Our model significantly outperforms previous methods and can be applied to both bounded (DTU dataset, left) and unbounded scenes (Mip-NeRF 360, right).

Abstract

We introduce Spurfies, a novel method for sparse-view surface reconstruction that disentangles appearance and geometry information to utilize local geometry priors trained on synthetic data. Recent research heavily focuses on 3D reconstruction using dense multi-view setups, typically requiring hundreds of images. However, these methods often struggle in few-view scenarios. Existing sparse-view reconstruction techniques often rely on multi-view stereo networks that need to learn joint priors for geometry and appearance from large amounts of data. In contrast, we introduce a neural point representation that disentangles geometry and appearance, allowing us to train a local geometry prior using only a subset of the synthetic ShapeNet dataset. During inference, we utilize this surface prior as an additional constraint for surface and appearance reconstruction from sparse input views via differentiable volume rendering, restricting the space of possible solutions. We validate the effectiveness of our method on the DTU dataset and demonstrate that it outperforms the previous state of the art by 35% in surface quality while achieving competitive novel view synthesis quality. Moreover, in contrast to previous works, our method can be applied to larger, unbounded scenes such as Mip-NeRF 360.
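
To make the rendering constraint concrete, here is a minimal PyTorch sketch of SDF-based differentiable volume rendering along a single ray. It uses a generic logistic SDF-to-density conversion in the spirit of NeuS/VolSDF; the function names, the `beta` sharpness parameter, and the exact density formulation are illustrative assumptions and not the formulation used in Spurfies.

```python
import torch

def sdf_to_density(sdf, beta=0.05):
    # Illustrative logistic conversion of signed distance to volume density
    # (assumption: similar in spirit to NeuS/VolSDF, not Spurfies' exact form).
    return torch.sigmoid(-sdf / beta) / beta

def render_ray(sdf_vals, colors, deltas, beta=0.05):
    """Composite colors along one ray from SDF samples.

    sdf_vals: (N,) signed distances at the ray samples
    colors:   (N, 3) radiance predicted at the samples
    deltas:   (N,) spacing between consecutive samples
    """
    density = sdf_to_density(sdf_vals, beta)
    alpha = 1.0 - torch.exp(-density * deltas)                   # per-sample opacity
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:1]), 1.0 - alpha + 1e-10])[:-1], dim=0
    )                                                            # accumulated transmittance
    weights = alpha * trans                                      # rendering weights
    rgb = (weights[:, None] * colors).sum(dim=0)                 # expected ray color
    return rgb, weights
```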

Method


Method overview: 1) Preprocess: given a sparse set of input views, we make use of DUSt3R to predict points \(\mathcal{P}\). 2) Representation: the points serve as the basis for a neural point representation that stores disentangled features \(\mathbf{f}^a\), \(\mathbf{f}^g\) for appearance and geometry on each point. 3) Local Prior (top): we learn a local geometry prior \(G_\textnormal{LP}\) & \(G_\textnormal{REG}\) over a subset of shapes from the synthetic ShapeNet dataset by optimizing it to predict ground-truth SDF values. 4) Spurfies (bottom): we make use of the prior for surface reconstruction from sparse images, optimizing only the latent codes \(\mathbf{f}^a, \mathbf{f}^g\) and the color MLPs \(A_\textnormal{LP}\) & \(A_\textnormal{REG}\) to reconstruct images via volume rendering.
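
As a rough illustration of such a neural point representation, the following PyTorch sketch stores a geometry code on each DUSt3R point and predicts the SDF at a query location by aggregating the codes of its k nearest neighbors. The network width, neighborhood size, and inverse-distance aggregation are illustrative assumptions standing in for \(G_\textnormal{LP}\), not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class NeuralPointSDF(nn.Module):
    """Illustrative neural point field: each point stores a geometry code f_g;
    an MLP (standing in for G_LP) predicts the SDF at a query location from
    the codes of its k nearest neural points."""

    def __init__(self, points, feat_dim=32, k=8):
        super().__init__()
        self.register_buffer("points", points)               # (P, 3), e.g. from DUSt3R
        self.f_g = nn.Parameter(torch.randn(points.shape[0], feat_dim) * 0.01)
        self.k = k
        self.sdf_mlp = nn.Sequential(                         # stand-in for G_LP
            nn.Linear(feat_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, x):                                     # x: (Q, 3) query points
        d = torch.cdist(x, self.points)                       # (Q, P) pairwise distances
        dist, idx = d.topk(self.k, dim=1, largest=False)      # k nearest neighbors
        w = 1.0 / (dist + 1e-8)
        w = w / w.sum(dim=1, keepdim=True)                    # inverse-distance weights
        feats = self.f_g[idx]                                 # (Q, k, F) neighbor codes
        rel = x[:, None, :] - self.points[idx]                # (Q, k, 3) relative offsets
        agg = (w[..., None] * feats).sum(dim=1)               # (Q, F) aggregated code
        offset = (w[..., None] * rel).sum(dim=1)              # (Q, 3)
        return self.sdf_mlp(torch.cat([agg, offset], dim=-1)) # (Q, 1) predicted SDF
```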

Results

DTU


Qualitative mesh reconstruction comparison on DTU. Compared to previous state-of-the-art sparse-view methods, our reconstructions show superior completeness in regions with less view overlap. Our closest competitor is NeuSurf, which also reconstructs high-quality surfaces on the object-centric DTU scenes.


Qualitative comparison of mesh reconstruction against point-based mesh reconstruction methods. In contrast to our approach, point-based methods often show missing areas, even when initialized with DUSt3R point clouds.


Points sampled from the reconstructed meshes for a few scans from the DTU dataset.

Mip-NeRF 360


Qualitative mesh reconstruction comparison on Mip-NeRF 360. While NeuSurf also produces good results on object-centric scenes, it fails on larger, unbounded scenes. In contrast, our local prior generalizes well to Mip-NeRF 360. Please refer to the appendix for additional results.

Analysis


Learned geometry latent codes visualized via PCA. We observe similar features for points with the same surface orientation.

We extend our analysis to demonstrate the potential of our optimized geometry latent codes for point cloud clustering. The figure below illustrates this capability using the Stanford bunny model, where we present six distinct clusters derived from these codes.


Clustering of the optimized geometry latent codes into six orientation-based groups. The geometry latent codes add local descriptive information to point clouds.
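
For readers who want to reproduce a similar analysis, below is a minimal scikit-learn sketch: the first three PCA components of the latent codes are mapped to RGB colors for visualization, and k-means groups the points into six clusters. The library choice and hyper-parameters are assumptions; the exact visualization and clustering procedure used here is not prescribed by this sketch.

```python
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def visualize_latent_codes(codes, n_clusters=6):
    """codes: (P, F) array of optimized per-point geometry latent codes.

    Returns per-point RGB colors from the first three PCA components and a
    cluster label per point (six clusters, matching the grouping shown above).
    """
    comps = PCA(n_components=3).fit_transform(codes)             # (P, 3)
    # scale the components to [0, 1] so they can be used directly as colors
    colors = (comps - comps.min(0)) / (comps.max(0) - comps.min(0) + 1e-8)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(codes)
    return colors, labels
```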