FPRF: Feed-Forward Photorealistic Style Transfer of Large-Scale 3D Neural Radiance Fields (2401.05516v1)
Abstract: We present FPRF, a feed-forward photorealistic style transfer method for large-scale 3D neural radiance fields. FPRF stylizes large-scale 3D scenes with arbitrary, multiple style reference images without additional optimization while preserving multi-view appearance consistency. Prior methods required tedious per-style or per-scene optimization and were limited to small-scale 3D scenes. FPRF efficiently stylizes large-scale 3D scenes by introducing a style-decomposed 3D neural radiance field, which inherits AdaIN's feed-forward stylization machinery and thereby supports arbitrary style reference images. Furthermore, FPRF supports multi-reference stylization through semantic correspondence matching and local AdaIN, which gives users diverse control over 3D scene styles. FPRF also preserves multi-view consistency by applying the semantic matching and style transfer processes directly to features queried in 3D space. In experiments, we demonstrate that FPRF achieves favorable photorealistic quality in 3D scene stylization for large-scale scenes with diverse reference images. Project page: https://kim-geonu.github.io/FPRF/
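The two mechanisms named in the abstract, AdaIN and a semantically matched "local" AdaIN, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `adain` and `local_adain`, the flat `(N, C)` feature layout, the cosine-similarity matching rule, and the `style_regions` structure are all assumptions made for illustration. Only the AdaIN statistic alignment itself (normalize content features per channel, then rescale with the style mean and standard deviation) follows the standard definition.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Standard AdaIN on flat feature sets.

    content: (N, C) features, e.g. queried at N 3D points (assumed layout).
    style:   (M, C) features extracted from a style reference.
    Returns content features whose per-channel mean/std match the style's.
    """
    c_mean, c_std = content.mean(axis=0), content.std(axis=0)
    s_mean, s_std = style.mean(axis=0), style.std(axis=0)
    return (content - c_mean) / (c_std + eps) * s_std + s_mean


def local_adain(content, content_sem, style_regions, eps=1e-5):
    """Hypothetical 'local AdaIN': match each point to a style region, then
    apply AdaIN separately within each matched group.

    content:       (N, C) content features.
    content_sem:   (N, D) semantic features of the same N points.
    style_regions: list of (semantic_vector (D,), style_features (M, C)).
    Matching by cosine similarity is an assumption of this sketch.
    """
    protos = np.stack([sem for sem, _ in style_regions])  # (K, D)
    a = content_sem / (np.linalg.norm(content_sem, axis=1, keepdims=True) + eps)
    b = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + eps)
    match = (a @ b.T).argmax(axis=1)  # index of best-matching region per point

    out = content.copy()
    for k, (_, feats) in enumerate(style_regions):
        mask = match == k
        if mask.any():
            out[mask] = adain(content[mask], feats, eps)
    return out
```

Because both functions operate on features indexed by 3D point rather than by pixel, the same point receives the same stylized feature from every viewpoint, which is the intuition behind the multi-view consistency claim.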
Authors:
- Kim Youwang
- Tae-Hyun Oh
- Geonu Kim