Research Objective
To estimate the shape and reflectance properties of an object from a single image acquired 'in the wild', enabling applications in 3D design, image editing, and augmented reality.
Research Results
The proposed method estimates arbitrary shape and a spatially-varying BRDF (SVBRDF) from a single mobile phone image, outperforming previous methods. Key innovations include a cascade network for iterative refinement, a global illumination rendering layer, and a large-scale synthetic dataset. Future work could address the remaining limitations through better data augmentation and explicit handling of interreflections, and could extend the method to material editing and augmented reality.
Limitations
The network does not handle improperly exposed images well (e.g., saturation caused by the flash), may not fully model long-range interreflections due to the limits of image-space CNNs, and its roughness predictions can be inconsistent for the same material. Depth-prediction errors are higher than normal-prediction errors, and category-specific biases in the shape datasets may affect generalization.
1:Experimental Design and Method Selection:
A deep convolutional neural network (CNN) framework is designed with a cascade structure for iterative refinement and a global illumination rendering layer. The method uses a physically-motivated approach to regress shape (depth and surface normals) and SVBRDF (diffuse albedo and specular roughness) from a single image.
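The cascade idea can be sketched in plain Python. This is a minimal illustration, not the authors' architecture: NumPy arithmetic stands in for the CNN stages, and the diffuse shading in `render` is a crude placeholder for the paper's global illumination rendering layer. Function names (`render`, `stage`, `cascade`) are illustrative.

```python
import numpy as np

def render(albedo, normal, rough):
    # Placeholder for the rendering layer: a crude diffuse shading term
    # under a single head-on light (the real layer models global illumination).
    light = np.array([0.0, 0.0, 1.0])
    shading = np.clip(normal @ light, 0.0, None)[..., None]
    return albedo * shading

def stage(image, albedo, normal, rough):
    # Stand-in for one CNN cascade stage: it sees the input image plus the
    # previous estimates' re-rendering, and refines toward the residual.
    rendered = render(albedo, normal, rough)
    err = image - rendered
    return albedo + 0.5 * err, normal, rough

def cascade(image, n_stages=3):
    # Initialize with flat albedo, front-facing normals, medium roughness,
    # then refine iteratively, as in the paper's cascade structure.
    h, w, _ = image.shape
    albedo = np.full((h, w, 3), 0.5)
    normal = np.tile(np.array([0.0, 0.0, 1.0]), (h, w, 1))
    rough = np.full((h, w, 1), 0.5)
    for _ in range(n_stages):
        albedo, normal, rough = stage(image, albedo, normal, rough)
    return albedo, normal, rough
```

The key design point this sketch preserves is that each stage conditions on a re-rendering of the current estimates, so the rendering layer provides a physically-motivated feedback signal for refinement.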
2:Sample Selection and Data Sources:
A large-scale synthetic dataset of 216,000 images is generated using procedurally generated shapes combined with SVBRDFs from the Adobe Stock material dataset (688 materials) and environment maps from the Laval Indoor HDR dataset. Real data is captured with an iPhone X (flash enabled) in indoor environments.
3:List of Experimental Equipment and Materials:
iPhone X for image capture, the Adobe Lightroom app for linear image capture, GPU-accelerated rendering with NVIDIA OptiX for synthetic data generation, and standard computing hardware for training and inference.
4:Experimental Procedures and Operational Workflow:
Images are captured under flash plus environment illumination. The networks are trained sequentially: first the global illumination prediction network (GINet), then the cascade stages for shape and SVBRDF estimation. Training uses the Adam optimizer with stage-specific learning rates and batch sizes. At inference, the input image is passed through the cascade network to output shape and reflectance parameters.
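The per-network optimization step described above can be sketched as a standard Adam update. This is a textbook implementation in NumPy for illustration; the hyperparameter values shown are Adam's common defaults, not necessarily the paper's exact learning rates, and the training-order list is a schematic of the sequential schedule, not the authors' code.

```python
import numpy as np

# Schematic of the sequential training schedule: GINet first, then cascade stages.
TRAINING_ORDER = ["GINet", "cascade_stage_1", "cascade_stage_2"]

def adam_step(param, grad, state, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update: biased first/second moment estimates with
    # bias correction, as used when training each sub-network.
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)
```

In practice each entry in `TRAINING_ORDER` would be trained to convergence with its own Adam state before the next network's training begins.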
5:Data Analysis Methods:
Quantitative evaluation uses the L2 loss on the predicted parameters (albedo, normals, roughness, depth, environment map) and on image reconstruction. Qualitative analysis includes visual comparisons and renderings under novel lighting conditions.
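The quantitative protocol reduces to per-parameter L2 errors; a minimal sketch follows. The masking convention (a binary mask excluding background pixels, same shape as the prediction) and the function names are assumptions for illustration.

```python
import numpy as np

def l2_loss(pred, gt, mask=None):
    # Mean squared error; an optional binary mask (same shape as pred)
    # restricts the average to valid object pixels.
    diff = (pred - gt) ** 2
    if mask is not None:
        return (diff * mask).sum() / np.maximum(mask.sum(), 1)
    return diff.mean()

def evaluate(preds, gts, mask):
    # Per-parameter L2 errors (albedo, normal, roughness, depth, env map),
    # matching the quantitative protocol described above.
    return {k: l2_loss(preds[k], gts[k], mask) for k in preds}
```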