Research Objective
To infer albedo, shape, and illumination from a single full-body human image for realistic relighting, specifically accounting for light occlusion in the spherical harmonics formulation to avoid unnaturally bright concave regions.
Research Findings
The paper presents the first method for occlusion-aware relighting of full-body human images using CNNs to infer light transport maps directly. It achieves more realistic results than previous techniques by accounting for light occlusion, even with a small and carefully prepared dataset. The approach demonstrates that CNNs can learn geometric information from silhouettes, and it enables efficient relighting through dot-product calculations. Future work could extend to more physically accurate models and higher-quality datasets.
Limitations
The method only handles diffuse albedo, not specular components, due to dataset limitations. It may fail with conditions dissimilar to the training data, such as harsh illuminations or unusual lights. The use of second-order spherical harmonics might not capture high-frequency signals from occlusion effectively. Self-supervised learning was not successful due to the high degrees of freedom in light transport maps.
1: Experimental Design and Method Selection:
The methodology involves supervised learning using convolutional neural networks (CNNs) to infer albedo maps, light transport maps (encoding occlusion as nine spherical harmonics coefficients per pixel), and illumination from single human images. The design is based on the spherical harmonics lighting formulation with occlusion consideration, inspired by precomputed radiance transfer techniques.
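The occlusion-aware spherical harmonics formulation reduces relighting to per-pixel dot products: each pixel of the light transport map stores nine second-order SH coefficients (encoding visibility as well as the cosine term), and shading is the dot product of that vector with the SH coefficients of the illumination. A minimal sketch, assuming this data layout (array names such as `transport_map` are illustrative, not from the paper):

```python
import numpy as np

def relight(albedo, transport_map, light_coeffs):
    """Relight a human image via per-pixel dot products.

    albedo:        (H, W, 3) diffuse albedo map
    transport_map: (H, W, 9) light transport map; nine second-order SH
                   coefficients per pixel, encoding occlusion (visibility)
                   in addition to the cosine term
    light_coeffs:  (9, 3) SH coefficients of the environment illumination
    """
    # Per-pixel shading: dot product of transport and light per color channel.
    shading = np.einsum('hwk,kc->hwc', transport_map, light_coeffs)
    # Final image: diffuse albedo modulated by shading.
    return albedo * shading

# Toy usage: constant albedo lit by the ambient (first) SH band only.
H, W = 4, 4
albedo = np.full((H, W, 3), 0.5)
transport = np.zeros((H, W, 9))
transport[..., 0] = 1.0
light = np.zeros((9, 3))
light[0] = 1.0
img = relight(albedo, transport, light)  # every pixel equals 0.5
```

Because illumination enters only through this dot product, swapping in new SH light coefficients relights the image without re-running the network.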
2: Sample Selection and Data Sources:
A synthetic human image dataset is created using scanned 3D human figures from the BUFF dataset (74 models) and commercial sources (271 models), totaling 345 models split into 276 for training and 69 for testing. An illumination dataset is derived from the Laval Indoor HDR dataset, processed to 40 training and 10 testing illuminations.
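The 276/69 partition corresponds to an 80/20 split of the 345 models. A minimal sketch of such a split (the function and seed are illustrative, not the authors' procedure):

```python
import random

def split_models(model_ids, train_frac=0.8, seed=0):
    """Shuffle model IDs and split them into train/test subsets."""
    ids = list(model_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for a reproducible split
    n_train = round(len(ids) * train_frac)
    return ids[:n_train], ids[n_train:]

train_ids, test_ids = split_models(range(345))
# 345 models at an 80/20 split -> 276 training, 69 testing
```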
3: List of Experimental Equipment and Materials:
Equipment includes a PC with NVIDIA GeForce GTX 1080 Ti GPUs. Software uses Python and the Chainer library. Materials comprise 3D human models and environment maps.
4: Experimental Procedures and Operational Workflow:
Images are rendered at 1024x1024 resolution with aligned front-facing figures. CNNs are trained using the Adam optimizer with a learning rate of 0.0002 and batch size 1 for 60 epochs. Inference involves feeding masked images into the network to output albedo, light transport, and light maps, followed by relighting via dot products.
5: Data Analysis Methods:
Quantitative evaluation uses RMSE and SSIM metrics for shading, transport, normal, ambient occlusion, light, and albedo components. Qualitative comparisons are made with alternative methods like SfSNet and SfSNet-AO.
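The two metrics above can be sketched as follows. RMSE is straightforward; for SSIM, a single-window (global-statistics) variant is shown here as a simplified stand-in for the windowed SSIM usually reported (the constants follow the standard SSIM defaults; function names are illustrative):

```python
import numpy as np

def rmse(pred, gt):
    """Root-mean-square error over all pixels."""
    return float(np.sqrt(np.mean((pred - gt) ** 2)))

def global_ssim(pred, gt, data_range=1.0):
    """SSIM computed from global image statistics (no sliding window) --
    a simplified approximation of the usual windowed SSIM."""
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = pred.mean(), gt.mean()
    var_x, var_y = pred.var(), gt.var()
    cov = ((pred - mu_x) * (gt - mu_y)).mean()
    return float((2 * mu_x * mu_y + c1) * (2 * cov + c2)
                 / ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))

# Sanity check: identical images give RMSE 0 and SSIM 1.
img = np.random.default_rng(0).random((8, 8))
err = rmse(img, img)        # 0.0
sim = global_ssim(img, img)  # 1.0
```

In practice, lower RMSE and higher SSIM against the ground-truth renderings indicate better recovery of each component (shading, transport, normal, ambient occlusion, light, albedo).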