Research Objective
To improve the accuracy and robustness of 3D human pose estimation from depth images by extending the Random Tree Walk (RTW) algorithm with multiple hypotheses and stochastic optimization, so that it can handle poses absent from the training data and fails less often under occlusion and high complexity.
Research Findings
By combining multiple hypotheses with stochastic optimization, the proposed method significantly improves pose estimation accuracy, especially for agile joints such as wrists and ankles. It achieves a higher mean average precision (mAP) than the original RTW, and the best results come from combining iterative closest point (ICP) with a genetic algorithm (GA). Future work should focus on refining the cost function and the body model to obtain more precise estimates.
Limitations
The sphere model only approximates the human body and may not capture fine details, so the optimization stagnates after a few iterations. Inaccuracies in the cost function and the model may limit further refinement. The approach also needs multiple hypotheses and several iterations, which increases computational cost: it runs at approximately 35 fps on a single CPU thread, which may be too slow for applications that require higher frame rates.
1: Experimental Design and Method Selection:
The approach combines the discriminative Random Tree Walk (RTW) method with generative optimization techniques, including iterative closest point (ICP) and a genetic algorithm (GA), to generate and refine multiple pose hypotheses. The RTW is used for initial hypothesis generation, and optimization is performed to minimize a cost function based on a sphere-based human body model.
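The summary does not give the exact form of the cost function, so the following is only a minimal sketch of what a data-alignment term over a sphere-based body model could look like. The function name `data_term`, the two-sphere model, and all numeric values are assumptions made for illustration, not the authors' definitions.

```python
import math

def data_term(points, spheres):
    """Mean squared distance from each depth point to the nearest sphere surface.

    points  : list of (x, y, z) tuples in metres
    spheres : list of ((cx, cy, cz), radius) tuples (hypothetical body model)
    """
    total = 0.0
    for p in points:
        # Unsigned distance from the point to the closest sphere surface.
        nearest = min(abs(math.dist(p, c) - r) for c, r in spheres)
        total += nearest ** 2
    return total / len(points)

# Two spheres standing in for a torso and a head (illustrative values only).
model = [((0.0, 0.0, 2.0), 0.20), ((0.0, 0.35, 2.0), 0.10)]
on_surface = [(0.0, 0.0, 1.8), (0.0, 0.35, 1.9)]
shifted = [(x + 0.05, y, z) for x, y, z in on_surface]

print(data_term(on_surface, model))  # points lying on the sphere surfaces
print(data_term(shifted, model))     # the same points moved sideways by 5 cm
```

A pose hypothesis that places the spheres where the depth points actually lie yields a cost near zero, and the cost grows as the model drifts away from the data, which is the property an optimizer can exploit.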
2: Sample Selection and Data Sources:
The dataset contains 26,372 depth images with ground-truth joint positions, split into 80% for training the RTW forests and 20% for testing. The images have a resolution of 320×240 pixels and include challenging scenes with self-occlusions.
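As a rough illustration of the 80/20 split, the sketch below partitions frame indices with a seeded shuffle. Whether the original split was random or made along recording sequences is not stated in the summary, so the shuffle, the seed, and the helper name `split_indices` are assumptions.

```python
import random

def split_indices(n_samples, train_tenths=8, seed=0):
    """Shuffle sample indices and split them train/test using integer arithmetic."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * train_tenths // 10
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(26372)
print(len(train_idx), len(test_idx))  # 21097 5275
```

Because 80% of 26,372 is not a whole number, the training share is rounded down here, giving 21,097 training and 5,275 test images.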
3: List of Experimental Equipment and Materials:
A depth camera for capturing images and a standard desktop CPU (e.g., Intel i5) for processing. No specific equipment models or brands are given.
4: Experimental Procedures and Operational Workflow:
Hypotheses are generated from five sources: RTW with randomly shifted start positions, RTW initialized from the last frame's pose, RTW with contributions from individual trees, Kalman filter predictions, and the last frame's pose. These hypotheses are refined iteratively using ICP and GA, and the best hypothesis is selected as the final pose.
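The generate-refine-select loop can be sketched as follows, under heavy simplifications. The RTW outputs are replaced by jittered placeholder guesses, the Kalman filter by a plain constant-velocity extrapolation, the ICP step is omitted, and the GA is a small elitist variant. All names (`predict_constant_velocity`, `refine`, `cost`) and values are hypothetical and not taken from the paper.

```python
import random

def predict_constant_velocity(prev_pose, last_pose):
    """Extrapolate one frame ahead; a simplified stand-in for a Kalman prediction."""
    return [2.0 * b - a for a, b in zip(prev_pose, last_pose)]

def refine(hypotheses, cost, generations=30, sigma=0.02, seed=0):
    """Tiny elitist genetic algorithm over pose vectors (lists of floats)."""
    rng = random.Random(seed)
    population = [list(h) for h in hypotheses]
    size = len(population)
    for _ in range(generations):
        population.sort(key=cost)
        elite = population[: max(2, size // 2)]  # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0.0, sigma) for pair in zip(a, b)]
            children.append(child)
        population = elite + children
    return min(population, key=cost)

# Toy setup: a "pose" is a single 3D joint position in metres.
true_pose = [0.12, 0.48, 2.00]

def cost(pose):
    """Placeholder cost: squared distance to the (normally unknown) true pose."""
    return sum((p - t) ** 2 for p, t in zip(pose, true_pose))

prev_pose, last_pose = [0.00, 0.50, 2.00], [0.05, 0.50, 2.00]
rtw_guess = [0.30, 0.40, 2.10]  # placeholder for an RTW estimate
jitter = random.Random(1)
hypotheses = [
    [v + jitter.gauss(0.0, 0.05) for v in rtw_guess],  # stand-ins for the
    [v + jitter.gauss(0.0, 0.05) for v in rtw_guess],  # three RTW-based
    [v + jitter.gauss(0.0, 0.05) for v in rtw_guess],  # hypothesis sources
    predict_constant_velocity(prev_pose, last_pose),   # motion prediction
    list(last_pose),                                   # last frame's pose
]

best = refine(hypotheses, cost)
print(min(cost(h) for h in hypotheses), cost(best))
```

Because the elite individuals survive each generation unchanged, the best cost in the population never increases, so the selected pose is at least as good as the best initial hypothesis under the chosen cost.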
5: Data Analysis Methods:
Performance is evaluated using mean average precision (mAP), where a joint is considered correctly estimated if its distance to the true position is less than 10 cm. The cost function includes terms for data alignment, silhouette matching, collision avoidance, and symmetry.
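The 10 cm criterion can be turned into a small evaluation routine. The sketch below assumes that precision is computed per joint as the fraction of test frames in which that joint is within the threshold, and that mAP is the mean of those per-joint values. The summary does not spell out this aggregation, so treat it as one plausible reading. The function name and the toy data are illustrative.

```python
import math

def mean_average_precision(predicted, truth, threshold=0.10):
    """Return (mAP, per-joint precision).

    predicted, truth: list over frames of lists over joints of (x, y, z) in metres.
    A joint counts as correct when its error is strictly below `threshold`.
    """
    n_joints = len(truth[0])
    per_joint = []
    for j in range(n_joints):
        hits = sum(
            math.dist(pred[j], gt[j]) < threshold
            for pred, gt in zip(predicted, truth)
        )
        per_joint.append(hits / len(truth))
    return sum(per_joint) / n_joints, per_joint

# Two frames, two joints; ground truth at the origin for simplicity.
truth = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
         [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
predicted = [[(0.05, 0.0, 0.0), (0.20, 0.0, 0.0)],   # joint 1 misses by 20 cm
             [(0.02, 0.0, 0.0), (0.08, 0.0, 0.0)]]   # both within 10 cm

m_ap, per_joint = mean_average_precision(predicted, truth)
print(per_joint, m_ap)  # [1.0, 0.5] 0.75
```

Reporting the per-joint values alongside the mean makes it easy to see which joints, such as wrists and ankles, account for most of the errors.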