Research Objective
To present a hardware implementation of the Gaussian Mixture Model (GMM) algorithm for background modelling and foreground object segmentation that can handle ultra-high resolution video streams (up to 4K at 60 fps) in real time, addressing memory bandwidth constraints and evaluating performance on FPGA and GPU platforms.
Research Findings
The FPGA implementation of the GMM algorithm processes 4K video at 60 fps in real time and outperforms the GPU implementations in both performance and energy efficiency. The proposed adaptations (grayscale input and common background models) reduce memory requirements at the cost of an acceptable loss in accuracy. Future work should focus on improving memory bandwidth utilization and on exploring lossless compression as a further optimization.
Research Limitations
The memory bandwidth of the FPGA platform limits the achievable throughput: only 57% of the theoretical maximum transfer rate was achieved. The use of fixed-point arithmetic in hardware may give slightly worse results than the simulation. The implementation is specific to the ZCU102 board and may require adaptation for other hardware platforms. Accuracy drops when simplified models are used (e.g., grayscale input or common background models), with misclassification rates of up to 11%.
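To make the bandwidth limitation concrete, the following back-of-the-envelope sketch estimates the memory traffic generated by reading and writing back every background model once per frame. Only the 60 fps rate and the 57% efficiency figure come from the text above. The frame size of 3840x2160, the 144-bit model size, and the `pixels_per_model` sharing factor are illustrative assumptions, and the function names are hypothetical.

```python
def model_traffic_bytes_per_s(width, height, fps, model_bits, pixels_per_model=1):
    """Bytes per second needed to read and write back each model once per frame.

    model_bits and pixels_per_model are assumptions for illustration; the
    source does not state the actual model width or grouping.
    """
    models_per_frame = (width * height) / pixels_per_model
    bytes_per_model = model_bits / 8
    # Factor 2: each model is read from RAM and the updated model written back.
    return models_per_frame * bytes_per_model * 2 * fps


def usable_bandwidth(peak_bytes_per_s, efficiency=0.57):
    """Scale a theoretical peak by the measured efficiency (57% in the text)."""
    return peak_bytes_per_s * efficiency


if __name__ == "__main__":
    # Assumed: 3840x2160 frames, 144 bits per model (hypothetical layout).
    separate = model_traffic_bytes_per_s(3840, 2160, 60, 144)
    shared = model_traffic_bytes_per_s(3840, 2160, 60, 144, pixels_per_model=4)
    print(f"one model per pixel:    {separate / 1e9:.2f} GB/s")
    print(f"one model per 4 pixels: {shared / 1e9:.2f} GB/s")
```

Comparing either figure with `usable_bandwidth(peak)` for a given memory interface shows why reducing the model size or sharing models between pixels matters; the peak value itself has to be taken from the board documentation.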
1:Experimental Design and Method Selection:
The study involves designing and implementing the GMM algorithm in hardware (FPGA) and software (GPU using CUDA) to process 4K video streams. Theoretical models include the GMM algorithm with adaptations for high resolution, such as using grayscale images and common background models for neighboring pixels to reduce memory usage.
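As a rough illustration of the per-pixel work involved, the sketch below updates a grayscale Gaussian mixture for a single pixel in the style of the classic Stauffer-Grimson formulation. It is not the authors' hardware design: the number of components, the learning rate `alpha`, the 2.5-sigma match test, the background ratio, the initial variance, and the variance floor are all assumed values, and the floating-point arithmetic stands in for the fixed-point arithmetic used on the FPGA.

```python
import math


def new_model(first_value, k=3, init_var=225.0):
    """Create k components [weight, mean, variance]; only the first is active."""
    model = [[1.0, float(first_value), init_var]]
    model += [[0.0, 0.0, init_var] for _ in range(k - 1)]
    return model


def background_count(model, bg_ratio):
    """Number of leading components whose cumulative weight exceeds bg_ratio."""
    total = 0.0
    for i, comp in enumerate(model):
        total += comp[0]
        if total > bg_ratio:
            return i + 1
    return len(model)


def update_pixel(x, model, alpha=0.01, match_sigmas=2.5, bg_ratio=0.7,
                 init_var=225.0, init_weight=0.05, min_var=4.0):
    """Classify gray value x, then update the model in place.

    Returns True if x is foreground. The model is kept sorted by
    weight / sigma, most probable background component first.
    """
    # Find the first component within match_sigmas standard deviations of x.
    matched = None
    for i, (w, mu, var) in enumerate(model):
        if (x - mu) ** 2 <= (match_sigmas ** 2) * var:
            matched = i
            break

    # Foreground if nothing matched or the match is not a background component.
    n_bg = background_count(model, bg_ratio)
    foreground = matched is None or matched >= n_bg

    if matched is None:
        # Replace the lowest-ranked component with one centred on x.
        model[-1] = [init_weight, float(x), init_var]
    else:
        for i, comp in enumerate(model):
            hit = 1.0 if i == matched else 0.0
            comp[0] += alpha * (hit - comp[0])
        d = x - model[matched][1]
        # Simplification (assumption): use alpha as the mean/variance rate.
        model[matched][1] += alpha * d
        new_var = model[matched][2] + alpha * (d * d - model[matched][2])
        model[matched][2] = max(new_var, min_var)

    # Renormalise the weights and restore the weight / sigma ordering.
    total = sum(comp[0] for comp in model)
    for comp in model:
        comp[0] /= total
    model.sort(key=lambda comp: comp[0] / math.sqrt(comp[2]), reverse=True)
    return foreground
```

In this sketch a model is just a Python list, so sharing one background model between neighboring pixels would amount to several pixel coordinates referring to the same list. How the pixels are actually grouped is not stated in this summary.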
2:Sample Selection and Data Sources:
Ultra-high resolution video sequences recorded with a 4K camera were used for testing.
3:List of Experimental Equipment and Materials:
ZCU102 development board with a Xilinx Zynq UltraScale+ MPSoC device, NVIDIA GeForce GTX 1050m and GTX 1080 GPUs, HDMI 2.0 source (PC computer), 4K monitor.
4:Experimental Procedures and Operational Workflow:
The video signal is received via HDMI, processed by the GMM module (with variants for separate or common background models), and output to a display. The system includes custom AXI memory controllers for synchronization with external RAM. Classification results (TP, TN, FP, FN) are compared against the OpenCV implementation.
5:Data Analysis Methods:
Evaluation using accuracy metrics (true positive, true negative, false positive, false negative rates) and performance measurements (frames per second, power consumption, resource utilization on FPGA and GPU).
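The accuracy metrics can be computed from a pair of binary foreground masks, for example the mask under test and a reference mask. The sketch below is a minimal version that assumes masks are flat sequences of 0/1 values and that each rate is expressed as a fraction of all pixels; the summary does not say how the rates were normalised, so that choice and the function name are assumptions.

```python
def confusion_rates(predicted, reference):
    """Return TP/TN/FP/FN rates for two equal-length 0/1 masks (1 = foreground).

    Rates are fractions of all pixels (an assumed normalisation);
    "wrong" is the combined FP + FN rate.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("masks must be non-empty and of equal length")

    tp = tn = fp = fn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1
        elif p and not r:
            fp += 1
        elif not p and r:
            fn += 1
        else:
            tn += 1

    n = len(predicted)
    return {
        "tp": tp / n,
        "tn": tn / n,
        "fp": fp / n,
        "fn": fn / n,
        "wrong": (fp + fn) / n,
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 0, 0, 0, 1, 1, 0, 0]
    print(confusion_rates(predicted, reference))
```

A two-dimensional mask can be flattened row by row before calling the function.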