Research Objective
To demonstrate the feasibility of using dynamic partial reconfiguration (DPR) to time-share an FPGA among multiple realtime computer vision pipelines, achieving useful performance despite high reconfiguration time.
Research Findings
The paper demonstrates the feasibility of using DPR for time-sharing realtime computer vision pipelines, achieving useful frame rates (up to 60 fps for 720p and up to 30 fps for 1080p) on a Xilinx ZC706 board. The developed optimizations (overlapping reconfiguration with processing, amortizing reconfiguration cost over multi-frame bundles, a configurable interconnect, and downsampling) are essential to overcome reconfiguration time limitations. Future work could focus on improving DPR speeds and reducing resource overhead.
Limitations
The high reconfiguration time of DPR on current FPGAs limits performance, especially for higher-resolution video streams (e.g., 1080p) and when multiple RPs need reconfiguration. The framework requires non-trivial infrastructure logic resources, and downsampling may be necessary to maintain realtime performance, reducing effective frame rates.
1. Experimental Design and Method Selection
The methodology involves designing an FPGA runtime framework that uses dynamic partial reconfiguration (DPR) to time-share the FPGA fabric among multiple computer vision pipelines. The framework includes a static region and reconfigurable partitions (RPs), with optimizations such as overlapping reconfiguration and processing, round-robin scheduling with multi-frame bundles, a configurable streaming interconnect, and downsampling video streams.
2. Sample Selection and Data Sources
The experiments use video streams from a camera at 720p and 1080p resolutions, both at 60 fps. Vision modules include edge detection, color-based object tracking, template tracking, corner detection, blob detection, Gaussian blur, and background subtraction.
3. List of Experimental Equipment and Materials
Equipment includes a Xilinx ZC706 development board with a Xilinx XC7Z045 Zynq SoC FPGA, a VITA-2000 sensor camera, and standard monitors for HDMI output. Software tools include Xilinx Vivado HLS for module development.
4. Experimental Procedures and Operational Workflow
The runtime manager, running on an embedded ARM processor, handles pipeline instantiation, execution, and time-sharing. For each pipeline it assigns stages to RPs, reconfigures the RPs via DPR, configures the interconnect and DMA engines, and starts pipeline execution with a staggered start so that processing overlaps with reconfiguration. Performance is evaluated by measuring frames per second (fps) for time-shared pipelines under different conditions.
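The steps above can be sketched as a toy scheduler. This is a minimal illustration, not the paper's implementation: the `Pipeline` and `RuntimeManager` classes, the one-stage-per-RP assignment, and the rule of skipping reconfiguration when an RP already holds the required module are all assumptions made for the example. The sketch only records the actions a manager would take and does not model the staggered start.

```python
from dataclasses import dataclass, field
from itertools import cycle, islice


@dataclass
class Pipeline:
    name: str
    stages: list  # module names in streaming order, e.g. ["blur", "edge"]


@dataclass
class RuntimeManager:
    """Hypothetical manager model: it only logs the actions it would take."""
    num_rps: int
    loaded: dict = field(default_factory=dict)  # RP index -> module currently loaded
    log: list = field(default_factory=list)

    def run_timeslice(self, pipeline, frames_per_slice):
        if len(pipeline.stages) > self.num_rps:
            raise ValueError("pipeline has more stages than available RPs")
        # Step 1: assign stage i to RP i (assumed, simplest possible mapping).
        for rp, module in enumerate(pipeline.stages):
            # Step 2: reconfigure only if the RP holds a different module (assumption).
            if self.loaded.get(rp) != module:
                self.log.append(("reconfigure", rp, module))
                self.loaded[rp] = module
        # Step 3: route the stream through the RPs in order, then set up DMA.
        route = tuple(range(len(pipeline.stages)))
        self.log.append(("configure_interconnect", route))
        self.log.append(("configure_dma", pipeline.name))
        # Step 4: process one bundle of frames for this pipeline.
        self.log.append(("run", pipeline.name, frames_per_slice))

    def round_robin(self, pipelines, frames_per_slice, num_slices):
        # Visit the pipelines cyclically, one timeslice each.
        for pipeline in islice(cycle(pipelines), num_slices):
            self.run_timeslice(pipeline, frames_per_slice)


if __name__ == "__main__":
    manager = RuntimeManager(num_rps=2)
    pipelines = [Pipeline("A", ["blur", "edge"]), Pipeline("B", ["blur", "corner"])]
    manager.round_robin(pipelines, frames_per_slice=4, num_slices=2)
    for entry in manager.log:
        print(entry)
```

In this example the second pipeline shares its first module with the first pipeline, so only one RP is reconfigured on the switch, which is why the number of RPs reconfigured per timeslice is a useful quantity to track.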
5. Data Analysis Methods
Performance is analyzed by quantifying DPR reconfiguration times and achieved fps for various pipeline configurations, using metrics such as the number of RPs reconfigured and the number of frames processed per timeslice (g).
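To make the role of g concrete, the following is a simplified back-of-the-envelope model, not the paper's analysis. It assumes round-robin time-sharing, reconfiguration that is not overlapped with processing, RPs reconfigured one after another, and frames arriving at the camera rate. The function name, its parameters, and the 10 ms per-RP reconfiguration time used in the demo are illustrative placeholders, not measured values.

```python
def per_pipeline_fps(num_pipelines, g, camera_fps, reconfig_s_per_rp, rps_reconfigured):
    """Frames per second each pipeline receives under the assumed model.

    Each timeslice costs (rps_reconfigured * reconfig_s_per_rp) seconds of
    reconfiguration followed by g frame periods of processing.
    """
    frame_time = 1.0 / camera_fps
    slice_time = rps_reconfigured * reconfig_s_per_rp + g * frame_time
    round_time = num_pipelines * slice_time  # one full round-robin pass
    return g / round_time


if __name__ == "__main__":
    # Placeholder reconfiguration time of 10 ms per RP, for illustration only.
    for g in (1, 2, 4, 8):
        fps = per_pipeline_fps(num_pipelines=2, g=g, camera_fps=60,
                               reconfig_s_per_rp=0.010, rps_reconfigured=1)
        print(f"g={g}: {fps:.2f} fps per pipeline")
```

Under these assumptions, a larger g spreads the fixed reconfiguration cost over more frames, so the per-pipeline rate approaches the ideal share of camera_fps / num_pipelines, at the cost of each pipeline waiting longer between its timeslices.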