Research Objective
To overcome the long compression time of the C-DPCM algorithm for hyperspectral images by implementing it in parallel on GPUs using CUDA, with optimizations to reduce processing time while maintaining accuracy.
Research Findings
The parallel implementation of the C-DPCM algorithm on GPUs significantly reduces compression time from hours to seconds with minimal accuracy loss. The optimization strategies, covering shared memory, multi-stream, and multi-GPU techniques, are effective. Assigning classes to GPUs according to their spectral line counts balances the load across devices and reduces overall processing time. This approach enhances the practicality of hyperspectral image compression for real-time applications.
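As a concrete illustration of the line-count-based assignment idea, a simple greedy heuristic can place each class on the GPU with the smallest accumulated line count. The sketch below is an assumption about how such an assignment could look, not the paper's exact procedure; the function and variable names are hypothetical.

```cuda
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical greedy heuristic: assign classes to GPUs so that the total
// number of spectral lines per GPU stays balanced. The names and the
// largest-first ordering are illustrative assumptions, not the paper's code.
std::vector<int> assignClassesToGpus(const std::vector<int>& linesPerClass, int numGpus) {
    std::vector<int> order(linesPerClass.size());
    std::iota(order.begin(), order.end(), 0);
    // Consider the largest classes first so big classes are spread out early.
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return linesPerClass[a] > linesPerClass[b]; });

    std::vector<long long> load(numGpus, 0);
    std::vector<int> gpuOfClass(linesPerClass.size(), 0);
    for (int c : order) {
        // Place the class on the currently least-loaded GPU.
        int g = static_cast<int>(std::min_element(load.begin(), load.end()) - load.begin());
        gpuOfClass[c] = g;
        load[g] += linesPerClass[c];
    }
    return gpuOfClass;
}
```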
Research Limitations
The study is limited to the AVIRIS dataset and specific GPU models (GeForce GTX 1080Ti and TITAN X). The parallel implementation may not fully leverage overlapping computations due to high kernel resource usage, and the multi-GPU optimization relies on heuristic class assignment because finding an optimal assignment is computationally expensive. Floating-point operations introduce minor rounding errors, though the accuracy loss is negligible.
1: Experimental Design and Method Selection:
The study involves a parallel implementation of the C-DPCM algorithm on GPUs using CUDA. Three optimization strategies are employed: shared memory and registers, the multi-stream technique, and the multi-GPU technique. The least-squares method is used for calculating prediction coefficients, with parallel implementations for matrix multiplication, determinant calculation, and matrix inversion.
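To make the shared-memory idea concrete, the sketch below shows a standard tiled matrix-multiplication kernel of the kind that could back the matrix products in the least-squares step; the tile size, kernel name, and row-major layout are assumptions for illustration, not the paper's actual kernel.

```cuda
#include <cuda_runtime.h>

// Illustrative tiled matrix multiplication C = A * B using shared memory.
// A is M x K, B is K x N, C is M x N, all row-major.
#define TILE 16

__global__ void matMulShared(const float* A, const float* B, float* C,
                             int M, int N, int K) {
    __shared__ float sA[TILE][TILE];
    __shared__ float sB[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (K + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Stage one tile of A and one tile of B in shared memory.
        sA[threadIdx.y][threadIdx.x] = (row < M && aCol < K) ? A[row * K + aCol] : 0.0f;
        sB[threadIdx.y][threadIdx.x] = (bRow < K && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();                      // wait until the tile is loaded
        for (int k = 0; k < TILE; ++k)
            acc += sA[threadIdx.y][k] * sB[k][threadIdx.x];
        __syncthreads();                      // wait before overwriting the tile
    }
    if (row < M && col < N)
        C[row * N + col] = acc;               // each thread writes one output element
}
```

With dim3 block(TILE, TILE) and dim3 grid((N + TILE - 1) / TILE, (M + TILE - 1) / TILE), each block computes one TILE x TILE tile of C from operands staged in shared memory.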
2: Sample Selection and Data Sources:
The AVIRIS dataset is used, consisting of twelve hyperspectral images: five 16-bit calibrated images (e.g., aviris_sccal), five 16-bit uncalibrated images (e.g., aviris_scraw), and two 12-bit uncalibrated images (maine_scraw and hawaii_scraw). Each image has 512 lines and 224 bands, with varying samples per line.
3: List of Experimental Equipment and Materials:
GPUs include the GeForce GTX 1080Ti and TITAN X models from NVIDIA, with specifications detailed in Table I. The CPU is an Intel Core i-series processor. Software includes CUDA for parallel computing.
4: Experimental Procedures and Operational Workflow:
The C-DPCM algorithm steps include clustering spectral lines using k-means, calculating prediction coefficients via the least-squares method, computing prediction and residual images, and encoding. The parallel implementations involve designing thread grids, using shared memory and registers for efficiency, and distributing classes across streams or GPUs. Data is transferred asynchronously, and kernel functions handle the computations.
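A minimal sketch of the multi-stream pattern described above: each class's host-to-device copy, kernel launch, and device-to-host copy are issued into one of several streams so that work on different classes can overlap. The kernel body, buffer names, stream count, and launch configuration are illustrative assumptions, not the paper's code.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Placeholder per-class kernel (the real C-DPCM prediction/residual
// computation would go here); in this sketch it simply copies input to output.
__global__ void processClassKernel(const short* in, short* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Hypothetical multi-stream driver: buffers are assumed to be pre-allocated,
// with the host buffers in pinned memory so the copies are truly asynchronous.
void compressClassesMultiStream(const std::vector<const short*>& hostClass,
                                const std::vector<short*>& devClass,
                                const std::vector<short*>& devResidual,
                                const std::vector<short*>& hostResidual,
                                const std::vector<size_t>& classBytes,
                                const std::vector<int>& classElems) {
    const int numStreams = 4;                       // illustrative stream count
    cudaStream_t streams[numStreams];
    for (int s = 0; s < numStreams; ++s) cudaStreamCreate(&streams[s]);

    for (size_t c = 0; c < devClass.size(); ++c) {
        cudaStream_t st = streams[c % numStreams];
        // Asynchronous host-to-device copy of this class's spectral lines.
        cudaMemcpyAsync(devClass[c], hostClass[c], classBytes[c],
                        cudaMemcpyHostToDevice, st);
        // Per-class computation launched into the same stream.
        int threads = 256;
        int blocks = (classElems[c] + threads - 1) / threads;
        processClassKernel<<<blocks, threads, 0, st>>>(devClass[c], devResidual[c],
                                                       classElems[c]);
        // Residuals copied back asynchronously, still in the same stream.
        cudaMemcpyAsync(hostResidual[c], devResidual[c], classBytes[c],
                        cudaMemcpyDeviceToHost, st);
    }
    cudaDeviceSynchronize();                        // wait for all streams to finish
    for (int s = 0; s < numStreams; ++s) cudaStreamDestroy(streams[s]);
}
```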
5: Data Analysis Methods:
Performance is evaluated based on compression time, speedup, bit rate in bits per pixel per band (bpppb), and an accuracy comparison between the serial and parallel implementations. Profiling tools such as nvprof are used to measure CUDA metrics such as occupancy and memory throughput.
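For reference, the two headline figures reduce to simple formulas: speedup is the serial time divided by the parallel time, and bpppb is the compressed size in bits divided by lines × samples × bands. A small helper sketch (the function names are illustrative, not from the paper):

```cuda
#include <cstdint>

// bpppb: compressed bits divided by (number of pixels x number of bands).
double bitsPerPixelPerBand(std::uint64_t compressedBytes, int lines, int samples, int bands) {
    return (8.0 * compressedBytes) /
           (static_cast<double>(lines) * samples * bands);
}

// Speedup of the parallel implementation over the serial one.
double speedup(double serialSeconds, double parallelSeconds) {
    return serialSeconds / parallelSeconds;
}
```

Occupancy and throughput figures are typically collected with nvprof, for example via nvprof --metrics achieved_occupancy followed by the target executable (executable name is a placeholder here).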