[Lecture Notes in Computer Science] Euro-Par 2018: Parallel Processing Workshops Volume 11339 (Euro-Par 2018 International Workshops, Turin, Italy, August 27-28, 2018, Revised Selected Papers) || Modeling and Optimizing Data Transfer in GPU-Accelerated Optical Coherence Tomography

DOI: 10.1007/978-3-030-10549-5_33 Publication year: 2019 Updated: 2025-09-23 15:23:52

Abstract: Signal processing of optical coherence tomography (OCT) has become a bottleneck for using OCT in medical and industrial applications. Recently, GPUs have gained importance as compute devices for achieving video frame rates of 25 frames/s. Therefore, we develop a CUDA implementation of an OCT signal processing chain: we focus on reformulating the signal processing algorithms in terms of high-performance libraries like CUBLAS and CUFFT. Additionally, we use NVIDIA’s stream concept to overlap computations and data transfers. Performance results are presented for two Pascal GPUs and validated with a derived performance model. The model gives an estimate of the overall execution time of the OCT signal processing chain, including compute and transfer times.
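The benefit of overlapping computations and data transfers can be illustrated with a textbook pipeline estimate. The sketch below is not the performance model derived in the paper; it assumes the data is split into equally sized chunks, that host-to-device copy, compute, and device-to-host copy can all run concurrently, and that the overlap is perfect. The function names and the millisecond values are hypothetical.

```python
def sync_time(n_chunks, t_h2d, t_compute, t_d2h):
    """Total time when each chunk is copied in, processed, and copied out back to back."""
    return n_chunks * (t_h2d + t_compute + t_d2h)


def async_time(n_chunks, t_h2d, t_compute, t_d2h):
    """Idealized three-stage pipeline: one fill/drain pass, then the slowest stage dominates."""
    stages = (t_h2d, t_compute, t_d2h)
    return sum(stages) + (n_chunks - 1) * max(stages)


# Hypothetical per-chunk stage times in milliseconds (not measurements from the paper).
t_sync = sync_time(4, 2.0, 3.0, 2.0)    # 28.0
t_async = async_time(4, 2.0, 3.0, 2.0)  # 16.0
print(t_sync, t_async)
```

Under these assumptions the overlapped variant approaches the cost of the slowest stage alone as the number of chunks grows, which is why the achievable gain depends on how balanced transfer and compute times are.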

Keywords: GPU, OCT, CUDA, Performance model

Authors: Tobias Schrödter, David Pallasch, Sandra Wienke, Robert Schmitt, Matthias S. Müller

AI analysis


Research objective

To develop a GPU-accelerated implementation of the OCT signal processing chain using CUDA to achieve video frame rates of 25 frames/s, and to derive a performance model for predicting execution times including compute and data transfer aspects.

Research results

The GPU implementation achieved significant speed-ups, with OCT sync providing 5-7 times faster processing than the serial CPU version and OCT async further improving this to 8-21 times faster. The performance model accurately predicted runtimes with deviations below 15%, enabling estimates for processing larger data sets up to 2048×24576 px at video rates. This work demonstrates the feasibility of using GPUs for real-time OCT signal processing in medical and industrial applications, with potential for further optimizations such as direct GPU display to eliminate copy operations.
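To put the video-rate target in perspective, the following back-of-the-envelope calculation gives the per-frame time budget at 25 frames/s and the one-way data rate needed for the largest data set mentioned above. The sample width of 2 bytes per pixel is an assumption made for illustration, not a figure from the paper, and `required_throughput` is a hypothetical helper.

```python
def required_throughput(width_px, height_px, bytes_per_px, frames_per_s):
    """Bytes per second that must be moved in one direction to sustain the frame rate."""
    return width_px * height_px * bytes_per_px * frames_per_s


frame_budget_ms = 1000 / 25  # 40.0 ms available per frame at 25 frames/s

# bytes_per_px = 2 is an assumption (16-bit raw samples), not taken from the paper.
rate = required_throughput(2048, 24576, 2, 25)
print(frame_budget_ms, rate / 1e9)  # budget in ms, required rate in GB/s
```

With these assumed values the input alone requires roughly 2.5 GB/s, so transfer time takes a noticeable share of the 40 ms budget and has to be modeled alongside compute time.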

Limitations

The study is limited to consumer GPUs with Pascal architecture, and the performance model may underestimate runtimes for small data sizes due to using maximum bandwidth parameters. The model assumes contiguous memory access patterns, which may not fully capture scattered accesses in some kernels. Future work could extend to other GPU architectures and optimize for volumetric 3D-scans.
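One plausible reason a maximum-bandwidth model underestimates small transfers is a fixed per-transfer overhead that peak bandwidth does not capture. The sketch below illustrates that effect with a simple latency-plus-bandwidth formula; it is an illustrative assumption, not the model from the paper, and the latency and bandwidth values are hypothetical.

```python
def transfer_time(size_bytes, peak_bw, latency=0.0):
    """Transfer time in seconds: fixed latency plus size divided by peak bandwidth."""
    return latency + size_bytes / peak_bw


def missed_fraction(size_bytes, peak_bw, latency):
    """Share of the latency-aware time that a peak-bandwidth-only estimate leaves out."""
    return latency / transfer_time(size_bytes, peak_bw, latency)


PEAK_BW = 12e9   # bytes/s, hypothetical value
LATENCY = 1e-5   # seconds per transfer, hypothetical value

small = missed_fraction(4 * 1024, PEAK_BW, LATENCY)          # 4 KiB transfer
large = missed_fraction(64 * 1024 * 1024, PEAK_BW, LATENCY)  # 64 MiB transfer
print(small, large)
```

For the small transfer almost all of the time is latency, so a bandwidth-only estimate is far too optimistic; for the large transfer the missed share is negligible.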


Equipment list: name and model (manufacturer), function

GeForce GTX Titan X, Pascal architecture (NVIDIA): Used as a GPU for accelerating OCT signal processing computations and data transfers.
GeForce GTX 1050 Ti, Pascal architecture (NVIDIA): Used as a GPU for accelerating OCT signal processing computations and data transfers.

CUDA 8.0 (NVIDIA): Parallel computing platform and API used for GPU programming and implementation of signal processing algorithms.
CUBLAS (NVIDIA): Library for BLAS operations on GPUs, used to optimize matrix computations in the signal processing chain.
CUFFT (NVIDIA): Library for Fast Fourier Transform operations on GPUs, used to replace FFTW in the CPU implementation.
FFTW: Library for Fast Fourier Transform operations, used in the CPU reference implementation.
OpenBLAS: Open-source BLAS library used in CPU-parallel implementations for comparison.
MKL (Intel): Math Kernel Library used with the Intel Compiler for CPU-parallel implementations.


