-
[IEEE 2018 International Conference on Simulation of Semiconductor Processes and Devices (SISPAD) - Austin, TX, USA (2018.9.24-2018.9.26)] Dynamical space partitioning for acceleration of parallelized lattice kinetic Monte Carlo simulations
Abstract: A new dynamical space partitioning method is presented in a parallelized lattice kinetic Monte Carlo (kMC) simulator to overcome the loss of parallel efficiency found in other parallelized kMC simulators. The dynamical partitioning of the simulation cell allows better load balancing across all threads, hence reducing time-consuming events during the simulation. The new method is evaluated against both hypothetical and real cases. In both cases, minimal differences between serial and parallelized simulations are found. In real cases, other code optimizations may be needed to further improve the parallel efficiency.
Keywords: shared memory, stochastic, nano-scale, FinFET, kMC, parallelization efficiency, OpenMP
Updated: 2025-09-23 15:22:29
-
[IEEE 2018 IEEE 38th Central America and Panama Convention (CONCAPAN XXXVIII) - San Salvador, El Salvador (2018.11.7-2018.11.9)] Parallelization of a Magnetohydrodynamics Model for Plasma Simulation
Abstract: Plasma simulations are inherently complex due to the numerous and intricate processes that naturally occur to matter in this state. Computer simulations and visualizations of plasma help researchers and scientists understand the physics that takes place in it. We have developed a parallel implementation of an application used to simulate and visualize the process of convection in plasma cells. This application implements a magnetohydrodynamics (MHD) approach to plasma modeling by numerically solving a fourth-order two-dimensional differential scheme. Results of experimentation with our parallel implementation are presented and analyzed. We managed to speed up the program by a factor of nearly 42× after parallelizing the code with OpenMP and using 128 cores on our Intel Xeon Phi KNL server. We also achieved almost linear scaling of the execution time when increasing the size of the spatial and temporal domains.
Keywords: plasma, physics simulation, OpenMP, MHD
Updated: 2025-09-23 15:22:29
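To put the quoted speedup in context, parallel efficiency is commonly defined as speedup divided by the number of cores. The sketch below applies that standard definition to the two figures given in the abstract (about 42× on 128 cores); the function name is illustrative and not taken from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup divided by the number of cores used."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: roughly 42x speedup on 128 cores.
efficiency = parallel_efficiency(42.0, 128)
print(f"parallel efficiency: {efficiency:.3f}")
```

Under this definition the reported run uses each core at roughly one third of its ideal contribution.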
-
Parallel K-Means Clustering for Brain Cancer Detection Using Hyperspectral Images
Abstract: The precise delineation of brain cancer is a crucial task during surgery. There are several techniques employed during surgical procedures to guide neurosurgeons in the tumor resection. However, hyperspectral imaging (HSI) is a promising non-invasive and non-ionizing imaging technique that could improve and complement the currently used methods. The HypErspectraL Imaging Cancer Detection (HELICoiD) European project has addressed the development of a methodology for tumor tissue detection and delineation exploiting HSI techniques. In this approach, the K-means algorithm emerged in the delimitation of tumor borders, which is of crucial importance. The main drawback is the computational complexity of this algorithm. This paper describes the development of the K-means clustering algorithm on different parallel architectures, in order to provide real-time processing during surgical procedures. This algorithm will generate an unsupervised segmentation map that, combined with a supervised classification map, will offer guidance to the neurosurgeon during the tumor resection task. We present parallel K-means clustering based on the OpenMP, CUDA and OpenCL paradigms. These algorithms have been validated using an in-vivo hyperspectral human brain image database. Experimental results show that the CUDA version can achieve a speed-up of ~150× with respect to sequential processing. The remarkable result obtained in this paper makes possible the development of a real-time classification system.
Keywords: unsupervised clustering, brain cancer detection, Graphics Processing Units (GPUs), OpenCL, CUDA, K-means, OpenMP, hyperspectral imaging
Updated: 2025-09-23 15:21:01
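For readers unfamiliar with K-means, the following is a minimal serial sketch of the textbook Lloyd iteration, not the HELICoiD implementation. The six three-band "pixels" are made-up toy data, and seeding the centroids with the first k pixels is a simplification chosen for reproducibility. The assignment step treats every pixel independently, which is the part a parallel version would typically distribute across threads.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels, centroids):
    """Label each pixel with the index of its nearest centroid.

    Every pixel is handled independently of the others.
    """
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in pixels
    ]


def update(pixels, labels, centroids):
    """Move each centroid to the mean of the pixels assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(pixels, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep the centroid of an empty cluster
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def kmeans(pixels, k, max_iter=100):
    """Lloyd iteration, seeded with the first k pixels (a simplification)."""
    centroids = list(pixels[:k])
    labels = assign(pixels, centroids)
    for _ in range(max_iter):
        centroids = update(pixels, labels, centroids)
        new_labels = assign(pixels, centroids)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids


# Hypothetical toy data: six pixels with three spectral bands each.
pixels = [
    (0.10, 0.12, 0.11), (0.11, 0.10, 0.13), (0.12, 0.11, 0.10),
    (0.90, 0.88, 0.91), (0.89, 0.92, 0.90), (0.91, 0.90, 0.89),
]
labels, centroids = kmeans(pixels, k=2)
print(labels)
```

The resulting label list plays the role of the unsupervised segmentation map mentioned in the abstract, here for six pixels rather than a full hyperspectral cube.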
-
[IEEE 2019 International Workshop on Fiber Optics in Access Networks (FOAN) - Sarajevo, Bosnia and Herzegovina (2019.9.2-2019.9.4)] How Dubai is Becoming a Smart City?
Abstract: Quantitative retrieval is a growing area in remote sensing due to the rapid development of remote instruments and retrieval algorithms. The aerosol optical depth (AOD) is a significant optical property of aerosol which is involved in further applications such as the atmospheric correction of remotely sensed surface features, monitoring of volcanic eruptions or forest fires, air quality, and even climate changes from satellite data. The AOD retrieval can be computationally expensive as a result of huge amounts of remote sensing data and compute-intensive algorithms. In this paper, we present two efficient implementations of an AOD retrieval algorithm from the moderate resolution imaging spectroradiometer (MODIS) satellite data. Here, we have employed two different high performance computing architectures: multicore processors and a graphics processing unit (GPU). The compute unified device architecture C (CUDA-C) has been used for the GPU implementation on NVIDIA's graphics cards, and open multiprocessing (OpenMP) for thread parallelism in the multicore implementation. For the GPU accelerator, we observe a maximal overall speedup of 68.x for the studied data, whereas the multicore processor achieves a reasonable 7.x speedup. Additionally, for the largest benchmark input dataset, the GPU implementation also shows a great advantage in terms of energy efficiency, with an overall consumption of 3.15 kJ compared to 58.09 kJ on a CPU with 1 thread and 38.39 kJ with 16 threads. Furthermore, the retrieval accuracy of all implementations has been checked and analyzed. Altogether, using the GPU accelerator shows great advantages for AOD retrieval in both performance and energy efficiency metrics. Nevertheless, the multicore processor is easier to program for the majority of today's programmers. Our work exploits the parallel implementations, the performance, and the energy efficiency features of GPU accelerators and multicore processors. With this paper, we attempt to give suggestions to geoscientists who need efficient desktop solutions.
Keywords: High performance computing (HPC), OpenMP, quantitative remote sensing retrieval, graphics processing unit (GPU), Aerosol optical depth (AOD)
Updated: 2025-09-19 17:13:59
-
Fast parallel beam propagation method based on multi-core and many-core architectures
Abstract: In this paper, a fast technique is suggested to accelerate the computation of the fast Fourier transform beam propagation method (FFT-BPM). The FFT-BPM is executed on a graphical processing unit (GPU) and on a multi-core processor to compute a huge number of propagation steps faster than a traditional single-core CPU. Further, the suggested technique is implemented in a parallel approach, which is faster than the serial implementation. The achieved speedup factors are 150× and 5× using the GPU and an eight-core multiprocessor, respectively, with respect to the single-core processing time of 215 steps of an input Gaussian beam. In order to verify the speed of the proposed technique, the possibility of using the BPM to compute the time-consuming Goos–Hänchen shift calculation is proposed. Further, the propagation of a single-mode light beam in an optical fiber for 5 × 10^6 steps is executed using the GPU. It is found that the speedup of the studied mode is equal to 168× over a single-core calculation.
Keywords: Compute unified device architecture (CUDA), Parallel computing, Beam propagation method, Open multiprocessors (OpenMP), Graphical processing unit (GPU)
Updated: 2025-09-10 09:29:36
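As background on what one FFT-BPM propagation step involves, here is a minimal one-dimensional, serial sketch of the generic split-step scheme under the paraxial approximation. It is not the paper's GPU or multi-core code: the grid size, wavelength, beam waist, step length, and the uniform index profile are all hypothetical values, and a small pure-Python radix-2 FFT stands in for an optimized FFT library.

```python
import cmath
import math


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(values):
    """Inverse FFT with 1/n normalization."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]


def bpm_step(field, delta_n, dx, dz, wavelength, n0):
    """Advance the field by dz: diffraction in k-space, then an index phase screen."""
    n = len(field)
    k0 = 2 * math.pi / wavelength
    k = k0 * n0
    spectrum = fft(field)
    for j in range(n):
        m = j if j < n // 2 else j - n  # signed spatial-frequency index
        kx = 2 * math.pi * m / (n * dx)
        spectrum[j] *= cmath.exp(-1j * kx * kx * dz / (2 * k))  # paraxial diffraction
    field = ifft(spectrum)
    return [e * cmath.exp(1j * k0 * dn * dz) for e, dn in zip(field, delta_n)]


def power(field, dx):
    return sum(abs(e) ** 2 for e in field) * dx


def rms_width(field, xs):
    total = sum(abs(e) ** 2 for e in field)
    return math.sqrt(sum(x * x * abs(e) ** 2 for x, e in zip(xs, field)) / total)


# Hypothetical parameters (lengths in micrometres).
N, DX, DZ, WAVELENGTH, N0, W0, STEPS = 128, 0.5, 1.0, 1.0, 1.0, 4.0, 100
xs = [(j - N // 2) * DX for j in range(N)]
field = [complex(math.exp(-(x / W0) ** 2)) for x in xs]  # input Gaussian beam
delta_n = [0.0] * N  # uniform medium: no index perturbation

power_in, width_in = power(field, DX), rms_width(field, xs)
for _ in range(STEPS):
    field = bpm_step(field, delta_n, DX, DZ, WAVELENGTH, N0)
power_out, width_out = power(field, DX), rms_width(field, xs)
print(f"power ratio: {power_out / power_in:.6f}, width: {width_in:.2f} -> {width_out:.2f}")
```

Each step costs two FFTs plus element-wise multiplications, so the total work grows linearly with the number of propagation steps; this is why FFT throughput dominates runs with very large step counts.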