-
[IEEE IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - Valencia (2018.7.22-2018.7.27)] IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - The Effect of Focal Loss in Semantic Segmentation of High Resolution Aerial Image
Abstract: The semantic segmentation of High Resolution Remote Sensing (HRRS) images is a fundamental research area in earth observation. Convolutional Neural Networks (CNNs), which have achieved superior performance in computer vision tasks, are also useful for semantic segmentation of HRRS images. In this work, focal loss is used instead of cross-entropy loss when training the CNN, to handle the imbalance in the training data. To evaluate the effect of focal loss, we train SegNet and FCN with focal loss and confirm an improvement in accuracy on the ISPRS 2D Semantic Labeling Contest dataset, especially when γ is 0.5 in SegNet.
Keywords: deep learning, focal loss, semantic segmentation, CNN
Updated: 2025-09-04 15:30:14
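To make the role of the focusing parameter concrete, here is a minimal sketch of the standard focal loss, FL(p_t) = -(1 - p_t)^γ log(p_t), evaluated per pixel. It is an illustration only, not the authors' training code: the function names are hypothetical, the optional class-weighting factor α is omitted, and the default value 0.5 simply mirrors the setting highlighted in the abstract.

```python
import math


def focal_loss(p_true: float, gamma: float = 0.5, eps: float = 1e-12) -> float:
    """Focal loss for one pixel, given the predicted probability of its true class.

    With gamma = 0 this reduces to ordinary cross-entropy, -log(p_true).
    """
    p = min(max(p_true, eps), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p) ** gamma) * math.log(p)


def mean_focal_loss(p_true_per_pixel, gamma: float = 0.5) -> float:
    """Average focal loss over a flat list of per-pixel true-class probabilities."""
    if not p_true_per_pixel:
        raise ValueError("need at least one pixel")
    return sum(focal_loss(p, gamma) for p in p_true_per_pixel) / len(p_true_per_pixel)


if __name__ == "__main__":
    # A well-classified pixel (p = 0.9) is down-weighted more than a hard one (p = 0.2).
    for p in (0.9, 0.2):
        print(p, focal_loss(p, gamma=0.0), focal_loss(p, gamma=0.5))
```

Because the modulating factor (1 - p_t)^γ shrinks the contribution of pixels that are already classified confidently, frequent and easy classes dominate the total loss less than they do under plain cross-entropy.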
-
[IEEE 2018 26th European Signal Processing Conference (EUSIPCO) - Roma, Italy (2018.9.3-2018.9.7)] 2018 26th European Signal Processing Conference (EUSIPCO) - Information Fusion based Quality Enhancement for 3D Stereo Images Using CNN
Abstract: Stereo images provide users with a vivid 3D viewing experience. Supported by per-view depth maps, 3D stereo images can be used to generate any required intermediate view between the given left and right stereo views. However, 3D stereo images lead to higher transmission and storage costs than single-view images. Based on the binocular suppression theory, mixed-quality stereo images can alleviate this problem by employing different compression ratios on the two views. This causes noticeable visual artifacts when a high compression ratio is adopted and limits free-viewpoint applications. Hence, the low quality image at the receiver side needs to be enhanced to match the high quality one. To address this problem, we propose an end-to-end fully Convolutional Neural Network (CNN) for enhancing the low quality images in quality-asymmetric stereo images by exploiting inter-view correlation. The proposed network achieves a PSNR gain of up to 4.6 dB and 3.88 dB over ordinary JPEG for QF10 and QF20, respectively, and an improvement of up to 2.37 dB and 2.05 dB over the state-of-the-art CNN-based results for QF10 and QF20, respectively.
Keywords: asymmetric compression, 3D stereo images, CNN, information fusion, quality enhancement
Updated: 2025-09-04 15:30:14
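The gains above are reported in PSNR. As a reminder of what that metric measures, the following sketch computes PSNR = 10 log10(peak² / MSE) for two flattened images and converts a dB gain into the equivalent reduction in mean squared error. The function names and the 8-bit peak value of 255 are assumptions made for illustration; the abstract does not describe the evaluation code.

```python
import math


def psnr(reference, distorted, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    print(psnr([52, 60, 61], [50, 63, 60]))
    print(mse_ratio_for_gain(4.6))  # MSE reduction matching a 4.6 dB gain
```

Under this definition, a 4.6 dB gain corresponds to the mean squared error dropping by a factor of roughly 2.9.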
-
[IEEE 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT) - Coimbatore, India (2018.3.1-2018.3.3)] 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT) - Processing Retinal Images to Discover Diseases
Abstract: The retina of the human eye consists of billions of photosensitive cells (rods and cones) and other nerve cells that acquire and arrange visual information. It is a thin tissue layer on the inside back wall of the eye. Three of the most common retinal diseases are Diabetic Retinopathy, Glaucoma, and Cataract. The world is presently experiencing an epidemic of Diabetic Retinopathy (DR). Current predictions estimate that the number affected will double, from the current 170 million to an estimated 367 million by 2030. We propose a system wherein we extract the blood vessels of the retina to detect eye diseases. Manually extracting the blood vessels of the human retina is a time-consuming task, so automating this process makes the work easier to carry out. This paper aims to design and implement deep convolutional neural networks to identify the presence of an exudate, and thereby classify the image as Diabetic Retinopathy, Glaucoma, and/or Cataract.
Keywords: Computer vision, Glaucoma, Diabetic Retinopathy, Cataract, Convolutional Neural Networks, Retinal disease detection, CNN
Updated: 2025-09-04 15:30:14
-
[IEEE 2018 IEEE International Conference on Intelligent Transportation Systems (ITSC) - Maui, HI, USA (2018.11.4-2018.11.7)] 2018 21st International Conference on Intelligent Transportation Systems (ITSC) - Machine Learning-based Stereo Vision Algorithm for Surround View Fisheye Cameras
Abstract: Recently, automated emergency brake systems for pedestrians have been commercialized. However, they cannot detect crossing pedestrians when the vehicle turns at intersections because the field of view is not wide enough. We therefore propose to turn a surround view camera system, which is becoming popular, into a stereo vision system that is robust for pedestrian recognition. However, conventional stereo camera technologies cannot be applied because of the fisheye cameras and uncalibrated camera poses. We have therefore created a new method that uses machine learning to absorb the difference in pedestrian appearance between cameras for stereo vision. The stereo matching method between image patches in each camera image was designed by combining D-Brief and NCC with SVM. It achieved good generalization performance compared with the individual conventional algorithms. Furthermore, features of the point cloud reconstructed from the stereo pairs are used with Random Forest to discriminate pedestrians. The algorithm was evaluated on actual camera images of crossing pedestrians at various intersections, and a pedestrian tracking rate of 96.0% with high position detection accuracy was achieved. The results were compared with Faster R-CNN as the best pattern recognition technique, and our proposed method showed better detection performance.
Keywords: NCC, automated emergency brake systems, machine learning, SVM, Faster R-CNN, stereo vision, pedestrian detection, D-Brief, Random Forest, surround view camera system
Updated: 2025-09-04 15:30:14
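One of the patch-matching ingredients named above is NCC, assumed here to mean normalized cross-correlation. The sketch below shows the common zero-mean variant on flattened grayscale patches. It covers only this one similarity score, not the D-Brief descriptor or the SVM that the paper combines it with, and the function name is hypothetical.

```python
import math


def ncc(patch_a, patch_b) -> float:
    """Zero-mean normalized cross-correlation of two equal-length flattened patches.

    Returns a value in [-1, 1]; 1 means identical up to brightness gain and offset.
    """
    n = len(patch_a)
    if n == 0 or n != len(patch_b):
        raise ValueError("patches must be non-empty and of equal length")
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denominator == 0:
        return 0.0  # a flat patch carries no texture to correlate
    return numerator / denominator


if __name__ == "__main__":
    left = [10, 20, 30, 40]
    right = [25, 45, 65, 85]  # same pattern under a different gain and offset
    print(ncc(left, right))
```

Because the score is invariant to gain and offset, it tolerates exposure differences between cameras, which is one reason it is a common choice for matching patches across views.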
-
ChipNet: Real-Time LiDAR Processing for Drivable Region Segmentation on an FPGA
Abstract: This paper presents a field-programmable gate array (FPGA) design of a segmentation algorithm based on a convolutional neural network (CNN) that can process light detection and ranging (LiDAR) data in real time. For autonomous vehicles, drivable region segmentation is an essential step that sets up the static constraints for planning tasks. Traditional drivable region segmentation algorithms are mostly developed on camera data, so their performance is susceptible to lighting conditions and the quality of road markings. LiDAR sensors can obtain the 3D geometry information of the vehicle surroundings with high precision. However, it is a computational challenge to process a large amount of LiDAR data in real time. In this paper, a CNN model is proposed and trained to perform semantic segmentation using data from the LiDAR sensor. An efficient hardware architecture is proposed and implemented on an FPGA that can process each LiDAR scan in 17.59 ms, which is much faster than previous work. Evaluated on the Ford and KITTI road detection benchmarks, the proposed solution achieves both high accuracy and real-time processing speed.
Keywords: Autonomous vehicle, LiDAR, FPGA, road segmentation, CNN
Updated: 2025-09-04 15:30:14
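The real-time claim can be checked with simple arithmetic: a per-scan latency fixes the maximum scan rate the pipeline can sustain. The sketch below does that conversion. The 10 Hz sensor rate in the example is an assumption for illustration only; the abstract does not state the scan rate of the LiDAR used.

```python
def scans_per_second(latency_ms: float) -> float:
    """Maximum sustained scan rate for a given per-scan processing latency."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def meets_real_time(latency_ms: float, sensor_rate_hz: float) -> bool:
    """True if each scan is processed before the next one arrives."""
    return latency_ms <= 1000.0 / sensor_rate_hz


if __name__ == "__main__":
    latency = 17.59  # ms per scan, as reported in the abstract
    print(scans_per_second(latency))
    print(meets_real_time(latency, sensor_rate_hz=10.0))  # 10 Hz is an assumed rate
```

At 17.59 ms per scan the design can keep up with roughly 56.9 scans per second, which leaves a wide margin over a sensor that delivers a scan every 100 ms.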
-
[ACM Press the 2nd International Conference - Chengdu, China (2018.06.16-2018.06.18)] Proceedings of the 2nd International Conference on Advances in Image Processing - ICAIP '18 - Design and Implementation of Vehicle Tracking System Based on Depth Learning
Abstract: Vehicle tracking is one of the most challenging tasks in the field of visual tracking. A CNN-based vehicle tracking algorithm is constructed to handle the rapid movement, scale change, and occlusion of vehicles in outdoor environments. The CNN is used to extract feature sets containing positive and negative samples. The output of the CNN is used as the input of a logistic classifier to obtain the vehicle classifier, and a particle filter is used to track the target online. The experimental results show that the deep features extracted by the CNN can effectively distinguish the target from the background and that, combined with the particle filtering algorithm for online tracking, the method has high tracking accuracy and strong robustness. Compared with existing tracking algorithms, the vehicle is tracked better under changes in lighting, vehicle occlusion, and scale changes.
Keywords: Deep learning, Vehicle Detection, Particle filter, Vehicle tracking, CNN
Updated: 2025-09-04 15:30:14
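The interplay of a classifier score and a particle filter can be sketched in one dimension. In the toy below, each particle is a candidate target position, a logistic function turns a stand-in feature into a confidence, and particles are reweighted and resampled by that confidence. Everything here is hypothetical scaffolding: the paper's CNN features, state space, and motion model are not given in the abstract, and `toy_observe` merely imitates a classifier that responds most strongly near position 50.

```python
import math
import random


def logistic_score(feature: float, weight: float = 4.0, bias: float = 0.0) -> float:
    """Logistic classifier confidence in (0, 1) for a scalar feature."""
    return 1.0 / (1.0 + math.exp(-(weight * feature + bias)))


def normalize(weights):
    """Scale non-negative weights so they sum to 1 (uniform if all are zero)."""
    total = sum(weights)
    if total <= 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def estimate(particles, weights) -> float:
    """Weighted mean of the particle positions."""
    return sum(p * w for p, w in zip(particles, weights))


def step(particles, observe, rng, motion_std: float = 2.0):
    """One predict / weight / resample cycle; returns (new_particles, estimate)."""
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]  # random-walk motion
    weights = normalize([observe(p) for p in moved])
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate(moved, weights)


def toy_observe(position: float) -> float:
    """Stand-in for CNN features plus classifier: confidence peaks at position 50."""
    return logistic_score(1.0 - abs(position - 50.0) / 10.0)


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 100.0) for _ in range(200)]
    for _ in range(5):
        particles, position = step(particles, toy_observe, rng)
    print(position)
```

Resampling concentrates particles where the classifier is confident, so the tracker can follow the target even when individual observations are noisy.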
-
[IEEE 2018 25th IEEE International Conference on Image Processing (ICIP) - Athens, Greece (2018.10.7-2018.10.10)] 2018 25th IEEE International Conference on Image Processing (ICIP) - Experimentally Defined Convolutional Neural Network Architecture Variants for Non-Temporal Real-Time Fire Detection
Abstract: In this work we investigate the automatic detection of fire pixel regions in video (or still) imagery within real-time bounds without reliance on temporal scene information. As an extension to prior work in the field, we consider the performance of experimentally defined, reduced complexity deep convolutional neural network architectures for this task. Contrary to contemporary trends in the field, our work illustrates that a maximal accuracy of 0.93 for whole image binary fire detection, with 0.89 accuracy within our superpixel localization framework, can be achieved via a network architecture of significantly reduced complexity. These reduced architectures additionally offer a 3-4 fold increase in computational performance, offering up to 17 fps processing on contemporary hardware independent of temporal information. We show the relative performance achieved against prior work using benchmark datasets to illustrate maximally robust real-time fire region detection.
Keywords: fire detection, non-stationary visual fire detection, simplified CNN, real-time, non-temporal
Updated: 2025-09-04 15:30:14
-
[IEEE 2018 IEEE International Conference on Intelligent Transportation Systems (ITSC) - Maui, HI, USA (2018.11.4-2018.11.7)] 2018 21st International Conference on Intelligent Transportation Systems (ITSC) - Pedestrian-Detection Method based on 1D-CNN during LiDAR Rotation
Abstract: Pedestrian detection in autonomous driving systems is important for preventing accidents involving pedestrians and vehicles. Conventional pedestrian detection methods involve Light Detection and Ranging (LiDAR), which requires clustering points into a cloud before determining whether each point is a pedestrian. Therefore, there may not be sufficient time for an autonomous driving system to ensure safety if a pedestrian and a vehicle are too close to each other. We propose a pedestrian detection method based on a one-dimensional convolutional neural network (1D-CNN) that processes LiDAR waveform data without delay. The proposed method sequentially inputs LiDAR waveform data to the 1D-CNN and determines whether each point belongs to a pedestrian. It is therefore possible to reduce the difference between the detected and actual positions of pedestrians, since our method can be used during LiDAR sensor rotation.
Keywords: autonomous driving, LiDAR, Pedestrian detection, 1D-CNN
Updated: 2025-09-04 15:30:14
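The core operation of a 1D-CNN is a sliding dot product over a sequence, which is what allows samples to be processed as they arrive instead of after a full scan. The sketch below shows a single unpadded, stride-1 filter in batch form and in a streaming form that emits an output as soon as its window is full. It is a bare illustration with a hand-picked kernel and hypothetical names; the paper's network depth, learned weights, and input encoding are not described in the abstract.

```python
from collections import deque


def conv1d_valid(signal, kernel):
    """Sliding dot product (cross-correlation, as in most CNN libraries), no padding."""
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


class StreamingConv1d:
    """Applies the same filter one sample at a time, as data arrives from the sensor."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        self.window = deque(maxlen=len(self.kernel))  # keeps only the latest samples

    def push(self, sample):
        """Add one sample; return the filter output, or None until the window fills."""
        self.window.append(sample)
        if len(self.window) < len(self.kernel):
            return None
        return sum(x * w for x, w in zip(self.window, self.kernel))


if __name__ == "__main__":
    waveform = [1, 2, 3, 4]
    edge_kernel = [1, 0, -1]  # hand-picked for illustration, not a learned filter
    print(conv1d_valid(waveform, edge_kernel))
    stream = StreamingConv1d(edge_kernel)
    print([stream.push(s) for s in waveform])
```

The streaming form yields the same values as the batch form, with a latency of only one kernel width, which reflects the motivation given in the abstract for classifying points during sensor rotation.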
-
[IEEE 2018 IEEE International Conference on Intelligent Transportation Systems (ITSC) - Maui, HI, USA (2018.11.4-2018.11.7)] 2018 21st International Conference on Intelligent Transportation Systems (ITSC) - The TUBS Road User Dataset: A New LiDAR Dataset and its Application to CNN-based Road User Classification for Automated Vehicles
Abstract: We present a novel approach for classifying pre-segmented laser scans of road users, with consideration of real-time capability for applications in automated vehicles. Our classification approach uses 2.5D Convolutional Neural Networks (CNNs) to process range data as well as intensity information retrieved from reflected beams. We do not rely solely on publicly available laser scan datasets, which lack several features, but provide an additional dataset from real-world sensor recordings, annotated by a tracking-based automatic labeling process. We evaluate the classification performance of our CNN for different feature configurations. For training, we use automatically and manually labeled data as well as mixtures with other public datasets. The results show promising classification capabilities. Training with automated labels shows similar results, offering a way to avoid the expense of manual editing.
Keywords: CNN, dataset, road user classification, automated vehicles, LiDAR
Updated: 2025-09-04 15:30:14
-
[IEEE 2018 International Conference on 3D Vision (3DV) - Verona (2018.9.5-2018.9.8)] 2018 International Conference on 3D Vision (3DV) - DeepHPS: End-to-end Estimation of 3D Hand Pose and Shape by Learning from Synthetic Depth
Abstract: Articulated hand pose and shape estimation is an important problem for vision-based applications such as augmented reality and animation. In contrast to existing methods, which optimize only for joint positions, we propose a fully supervised deep network that learns to jointly estimate a full 3D hand mesh representation and pose from a single depth image. To this end, a CNN architecture is employed to estimate parametric representations, i.e., hand pose, bone scales, and complex shape parameters. Then, a novel hand pose and shape layer, embedded inside our deep framework, produces 3D joint positions and the hand mesh. The lack of sufficient training data with varying hand shapes limits the generalized performance of learning-based methods. Also, manually annotating real data is suboptimal. Therefore, we present SynHand5M: a million-scale synthetic dataset with accurate joint annotations, segmentation masks, and mesh files of depth maps. Among model-based learning (hybrid) methods, we show improved results on our dataset and two of the public benchmarks, i.e., NYU and ICVL. Also, by employing a joint training strategy with real and synthetic data, we recover the 3D hand mesh and pose from real images in 3.7 ms.
Keywords: 3D hand pose estimation, hand mesh reconstruction, synthetic dataset, deep learning, CNN
Updated: 2025-09-04 15:30:14