-
Large scale image retrieval with DCNN and local geometrical constraint model
Abstract: Image retrieval, which refers to browsing, searching, and retrieving images of the same scene or object from a large database of digital images, has attracted increasing interest in recent years. This paper proposes a coarse-to-fine method for fast indexing with a Deep Convolutional Neural Network (DCNN) and a Local Geometrical Constraint Model. We first use vector-quantized DCNN feature descriptors and exploit enhanced locality-sensitive hashing (LSH) techniques for fast coarse-grained retrieval. Then, we focus on obtaining high-precision preserved matches for fine-grained retrieval. This is formulated as a maximum likelihood estimation of a Bayesian model with latent variables indicating whether matches in the putative set are inliers or outliers. We impose non-parametric global geometrical constraints on the correspondence using Tikhonov regularizers in a reproducing kernel Hilbert space. To ensure the well-posedness of the problem, we develop a local geometrical constraint that preserves local structures among neighboring feature points and is robust to a large number of outliers. The problem is solved using the Expectation Maximization algorithm. Extensive experiments on real near-duplicate images, for both feature matching and image retrieval, demonstrate that the proposed method outperforms current state-of-the-art methods.
Keywords: Image retrieval, Coarse-to-fine, Local geometrical constraint model, DCNN
Updated on 2025-09-23 15:23:52
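The coarse retrieval stage in the abstract above relies on locality-sensitive hashing. The following is a minimal sketch of one common LSH family (random hyperplane, or sign, hashing), not the enhanced scheme of the paper. The function names, the 8-bit code length, and the toy descriptors are hypothetical choices made for illustration only.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """Pack the signs of the projections onto each hyperplane into one integer."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits = (bits << 1) | (1 if dot >= 0.0 else 0)
    return bits


def build_index(descriptors, hyperplanes):
    """Group image ids into buckets keyed by their hash code."""
    index = {}
    for image_id, vec in descriptors.items():
        index.setdefault(lsh_code(vec, hyperplanes), []).append(image_id)
    return index


def coarse_candidates(query, index, hyperplanes):
    """Return the ids that fall in the same bucket as the query descriptor."""
    return index.get(lsh_code(query, hyperplanes), [])
```

Descriptors that point in similar directions tend to share a bucket, so only those candidates would need to be passed to a slower fine-grained matching step.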
-
LuAG ceramic scintillators for future HEP experiments
Abstract: In this paper, we study the problem of cross-modal retrieval by hashing-based approximate nearest neighbor (ANN) search techniques. Most existing cross-modal hashing work mainly addresses the issue of multi-modal integration complexity using the same mapping and similarity calculation for data from different media types. Nonetheless, this may cause information loss during the mapping process because it overlooks the specifics of each individual modality. In this work, we propose a simple yet effective cross-modal hashing approach, termed Collective Reconstructive Embeddings (CRE), which can simultaneously address the heterogeneity and integration complexity of multi-modal data. To address the heterogeneity challenge, we propose to process heterogeneous types of data using different modality-specific models. Specifically, we model textual data with a cosine-similarity-based reconstructive embedding to alleviate the data sparsity to the greatest extent, while for image data we use the Euclidean distance to characterize the relationships of the projected hash codes. Meanwhile, we unify the projections of text and image to the Hamming space into a common reconstructive embedding through rigid mathematical reformulation, which not only reduces the optimization complexity significantly but also facilitates inter-modal similarity preservation among different modalities. We further incorporate the code balance and uncorrelation criteria into the problem, and devise an efficient iterative algorithm for optimization. Comprehensive experiments on four widely used multimodal benchmarks show that the proposed CRE achieves superior performance compared to the state of the art on several challenging cross-modal tasks.
Keywords: Cross-modal Retrieval, Reconstructive Embeddings, Cross-modal Hashing
Updated on 2025-09-23 15:23:52
-
Optimal pulse width modulation technique combined with stair phase-coding method for absolute phase retrieval with projector defocusing
Abstract: Accurate three-dimensional shape measurement of complex surfaces is important in industrial testing. However, three-dimensional profilometry by conventional sinusoidal fringe projection using a phase-shifting algorithm performs suboptimally because of the nonlinear intensity response of projectors. To overcome this problem, this paper proposes combining an optimized sinusoidal pulse width modulation (SPWM) technique with a stair phase-coding approach to obtain the unwrapped phase. Properly optimized SPWM, together with slight projector defocusing, generates sinusoidal fringe patterns that overcome the undesired harmonics and the nonlinear gamma effect. Two groups of four-step phase-shifting fringe patterns are used. The first group contains four sinusoidal patterns generated by the SPWM technique and is used to determine the wrapped phase. The second group contains four sinusoidal patterns with the codeword embedded into the stair phase, whose stair changes are perfectly aligned with the 2π discontinuities of the sinusoidal fringe phase; it is used to determine the fringe order for phase unwrapping. Moreover, under defocused projection, the fringe order decision becomes less reliable as the frequency of the phase-coding fringe increases. Thus, a self-correction phase unwrapping method is applied for phase retrieval. Experiments were conducted to verify the performance of the proposed method.
Keywords: Phase-shifting, Phase retrieval, Stair phase-coding, Defocusing
Updated on 2025-09-23 15:23:52
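The four-step phase-shifting and fringe-order steps described in the abstract above can be illustrated with the textbook formulas. This is a sketch under an assumed convention, I_k = A + B cos(φ + kπ/2) for k = 0..3, which may differ from the paper's. The function names and the simulated intensities are hypothetical, and the paper's SPWM generation and self-correction steps are not modeled.

```python
import math


def simulate_intensities(phase, offset=0.5, amplitude=0.4):
    """Ideal four-step fringe intensities with phase shifts of k * pi / 2."""
    return [offset + amplitude * math.cos(phase + k * math.pi / 2.0)
            for k in range(4)]


def wrapped_phase(i0, i1, i2, i3):
    """Wrapped phase in (-pi, pi] from four pi/2-shifted intensities.

    Under the assumed convention, i3 - i1 = 2B sin(phi) and
    i0 - i2 = 2B cos(phi), so atan2 recovers phi modulo 2*pi.
    """
    return math.atan2(i3 - i1, i0 - i2)


def unwrap(phi_wrapped, fringe_order):
    """Absolute phase once the integer fringe order is known."""
    return phi_wrapped + 2.0 * math.pi * fringe_order
```

For example, a true phase of 1.0 + 4π yields a wrapped phase near 1.0, and supplying a fringe order of 2 restores the absolute value.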
-
[IEEE 2018 25th IEEE International Conference on Image Processing (ICIP) - Athens, Greece (2018.10.7-2018.10.10)] Image-Based 3D Model Retrieval for Indoor Scenes by Simulating Scene Context
Abstract: We propose a single-image-based 3D model retrieval method for indoor scenes. By simulating the scene context of the input image, our method is able to handle several challenging scenarios featuring cluttered backgrounds and severe occlusions. To use our system, the user only needs to draw a few semantic bounding boxes around the query objects. The proposed approach then retrieves the most similar 3D models from the ShapeNet model repository and aligns them with the corresponding objects automatically. This requires that the 3D models be represented by calibrated view-dependent visual elements learned from the rendered views. With the estimated occlusion relationships, the rendered model images are stacked at the corresponding locations to simulate the scene context. By matching these synthesized scenes against the input image, the most similar 3D models under the approximate poses are retrieved. Moreover, we show that the retrieval time can be significantly reduced using a novel greedy algorithm. Experimental results demonstrate the effectiveness of our proposed method.
Keywords: 3D model retrieval, cluttered background, scene context, occlusion relationship
Updated on 2025-09-23 15:23:52
-
[IEEE IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - Valencia (2018.7.22-2018.7.27)] Significant Wave Height Retrieval from Gaofen-3 Wave Mode Images
Abstract: Significant wave height (Hs) is an important parameter, represented as the integration of directional wave spectra. Although many researchers have extracted Hs directly from SAR images and achieved high retrieval accuracy, those approaches are not suitable for GF-3 SAR data. In this paper, we propose an empirical approach for SAR Hs retrieval, using the cutoff wavelength λc estimated from the real part of image cross spectra obtained from VV-polarized Gaofen-3 (GF-3) wave mode data acquired in different radar beams (called wave-codes). Results using GF-3 wave mode data from January to February 2017 indicate that the bias and RMSE are: 189 wave-code, -0.13 m and 0.57 m; 190 wave-code, -0.07 m and 0.34 m; 193 wave-code, -0.3 m and 0.59 m; 199 wave-code, 0.16 m and 0.68 m; 215 wave-code, 0.2 m and 0.87 m. These results show a relationship between the retrieved Hs and the Hs extracted from WAVEWATCH-III (WW3). However, there is a significant error when the WW3-extracted Hs exceeds 4 m. It seems that the model is not suitable for Hs retrieval under high sea conditions.
Keywords: wave empirical retrieval, GF-3, cutoff wavelength estimation, synthetic aperture radar, wave mode
Updated on 2025-09-23 15:22:29
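The bias and RMSE figures quoted in the abstract above are standard validation statistics. The sketch below shows how such values are typically computed, assuming bias is defined as the mean of (retrieved minus reference); the abstract does not state its sign convention. The function name and the sample values are hypothetical and are not data from the paper.

```python
import math


def bias_and_rmse(retrieved, reference):
    """Return (bias, rmse) of retrieved values against reference values.

    bias = mean(retrieved - reference)            (assumed sign convention)
    rmse = sqrt(mean((retrieved - reference)^2))
    """
    if len(retrieved) != len(reference) or not retrieved:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [r - m for r, m in zip(retrieved, reference)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Hypothetical Hs values in metres, for illustration only.
sar_hs = [1.0, 2.0, 3.0]
model_hs = [1.5, 2.5, 3.5]
bias, rmse = bias_and_rmse(sar_hs, model_hs)
```

A negative bias under this convention means the retrieval underestimates the reference on average.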
-
Image-based 3D model retrieval using manifold learning
Abstract: We propose a new framework for image-based three-dimensional (3D) model retrieval. We first model the query image as a Euclidean point. Then we model all projected views of a 3D model as a symmetric positive definite (SPD) matrix, which is a point on a Riemannian manifold. Thus, image-based 3D model retrieval is reduced to a problem of Euclid-to-Riemann metric learning. To solve this heterogeneous matching problem, we map the Euclidean space and the SPD Riemannian manifold to the same high-dimensional Hilbert space, thus narrowing the large gap between them. Finally, we design an optimization algorithm to learn a metric in this Hilbert space using a kernel trick. Any new image descriptors, such as features from deep learning, can be easily embedded in our framework. Experimental results show the advantages of our approach over state-of-the-art methods for image-based 3D model retrieval.
Keywords: Model retrieval, Metric learning, Hilbert space, Riemannian manifold, Euclidean space
Updated on 2025-09-23 15:22:29
-
Compressive Phase Retrieval Realized by Combining Generalized Approximate Message Passing with Cartoon-Texture Model
Abstract: Generalized approximate message passing (GAMP) can be applied to compressive phase retrieval (CPR) with excellent phase-transition behavior. In this paper, we introduce the cartoon-texture model into the denoising-based phase retrieval GAMP (D-prGAMP) and propose a cartoon-texture-model-based D-prGAMP (C-T D-prGAMP) algorithm. Then, based on experiments and analyses of how the performance of D-prGAMP algorithms varies with iterations, we propose a 2-stage D-prGAMP algorithm, which makes tradeoffs between the C-T D-prGAMP algorithm and general D-prGAMP algorithms. Finally, to address the non-convergence issues of D-prGAMP, we incorporate adaptive damping into 2-stage D-prGAMP and propose the adaptively damped 2-stage D-prGAMP (2-stage ADD-prGAMP) algorithm. Simulation results show that the runtime of 2-stage D-prGAMP is roughly equivalent to that of BM3D-prGAMP, while 2-stage D-prGAMP achieves higher image reconstruction quality. 2-stage ADD-prGAMP requires more reconstruction time than 2-stage D-prGAMP and BM3D-prGAMP, but achieves PSNRs 0.2 to 3 dB higher than those of 2-stage D-prGAMP and 0.3 to 3.1 dB higher than those of BM3D-prGAMP.
Keywords: cartoon-texture model, generalized approximate message passing, adaptive damping, compressive phase retrieval
Updated on 2025-09-23 15:22:29
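The reconstruction quality in the abstract above is reported as PSNR in dB. As a reminder of what that figure measures, here is a minimal sketch of the usual definition, PSNR = 10 log10(peak² / MSE), assuming 8-bit images flattened to lists of pixel values. The function names and the peak value of 255 are assumptions for illustration, not details from the paper.

```python
import math


def mean_squared_error(reference, reconstruction):
    """Mean squared error between two equally sized flat pixel lists."""
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)


def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mean_squared_error(reference, reconstruction)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)
```

Because the scale is logarithmic, a gain of 3 dB corresponds to roughly halving the mean squared error.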
-
Extinction and optical depth retrievals for CALIPSO's Version 4 data release
Abstract: The Cloud–Aerosol Lidar with Orthogonal Polarization (CALIOP) on board the Cloud–Aerosol Lidar Infrared Pathfinder Satellite Observations (CALIPSO) satellite has been making near-global height-resolved measurements of cloud and aerosol layers since mid-June 2006. Version 4.10 (V4) of the CALIOP data products, released in November 2016, introduces extensive upgrades to the algorithms used to retrieve the spatial and optical properties of these layers, and thus there are both obvious and subtle differences between V4 and previous data releases. This paper describes the improvements made to the extinction retrieval algorithms and illustrates the impacts of these changes on the extinction and optical depth estimates reported in the CALIPSO lidar level 2 data products. The lidar ratios for both aerosols and ice clouds are generally higher than in previous data releases, resulting in generally higher extinction coefficients and optical depths in V4. A newly implemented algorithm for retrieving extinction coefficients in opaque layers is described and its impact examined. Precise lidar ratio estimates are also retrieved in these opaque layers. For semi-transparent cirrus clouds, comparisons between CALIOP V4 optical depths and the optical depths reported by MODIS collection 6 show substantial improvements relative to earlier comparisons between CALIOP version 3 and MODIS collection 5.
Keywords: retrieval algorithms, clouds, CALIOP, lidar, optical depth, CALIPSO, aerosols, extinction, version 4
Updated on 2025-09-23 15:22:29
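The abstract above relates extinction coefficients to optical depths. The generic relation is that a layer's optical depth is the vertical integral of its extinction profile, which the sketch below approximates with the trapezoidal rule. This is not the CALIOP retrieval algorithm: multiple scattering and the lidar ratio are ignored, and the function names and sample profile are hypothetical.

```python
import math


def optical_depth(altitudes_km, extinction_per_km):
    """Trapezoidal integral of extinction (km^-1) over ascending altitudes (km)."""
    if len(altitudes_km) != len(extinction_per_km) or len(altitudes_km) < 2:
        raise ValueError("need at least two matching samples")
    tau = 0.0
    for i in range(1, len(altitudes_km)):
        dz = altitudes_km[i] - altitudes_km[i - 1]
        tau += 0.5 * (extinction_per_km[i] + extinction_per_km[i - 1]) * dz
    return tau


def two_way_transmittance(tau):
    """Fraction of signal surviving a round trip through the layer (single scattering)."""
    return math.exp(-2.0 * tau)


# Hypothetical profile: constant 0.1 km^-1 extinction over a 2 km deep layer.
tau = optical_depth([0.0, 0.5, 1.0, 2.0], [0.1, 0.1, 0.1, 0.1])
```

Since extinction enters the integral linearly, higher retrieved extinction coefficients translate directly into higher layer optical depths.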
-
[IEEE 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP) - Aristi Village, Zagorochoria, Greece (2018.6.10-2018.6.12)] MindCamera: Interactive Image Retrieval and Synthesis
Abstract: Composing a realistic picture from one's imagination is difficult for most people. It is not only a complex operation but also a process of creating something from nothing. Therefore, the core of this problem is to provide rich existing materials for stitching. We present an interactive sketch-based image retrieval and synthesis system, MindCamera. Compared with existing methods, it can use images of daily scenes as the dataset, and it proposes a sketch-based scene image retrieval model. Furthermore, MindCamera blends the target object in the gradient domain to avoid visible seams, and it introduces alpha matting to realize real-time foreground object extraction and composition. Experiments verify that our retrieval model has higher precision and provides more reasonable and richer materials for users. Practical usage demonstrates that MindCamera allows the interactive creation of complex images, and its final compositing results are natural and realistic.
Keywords: image fusion, image retrieval, image segmentation
Updated on 2025-09-23 15:22:29
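The abstract above mentions alpha matting for foreground extraction and composition. Once an alpha matte is available, composition follows the standard equation C = αF + (1 − α)B per pixel. The sketch below shows only that final compositing step, with hypothetical function names and RGB tuples. Estimating the matte and the gradient-domain blending used by MindCamera are not covered.

```python
def composite_pixel(foreground, background, alpha):
    """Blend one RGB pixel: alpha * F + (1 - alpha) * B, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))


def composite_image(foreground, background, matte):
    """Blend two equally sized images (lists of rows of RGB tuples) using a matte."""
    return [
        [composite_pixel(f, b, a) for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(foreground, background, matte)
    ]
```

An alpha of 1 keeps the foreground pixel, 0 keeps the background, and fractional values produce the soft edges that make a pasted object look natural.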
-
[IEEE IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - Valencia (2018.7.22-2018.7.27)] Circular Relevance Feedback for Remote Sensing Image Retrieval
Abstract: Relevance feedback (RF) is a popular reranking technique that aims to improve the performance of image retrieval by taking the user's opinions into account. In this paper, we introduce a new RF method, named circular relevance feedback (CRF), to enhance the performance of remote sensing image retrieval (RSIR). Instead of the manual selection used in the common RF method, we adopt active learning (AL) algorithms to select samples from the initial results automatically in each RF iteration. Moreover, to ensure that the selected images are representative and informative enough, we choose different AL algorithms to complete the different RF processes. Finally, the contributions of all AL-driven RF methods are integrated using a circular fusion scheme. The encouraging experimental results on a ground-truth RS image archive illustrate that our CRF is useful for enhancing the performance of RSIR. In addition, our CRF achieves improved performance compared with many existing RF methods.
Keywords: Relevance feedback, remote sensing image retrieval
Updated on 2025-09-23 15:22:29
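The abstract above replaces manual sample selection with active learning in each feedback round. The abstract does not say which AL algorithms are used, so the sketch below substitutes one common strategy, uncertainty sampling, purely as an illustration: it picks the images whose relevance score is closest to 0.5. The function name, the scores, and the 0.5 decision boundary are all assumptions.

```python
def select_uncertain(scores, k, boundary=0.5):
    """Return the k image ids whose relevance score lies closest to the boundary.

    scores maps image id -> estimated probability of relevance in [0, 1].
    Ties keep the insertion order of the mapping (sorted() is stable).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(scores, key=lambda image_id: abs(scores[image_id] - boundary))
    return ranked[:k]


# Hypothetical relevance scores from an initial retrieval round.
initial_scores = {"a": 0.90, "b": 0.52, "c": 0.10, "d": 0.45}
to_label = select_uncertain(initial_scores, k=2)
```

Images the current model is least sure about are the ones whose labels would be most informative for the next round of reranking.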