Research Objective
This work investigates a cross-modal image retrieval system that accepts both text and sketch queries, learning a common embedding space for text, sketches, and images and employing an attention mechanism for multi-object retrieval.
Research Findings
The proposed framework successfully integrates the text and sketch modalities for image retrieval, outperforming state-of-the-art methods in both single-object and multiple-object scenarios. The attention mechanism effectively focuses on the image regions relevant to multi-object queries. Future work will explore more efficient training strategies and the possibility of querying with multiple modalities simultaneously.
Research Limitations
The method is limited by the availability of datasets for training and retrieval, particularly for scenarios involving more than two objects. Sketch-based retrieval performs worse than text-based retrieval because of the larger domain gap between sketches and images.
1: Experimental Design and Method Selection:
The study employs a cross-modal deep network that jointly models the sketch and text input modalities with the image output modality, incorporating an attention mechanism that focuses on different objects within an image; a minimal sketch of such a joint embedding network follows.
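The sketch below illustrates one way such a three-branch joint embedding network could be wired (assuming PyTorch and torchvision; the class name, projection dimension, and pooling choice are illustrative assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class JointEmbeddingNet(nn.Module):
    """Maps images, sketches, and text into one shared embedding space."""

    def __init__(self, embed_dim=256, word_dim=300):
        super().__init__()
        # Separate VGG-16 backbones for natural images and sketches
        # (weights are not shared because the two domains differ).
        self.image_cnn = models.vgg16(weights="IMAGENET1K_V1").features
        self.sketch_cnn = models.vgg16(weights="IMAGENET1K_V1").features
        self.image_proj = nn.Linear(512, embed_dim)
        self.sketch_proj = nn.Linear(512, embed_dim)
        # Text branch: pre-trained word vectors (e.g. word2vec) are fed
        # in as a sequence and summarised by an LSTM.
        self.text_rnn = nn.LSTM(word_dim, embed_dim, batch_first=True)

    def _pool(self, feature_map):
        # Global average pooling over the (batch, 512, H, W) conv features.
        return feature_map.mean(dim=(2, 3))

    def embed_image(self, images):
        return F.normalize(self.image_proj(self._pool(self.image_cnn(images))), dim=-1)

    def embed_sketch(self, sketches):
        return F.normalize(self.sketch_proj(self._pool(self.sketch_cnn(sketches))), dim=-1)

    def embed_text(self, word_vectors):
        # word_vectors: (batch, seq_len, word_dim) pre-computed word2vec embeddings.
        _, (hidden, _) = self.text_rnn(word_vectors)
        return F.normalize(hidden[-1], dim=-1)
```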
2: Sample Selection and Data Sources:
Utilizes the Sketchy dataset for single-object retrieval and a dataset derived from COCO for multiple-object retrieval, with 80% of the images used for training and the remainder for testing.
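A minimal sketch of such an 80/20 split (the ID range and random seed are placeholders, not the paper's actual split):

```python
import random


def split_dataset(image_ids, train_fraction=0.8, seed=0):
    """Shuffle image IDs and split them into train/test subsets."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Example with placeholder COCO-style image IDs.
train_ids, test_ids = split_dataset(range(1000))
```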
3: List of Experimental Equipment and Materials:
Uses a VGG-16 network for sketch and image representations, word2vec for text representation, and an LSTM-based attention model.
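The attention component could look roughly like the sketch below, which scores each spatial cell of a VGG-16 conv feature map against a query vector produced by the LSTM (the tensor shapes and additive scoring function are assumptions, not the published architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QueryGuidedAttention(nn.Module):
    """Weights VGG-16 spatial features by their relevance to a query vector."""

    def __init__(self, feat_dim=512, query_dim=256, hidden_dim=256):
        super().__init__()
        self.feat_fc = nn.Linear(feat_dim, hidden_dim)
        self.query_fc = nn.Linear(query_dim, hidden_dim)
        self.score_fc = nn.Linear(hidden_dim, 1)

    def forward(self, feature_map, query):
        # feature_map: (batch, 512, H, W) conv features; query: (batch, query_dim).
        b, c, h, w = feature_map.shape
        regions = feature_map.view(b, c, h * w).transpose(1, 2)          # (b, H*W, 512)
        scores = self.score_fc(torch.tanh(
            self.feat_fc(regions) + self.query_fc(query).unsqueeze(1)))  # (b, H*W, 1)
        weights = F.softmax(scores, dim=1)
        attended = (weights * regions).sum(dim=1)                        # (b, 512)
        return attended, weights.view(b, h, w)
```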
4: Experimental Procedures and Operational Workflow:
Training involves constructing positive and negative examples for text and sketch queries; retrieval performance is evaluated with mean average precision (mAP).
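A minimal sketch of the mAP evaluation (assuming binary relevance per query; the ranking itself would come from distances in the learned embedding space):

```python
def average_precision(relevant, ranked_ids):
    """AP for one query: ranked_ids is the retrieval order, relevant a set of IDs."""
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / max(len(relevant), 1)


def mean_average_precision(queries):
    """queries: list of (relevant_set, ranked_ids) pairs, one per query."""
    return sum(average_precision(r, ranked) for r, ranked in queries) / len(queries)
```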
5: Data Analysis Methods:
Employs a cosine embedding loss for training and evaluates retrieval by ranking images according to their distance to the query in the learned embedding space.
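A hedged sketch of how the cosine embedding loss and distance-based ranking could be used, via PyTorch's built-in nn.CosineEmbeddingLoss (the margin value and function names are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Target label is +1 for matching (query, image) pairs and -1 for non-matching ones.
criterion = nn.CosineEmbeddingLoss(margin=0.2)


def training_step(query_emb, image_emb, labels, optimizer):
    """One update: query_emb / image_emb come from the embedding network
    (gradients flow back through them); labels is a tensor of +1 / -1."""
    loss = criterion(query_emb, image_emb, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def rank_gallery(query_emb, gallery_emb):
    """Rank gallery images by cosine similarity to a single query embedding."""
    sims = F.cosine_similarity(query_emb.unsqueeze(0), gallery_emb)
    return torch.argsort(sims, descending=True)
```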