Research Objective
To introduce a new database of face images acquired simultaneously in the visible and thermal spectra under various conditions, and to compare face recognition performance in the two modalities under each variation, as well as the impact of bimodal fusion.
Research Findings
The thermal spectrum outperforms the visible spectrum under illumination and expression variations, performs comparably under pose changes, but underperforms under occlusion. Bimodal fusion, especially score-level fusion, improves recognition when the two modalities perform comparably. Thermal imagery can thus serve as a complement or an alternative to visible imagery in face recognition systems. Future work will explore time-lapse effects and use higher-resolution cameras.
Research Limitations
The camera's thermal resolution is low (160x120 pixels), which may affect image quality. The database currently contains only 50 subjects, limiting generalizability. Fusion did not improve performance when one modality significantly outperformed the other. Time-lapse effects were not fully explored in this preliminary evaluation.
1:Experimental Design and Method Selection:
The study involved collecting a new database of face images using a dual-sensor camera that captures visible and thermal images simultaneously. A comparative analysis was performed using the Fisherfaces method for face recognition, with evaluations under different variations (illumination, expression, pose, occlusion). Sensor-level and score-level fusion techniques were applied to combine the two modalities.
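Score-level fusion as described above can be sketched as a weighted sum of normalized match scores from the two modalities. This is a minimal illustration, not the paper's actual fusion rule: the min-max normalization, the equal weighting, and all distance values below are assumptions for demonstration only.

```python
import numpy as np

def minmax_norm(scores):
    """Rescale a score vector to [0, 1] via min-max normalization."""
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def fuse_scores(dist_visible, dist_thermal, w=0.5):
    """Weighted-sum score-level fusion of per-gallery-identity distances.

    Lower distance = better match. `w` is the (assumed) visible-spectrum weight.
    """
    nv = minmax_norm(dist_visible)
    nt = minmax_norm(dist_thermal)
    return w * nv + (1 - w) * nt

# Made-up probe-vs-gallery distances for three enrolled identities,
# same gallery order in both modalities.
dist_vis = [0.9, 0.2, 0.7]   # visible spectrum
dist_thm = [0.8, 0.4, 0.3]   # thermal spectrum

fused = fuse_scores(dist_vis, dist_thm)
print(int(fused.argmin()))   # index of the best-matching identity after fusion
```

Here identity 1 wins because it scores well in both modalities, even though identity 2 has the best thermal-only distance; this is the intuition behind score-level fusion helping most when the modalities perform comparably.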
2:Sample Selection and Data Sources:
Data was collected from 50 subjects of different ages, sexes, and ethnicities over two sessions separated by 3-4 months, resulting in 4200 images per session. Images included variations in illumination, expression, pose, and occlusion.
3:List of Experimental Equipment and Materials:
FLIR Duo R camera by FLIR Systems, three-point lighting kit (rim light, key light, fill light), controlled environment with ambient temperature set to 25°C.
4:Experimental Procedures and Operational Workflow:
The camera was positioned 5 meters from subjects at a height of 1 meter. Images were captured at one-second intervals to avoid capture errors. Variations were systematically introduced: 5 illumination conditions, 7 expressions, 4 head poses, and 5 occlusion types.
5:Data Analysis Methods:
The Fisherfaces algorithm (PCA followed by LDA) with 1-Nearest Neighbor classification was used. Cross-fold validation was performed by splitting the data into subsets for each variation. Rank-1 recognition rates were calculated for visible, thermal, sensor-level fusion, and score-level fusion.
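The Fisherfaces pipeline above can be sketched as PCA for dimensionality reduction, LDA for class separation, then 1-NN matching, with the Rank-1 rate as the fraction of test probes whose nearest gallery neighbor has the correct identity. This is a minimal NumPy sketch on synthetic data, not the paper's implementation; the class count, image dimensionality, and PCA size are all illustrative.

```python
import numpy as np

def fisherfaces_fit(X, y, n_pca):
    """PCA to n_pca dims, then LDA to at most (C-1) dims.

    Returns the training mean and the combined (dim, C-1) projection matrix.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:n_pca].T                      # (dim, n_pca)
    Z = Xc @ W_pca
    # LDA scatter matrices computed in the PCA subspace.
    classes = np.unique(y)
    overall = Z.mean(axis=0)
    Sw = np.zeros((n_pca, n_pca))             # within-class scatter
    Sb = np.zeros((n_pca, n_pca))             # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        d = (mc - overall)[:, None]
        Sb += len(Zc) * (d @ d.T)
    # Leading eigenvectors of Sw^{-1} Sb give the discriminant directions.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[: len(classes) - 1]
    W_lda = evecs[:, order].real
    return mean, W_pca @ W_lda

def predict_1nn(gallery, gallery_ids, probes):
    """1-Nearest-Neighbor matching by Euclidean distance in Fisherface space."""
    d = np.linalg.norm(probes[:, None] - gallery[None], axis=2)
    return gallery_ids[d.argmin(axis=1)]

# Synthetic "faces": 5 subjects, 8 images each, flattened to 64-dim vectors.
rng = np.random.default_rng(0)
n_classes, n_train_pc, n_test_pc, dim = 5, 6, 2, 64
X = np.vstack([rng.normal(loc=3.0 * c, scale=1.0,
                          size=(n_train_pc + n_test_pc, dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_train_pc + n_test_pc)
train = np.tile([True] * n_train_pc + [False] * n_test_pc, n_classes)
Xtr, ytr, Xte, yte = X[train], y[train], X[~train], y[~train]

mean, W = fisherfaces_fit(Xtr, ytr, n_pca=20)
pred = predict_1nn((Xtr - mean) @ W, ytr, (Xte - mean) @ W)
rank1 = (pred == yte).mean()
print(f"Rank-1 recognition rate: {rank1:.2f}")
```

Performing LDA in a PCA subspace (here 20 dims, at most N - C) keeps the within-class scatter matrix invertible, which is the standard motivation for the two-stage Fisherfaces design.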