
HDRT: Infrared Capture for HDR Imaging

(2406.05475)
Published Jun 8, 2024 in cs.CV, cs.GR, and eess.IV

Abstract

Capturing real-world lighting is a long-standing challenge in imaging, and most practical methods acquire High Dynamic Range (HDR) images either by fusing multiple exposures or by boosting the dynamic range of Standard Dynamic Range (SDR) images. Multiple-exposure capture is problematic because it requires longer capture times, which often lead to ghosting. The main alternative, inverse tone mapping, is an ill-defined problem that is especially challenging because single captured exposures usually contain clipped and quantized values and are therefore missing substantial amounts of content. To alleviate this, we propose a new approach, High Dynamic Range Thermal (HDRT), for HDR acquisition using a separate, commonly available thermal infrared (IR) sensor. We propose a novel deep neural method (HDRTNet) that combines IR and SDR content to generate HDR images. HDRTNet learns to exploit IR features linked to the RGB image, and the IR-specific parameters are subsequently used in a dual-branch method that fuses features at shallow layers. This produces an HDR image significantly superior to that generated by naive fusion approaches. To validate our method, we created the first HDR and thermal dataset and performed extensive experiments comparing HDRTNet with the state of the art. We show substantial quantitative and qualitative quality improvements on both over- and under-exposed images, demonstrating that our approach is robust to capture under multiple different lighting conditions.

Full pipeline of HDRTNet for enhancing SDR images using thermal input to obtain HDR images.

Overview

  • The paper introduces HDRTNet, a deep neural network designed to combine thermal infrared (IR) and standard dynamic range (SDR) inputs to reconstruct high dynamic range (HDR) images, addressing the limitations of traditional HDR methods such as ghosting artifacts and missing details.

  • HDRTNet uses a U-Net architecture for infrared feature extraction and combines it with SDR data in a separate HDR image reconstruction branch, utilizing various loss functions to enhance image quality. The paper also introduces the first dataset of aligned HDR and thermal images, comprising 10,000 images under different lighting conditions.

  • Experimental results demonstrate that HDRTNet outperforms several state-of-the-art HDR reconstruction methods in terms of pu-PSNR, pu-SSIM, and pu-VSI metrics. The paper also discusses practical applications, implications, and future research directions, such as handling high-resolution images and better hardware integration.


The paper "HDRT: Infrared Capture for HDR Imaging" introduces a novel approach for High Dynamic Range (HDR) image acquisition by leveraging thermal infrared (IR) sensors. Traditional HDR methods either rely on multiple exposure fusion, which is prone to ghosting artifacts due to longer capture times, or inverse tone mapping (ITM), which tries to generate HDR images from a single Standard Dynamic Range (SDR) image, often leading to missing details. To circumvent these limitations, the authors propose HDRTNet, a deep neural network that fuses IR and SDR content to produce HDR images.

Methodology Overview

HDRTNet is designed to integrate information from an IR sensor and an SDR camera to reconstruct HDR images, effectively addressing the problems of overexposure and underexposure. The authors divide the HDRTNet architecture into two main components: infrared feature extraction and HDR image reconstruction.

  1. Infrared Feature Extraction:

    • The IR branch of HDRTNet uses a U-Net architecture aimed at converting IR images to RGB images. This allows for the extraction of thermal features that are complementary to RGB data.
    • The loss function for the IR branch combines pixel loss (mean absolute error and cosine similarity) with perceptual loss using a pretrained VGG-19 network.
  2. HDR Image Reconstruction:

    • The HDR branch fuses the extracted IR features with SDR data. This is achieved by combining the IR features with RGB data at shallow layers of a U-Net structure to reconstruct HDR images.
    • The HDR branch loss function includes pixel loss, perceptual loss, and adversarial loss to enhance the quality of the generated HDR images.
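The two branches above can be sketched in a few lines. The pixel-loss formulation (MAE plus a cosine-similarity term) and the concatenation-based fusion operator are assumptions drawn from the description, not the paper's exact implementation; the VGG-19 perceptual and adversarial terms are omitted, and `conv_rgb`, `conv_ir`, and `decoder` are hypothetical stand-ins for learned layers:

```python
import numpy as np

def ir_pixel_loss(pred, target, w_cos=1.0):
    """IR-branch pixel loss: MAE plus a cosine-similarity term between
    per-pixel color vectors. The weight w_cos and exact formulation are
    assumptions; the paper adds a VGG-19 perceptual term not shown here."""
    mae = np.mean(np.abs(pred - target))
    num = np.sum(pred * target, axis=-1)
    den = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1) + 1e-8
    return mae + w_cos * np.mean(1.0 - num / den)

def dual_branch_forward(sdr, ir, conv_rgb, conv_ir, decoder):
    """Dual-branch fusion sketch: extract shallow features from each
    modality, concatenate channel-wise, and decode to an HDR estimate.
    conv_rgb, conv_ir, and decoder stand in for learned network layers."""
    f_rgb = conv_rgb(sdr)   # shallow SDR features
    f_ir = conv_ir(ir)      # shallow IR features (pretrained IR branch)
    fused = np.concatenate([f_rgb, f_ir], axis=-1)
    return decoder(fused)
```

Fusing at shallow layers, as described, keeps the IR features spatially aligned with the RGB features before deeper processing mixes them.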

Dataset

To validate their approach, the authors introduce the first dataset of aligned HDR and thermal images. The dataset consists of 10,000 images captured under various lighting conditions to highlight the robustness of the proposed method.

Experimental Results

HDRTNet is compared against several state-of-the-art single-image HDR reconstruction methods, including DrTMO, Deep Recursive HDRI, HDRTVNet, LaNet, HDRCNN, Deep-HDR Reconstruction, ICTCPNet, ExpandNet, and HDRUNet. The evaluation uses perceptually uniform variants of the peak signal-to-noise ratio (pu-PSNR), the structural similarity index measure (pu-SSIM), and the visual saliency-induced index (pu-VSI).
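The perceptually uniform metrics follow a common recipe: encode both HDR images with a perceptual transfer function, then apply the standard metric. A minimal sketch, using `np.log1p` as an illustrative stand-in for the real PU encoding curve (actual evaluations should use the published PU transfer function):

```python
import numpy as np

def pu_psnr(pred, target, encode=np.log1p):
    """PSNR computed after a perceptually uniform encoding of both images.

    np.log1p is only a stand-in here: like a true PU curve, it compresses
    large luminance values so errors are weighted more evenly across the
    dynamic range before the standard PSNR formula is applied."""
    p, t = encode(pred), encode(target)
    mse = np.mean((p - t) ** 2)
    peak = t.max()
    return 10.0 * np.log10(peak ** 2 / mse)
```

pu-SSIM and pu-VSI follow the same pattern, swapping PSNR for SSIM or VSI after the encoding step.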

The proposed method shows substantial improvements:

  • HDRTNet achieves the highest scores in pu-PSNR, pu-SSIM, and pu-VSI across the overexposed, underexposed, and all-images categories.
  • Qualitatively, HDRTNet significantly outperforms other methods in recovering details lost to overexposure and underexposure.

Ablation Studies and Practical Applications

The authors conduct ablation studies to underscore the importance of their proposed modules:

  • Feature-level Fusion vs. Pixel-level Fusion: Feature-level fusion is shown to be more effective in extracting relevant information from IR images, reducing visual artifacts.
  • Separate vs. Combined Training: Training the IR branch separately is crucial to avoid extracting visually irrelevant features, which could degrade the HDR output.

Moreover, the paper discusses adaptations for handling high-resolution images and practical issues such as registration errors, introducing pixel unshuffle and a smooth gradient loss to address these challenges.
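Pixel unshuffle (space-to-depth) trades spatial resolution for channel depth, letting the network process high-resolution inputs at a lower spatial cost. A minimal NumPy version for an (H, W, C) array (channel ordering conventions vary between implementations):

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Space-to-depth: rearrange an (H, W, C) array into
    (H/r, W/r, C*r*r) by folding each r-by-r spatial block into channels.
    No information is lost; the operation is exactly invertible."""
    h, w, c = x.shape
    assert h % r == 0 and w % r == 0, "spatial dims must be divisible by r"
    x = x.reshape(h // r, r, w // r, r, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(h // r, w // r, c * r * r)
```

Because the rearrangement is lossless, the network can operate on the smaller spatial grid and a pixel-shuffle step can restore full resolution afterwards.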

Implications and Future Work

The implications of this research are multifaceted:

  • Practical Applications: The integration of thermal imaging into the HDR pipeline opens new avenues for robust image acquisition in adverse lighting conditions, enhancing applications in surveillance, autonomous driving, and photography.
  • Theoretical Advancement: The study provides a new direction in HDR imaging research, demonstrating that non-visible spectra can significantly improve visible spectrum tasks.

Future research could delve into improving the integration of IR information when lighting conditions are optimal or tackling scenarios where materials block IR transmission. Additionally, hardware integration of IR and RGB sensors could further streamline the capture process.

Conclusion

The paper presents HDRTNet as a compelling method for HDR imaging by effectively leveraging thermal IR data to reconstruct HDR images from single-exposure SDR inputs. The proposed approach demonstrates clear advantages over existing methods, especially under extreme exposure conditions. The creation of the first aligned HDR and thermal dataset also marks a significant contribution, facilitating further research in this domain.
