
ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content (1803.02266v2)

Published 6 Mar 2018 in cs.CV and cs.GR

Abstract: High dynamic range (HDR) imaging provides the capability of handling real world lighting as opposed to the traditional low dynamic range (LDR) which struggles to accurately represent images with higher dynamic range. However, most imaging content is still available only in LDR. This paper presents a method for generating HDR content from LDR content based on deep Convolutional Neural Networks (CNNs) termed ExpandNet. ExpandNet accepts LDR images as input and generates images with an expanded range in an end-to-end fashion. The model attempts to reconstruct missing information that was lost from the original signal due to quantization, clipping, tone mapping or gamma correction. The added information is reconstructed from learned features, as the network is trained in a supervised fashion using a dataset of HDR images. The approach is fully automatic and data driven; it does not require any heuristics or human expertise. ExpandNet uses a multiscale architecture which avoids the use of upsampling layers to improve image quality. The method performs well compared to expansion/inverse tone mapping operators quantitatively on multiple metrics, even for badly exposed inputs.

Authors (4)
  1. Demetris Marnerides (5 papers)
  2. Thomas Bashford-Rogers (13 papers)
  3. Jonathan Hatchett (1 paper)
  4. Kurt Debattista (21 papers)
Citations (254)

Summary

  • The paper introduces ExpandNet, a deep CNN that integrates local, dilation, and global branches to effectively transform LDR images into HDR outputs.
  • The methodology employs multi-scale feature fusion with a 1x1 convolution and diverse data augmentation to preserve fine details while reducing common artifacts.
  • The evaluation demonstrates superior performance with higher PSNR, SSIM, and HDR-VDP-2.2 scores compared to existing models in challenging exposure scenarios.

ExpandNet: Advancements in HDR Imaging through Deep Convolutional Neural Networks

In this paper, the authors present ExpandNet, a deep Convolutional Neural Network (CNN) designed to transform low dynamic range (LDR) images into high dynamic range (HDR) images. The ability of HDR imaging to accurately represent real-world lighting makes it valuable across numerous fields, including photography, rendering, gaming, and medical imaging. Despite these benefits, most imaging content remains available only in LDR. The proposed method enables automatic generation of HDR content from LDR images, filling a significant gap in the current imaging landscape.

ExpandNet Architecture

The core of this paper is the development of the ExpandNet architecture, which uniquely integrates three branches: local, dilation, and global. Each of these branches serves distinct roles in the image processing workflow:

  1. Local Branch: Focuses on capturing high-frequency details, ensuring that fine image features are preserved during the HDR transformation.
  2. Dilation Branch: Handles medium-range spatial details through the use of dilated convolutions, thus expanding the receptive field without the need for downsampling.
  3. Global Branch: Captures broader, image-wide characteristics by downsampling to a single global context vector, which is then fused with the local and dilation outputs.

The fusion of outputs from these branches employs a 1x1 convolution to integrate multi-scale features, allowing the model to predict HDR images effectively without introducing artefacts commonly associated with upsampling methods.
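The three-branch design above can be sketched in PyTorch as follows. This is a minimal illustration of the structure, not the paper's exact implementation: the layer counts, channel widths, and activation choices here are assumptions, and the global branch is abbreviated with an adaptive pooling stand-in for the paper's full strided stack.

```python
import torch
import torch.nn as nn

class ExpandNetSketch(nn.Module):
    """Illustrative sketch of the three-branch ExpandNet layout.
    Layer counts and widths are simplified, not the paper's exact ones."""

    def __init__(self, ch=32):
        super().__init__()
        # Local branch: small receptive field, preserves high-frequency detail.
        self.local = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.SELU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.SELU(),
        )
        # Dilation branch: dilated convolutions widen the receptive field
        # for medium-range context without any downsampling.
        self.dilation = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=2, dilation=2), nn.SELU(),
            nn.Conv2d(ch, ch, 3, padding=2, dilation=2), nn.SELU(),
        )
        # Global branch: strided convolutions shrink the image down toward
        # a single 1x1 context vector (abbreviated here with pooling).
        self.global_branch = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.SELU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.SELU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Fusion: a 1x1 convolution mixes the concatenated multi-scale
        # features, then a final layer predicts the expanded-range output.
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * ch, ch, 1), nn.SELU(),
            nn.Conv2d(ch, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        b, _, h, w = x.shape
        # Broadcast the 1x1 global context vector over the full resolution.
        g = self.global_branch(x).expand(b, -1, h, w)
        feats = torch.cat([self.local(x), self.dilation(x), g], dim=1)
        return self.fuse(feats)

model = ExpandNetSketch()
out = model(torch.rand(1, 3, 64, 64))  # same spatial size as the input
```

Note that no upsampling layer appears anywhere in the sketch: the local and dilation branches keep full resolution throughout, and the global vector is broadcast rather than deconvolved, which is the property the paper credits for avoiding blocking artefacts.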

Performance Evaluation

ExpandNet was evaluated against traditional expansion operators and other neural network architectures, including existing models such as U-Net and Colornet. The performance metrics used for assessment include PSNR, SSIM, MS-SSIM, and HDR-VDP-2.2, with ExpandNet showing superior performance, especially in scenarios with significant over- and under-exposure. The multiscale approach without upsampling improves on previous CNN architectures, effectively minimizing artefacts such as blocking and information bleeding.
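Of the metrics listed, PSNR is the simplest to state concretely. The sketch below shows the standard definition; note this is a generic illustration, not the paper's exact evaluation pipeline, which also involves perceptual metrics (SSIM, MS-SSIM, HDR-VDP-2.2) and appropriate encoding of the HDR values before comparison.

```python
import numpy as np

def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB; higher is better.
    `peak` is the maximum possible signal value (1.0 for normalized images)."""
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy example: a reconstruction off by a uniform 0.1 from the reference.
target = np.zeros((8, 8, 3))
pred = target + 0.1
score = psnr(pred, target)  # MSE = 0.01, so PSNR = 20 dB
```
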

Data Augmentation Techniques

An essential component of the training process was the use of diverse data augmentation techniques. The LDR-HDR image pairs were generated using various tone mapping operators (TMOs) and exposures, with randomness in cropping and parameterization to ensure the model's robustness to different input scenes and characteristics. This strategy allowed ExpandNet to generalize well across diverse scenarios and mitigated overfitting, despite the relatively small number of HDR images available for training.
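A simplified version of such an LDR-from-HDR synthesis step can be sketched as below. The specific exposure and gamma ranges here are illustrative assumptions; the paper draws from a variety of tone mapping operators rather than this single simulated camera curve.

```python
import numpy as np

def hdr_to_ldr(hdr, rng):
    """Simulate an LDR capture from a linear HDR image: random exposure,
    clipping, a gamma curve, and 8-bit quantization. A simplified stand-in
    for a randomized tone-mapping pipeline; parameter ranges are illustrative."""
    exposure = 2.0 ** rng.uniform(-2.0, 2.0)   # random exposure shift (stops)
    gamma = rng.uniform(1.8, 2.4)              # random display gamma
    ldr = np.clip(hdr * exposure, 0.0, 1.0)    # clipping destroys highlights
    ldr = ldr ** (1.0 / gamma)                 # nonlinear display encoding
    ldr = np.round(ldr * 255.0) / 255.0        # 8-bit quantization loses detail
    return ldr

rng = np.random.default_rng(0)
hdr = rng.random((64, 64, 3)).astype(np.float32) * 4.0  # toy linear HDR image
ldr = hdr_to_ldr(hdr, rng)
```

Each pass over the same HDR source yields a different LDR input, so the network sees many degraded views of each scene, which is what lets a comparatively small HDR dataset cover a wide range of input characteristics.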

Implications and Future Directions

ExpandNet represents a significant advancement in converting LDR content to HDR, offering a parameter-free, fully automated solution that outperforms current methodologies. Its ability to maintain image quality in challenging lighting conditions extends its applications across multiple sectors where HDR content is increasingly desired.

Future research could focus on integrating temporal coherence for dynamic scenes, critical for HDR applications in video processing. Incorporating Long Short-Term Memory (LSTM) networks or similar architectures could offer a pathway for handling sequential data efficiently.

Overall, through ExpandNet, the possibilities for extending the availability and usage of HDR content are substantially broadened, demonstrating the transformative potential of tailored deep learning architectures in image processing tasks.