STEREOFOG -- Computational DeFogging via Image-to-Image Translation on a real-world Dataset (2312.02344v1)

Published 4 Dec 2023 in cs.CV and cs.LG

Abstract: Image-to-Image translation (I2I) is a subtype of Machine Learning (ML) that has tremendous potential in applications where two domains of images and the need for translation between the two exist, such as the removal of fog. For example, this could be useful for autonomous vehicles, which currently struggle with adverse weather conditions like fog. However, datasets for I2I tasks are not abundant and typically hard to acquire. Here, we introduce STEREOFOG, a dataset comprised of 10,067 paired fogged and clear images, captured using a custom-built device, with the purpose of exploring I2I's potential in this domain. It is the only real-world dataset of this kind to the best of our knowledge. Furthermore, we apply and optimize the pix2pix I2I ML framework to this dataset. With the final model achieving an average Complex Wavelet-Structural Similarity (CW-SSIM) score of 0.76, we prove the technique's suitability for the problem.

Summary

The paper introduces STEREOFOG, a real-world dataset of 10,067 paired fogged and clear images, to support computational defogging for applications such as autonomous vehicles.
The study applies and optimizes the pix2pix GAN model to convert foggy images to clear ones, achieving an average CW-SSIM score of 0.76.
The research highlights applications in autonomous driving and search-and-rescue while encouraging further improvements and collaborative studies.

Image-to-image translation (I2I), a machine learning technique that converts an image from one domain into another, is a promising way to improve visual perception in adverse weather such as fog. Fog remains a major challenge for autonomous vehicles, which still struggle to navigate safely in it. To address this, the paper introduces STEREOFOG, a real-world dataset of 10,067 paired fogged and clear images. The authors built a custom device with two synchronized cameras, one exposed to fog and the other kept clear, to capture the image pairs. To the best of the authors' knowledge, it is the only real-world dataset of its kind.

The research takes pix2pix, an established conditional generative adversarial network (cGAN), and optimizes it for this dataset. Pix2pix trains two networks against each other: a generator that produces a clear image from a foggy input, and a discriminator that tries to tell generated images from real clear ones. The authors varied several hyperparameters to find the best-performing configuration, with the goal of translating foggy scenes into their clear counterparts with the highest possible fidelity.
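
The competition between generator and discriminator can be sketched through the losses each one minimizes. The sketch below follows the original pix2pix formulation (an adversarial term plus a weighted L1 term); the helper names, the toy pixel lists, and the weight `lam=100.0` (the pix2pix default) are illustrative assumptions, not the settings the STEREOFOG authors arrived at.

```python
import math

def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def l1_distance(fake, real):
    """Mean absolute difference between two flat lists of pixel values."""
    return sum(abs(f - r) for f, r in zip(fake, real)) / len(real)

def generator_loss(d_prob_fake, fake, real, lam=100.0):
    # Adversarial term: the generator wants the discriminator to rate its
    # output as real (target 1).
    # L1 term: the output should stay close to the ground-truth clear image.
    return bce(d_prob_fake, 1.0) + lam * l1_distance(fake, real)

def discriminator_loss(d_prob_real, d_prob_fake):
    # The discriminator wants real pairs rated 1 and generated pairs rated 0.
    # pix2pix halves this objective to slow the discriminator down.
    return 0.5 * (bce(d_prob_real, 1.0) + bce(d_prob_fake, 0.0))

# Toy data (hypothetical): a 4-pixel "clear" target and two candidate outputs.
real_clear = [0.2, 0.5, 0.9, 0.4]
good_fake = [0.2, 0.5, 0.8, 0.4]
poor_fake = [0.6, 0.6, 0.6, 0.6]

if __name__ == "__main__":
    print("generator loss, good output:", generator_loss(0.5, good_fake, real_clear))
    print("generator loss, poor output:", generator_loss(0.5, poor_fake, real_clear))
    print("discriminator loss, confident:", discriminator_loss(0.9, 0.1))
    print("discriminator loss, undecided:", discriminator_loss(0.5, 0.5))
```

In a real training loop the two losses are minimized alternately, and the probabilities come from the discriminator network rather than being passed in by hand.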

Evaluating the translations required objective metrics, and the team used several, including the Complex Wavelet Structural Similarity (CW-SSIM) score. The final model achieved an average CW-SSIM score of 0.76, where 1 indicates identical images, pointing to a strong resemblance between the generated and the actual clear images. The paper nonetheless acknowledges that the dataset needs more diversity, a larger size, and a wider range of weather conditions to make the defogging model more robust and accurate.
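
To give a feel for what a CW-SSIM score measures, here is a minimal sketch of the core index as it is commonly defined on complex wavelet coefficients. It assumes the coefficients have already been computed for a local window (the wavelet decomposition itself is omitted), and the function name, the constant `k=0.01`, and the toy coefficient lists are hypothetical; this is not the evaluation code used in the paper.

```python
import cmath

def cw_ssim_index(cx, cy, k=0.01):
    """CW-SSIM index for two equal-length lists of complex coefficients.

    k is a small positive constant that stabilizes the ratio when the
    coefficient magnitudes are close to zero.
    """
    cross = sum(a * b.conjugate() for a, b in zip(cx, cy))
    energy = sum(abs(a) ** 2 for a in cx) + sum(abs(b) ** 2 for b in cy)
    return (2.0 * abs(cross) + k) / (energy + k)

# Toy coefficients (hypothetical) standing in for one local window.
coeffs = [complex(1.0, 0.5), complex(-0.3, 0.8), complex(0.7, -0.2)]

# A consistent phase shift of all coefficients, roughly what a small
# spatial translation of the image produces.
shifted = [c * cmath.exp(1j * 0.3) for c in coeffs]

# Inconsistent phase changes, i.e. the local structure itself has changed.
scrambled = [coeffs[0], -coeffs[1], coeffs[2] * 1j]

if __name__ == "__main__":
    print("identical:", cw_ssim_index(coeffs, coeffs))
    print("consistent phase shift:", cw_ssim_index(coeffs, shifted))
    print("inconsistent phase changes:", cw_ssim_index(coeffs, scrambled))
```

The index stays at its maximum under a consistent phase shift but drops when the phase relationships change, which is why the metric tolerates small misalignments between two images. A full score would average this index over all windows and wavelet subbands.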

The research could help autonomous vehicles perceive their environment more clearly in fog. Its applications extend beyond the automotive industry to scenarios such as search and rescue missions, where visibility is crucial. The paper's dataset, code, and supplemental materials have been shared publicly, inviting further research and collaboration. As autonomous technology matures, contributions such as the STEREOFOG dataset and the accompanying model optimization support the goal of safe navigation in poor visibility.
