Emergent Mind

Abstract

Image-to-Image translation (I2I) is a subtype of Machine Learning (ML) with tremendous potential in applications that involve two image domains and a need to translate between them, such as the removal of fog. For example, this could be useful for autonomous vehicles, which currently struggle with adverse weather conditions like fog. However, datasets for I2I tasks are scarce and typically hard to acquire. Here, we introduce STEREOFOG, a dataset comprising $10,067$ paired fogged and clear images, captured using a custom-built device, with the purpose of exploring I2I's potential in this domain. To the best of our knowledge, it is the only real-world dataset of this kind. Furthermore, we apply the pix2pix I2I ML framework to this dataset and optimize it. The final model achieves an average Complex Wavelet-Structural Similarity (CW-SSIM) score of $0.76$, demonstrating the technique's suitability for the problem.

Overview

  • The paper introduces STEREOFOG, a dataset of 10,067 image pairs aimed at enhancing computational defogging for visual perception.

  • The dataset was created using a custom apparatus with two cameras, capturing both foggy and corresponding clear images simultaneously.

  • The research utilizes and optimizes the pix2pix conditional generative adversarial network model to translate foggy images to clear ones.

  • The model was evaluated with several metrics, including the Complex Wavelet Structural Similarity (CW-SSIM) score, on which it achieved an average of 0.76.

  • The STEREOFOG dataset and findings aim to improve autonomous vehicle navigation in fog and have applications in various visibility-critical scenarios.

In ML, image-to-image translation (I2I), which converts an image from one domain into another, holds significant promise for improving visual perception in adverse weather conditions such as fog. Fog poses a major challenge for autonomous vehicles, which cannot yet navigate safely in such environments. To address this challenge, the authors created a real-world dataset named STEREOFOG, which consists of 10,067 image pairs, each made up of a fogged image and its corresponding clear counterpart. To capture these pairs, the researchers built a custom apparatus with two synchronized cameras (one exposed to fog and the other kept clear). The dataset is a stepping stone toward better computational defogging techniques.
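As a rough illustration of how such pairs might be consumed by a training script, the sketch below splits one combined image into its two halves. It assumes the pairs are stored side by side in a single array, a layout commonly used with pix2pix implementations; the summary does not say how STEREOFOG stores its pairs, and the function name `split_pair` and the left/right ordering are hypothetical.

```python
import numpy as np

def split_pair(combined):
    """Split a side-by-side image of shape (H, 2W, C) into two (H, W, C) halves.

    Assumption: the left half is one domain (e.g. clear) and the right half
    the other (e.g. fogged). The actual STEREOFOG layout may differ.
    """
    width = combined.shape[1]
    if width % 2 != 0:
        raise ValueError("combined image width must be even")
    half = width // 2
    return combined[:, :half], combined[:, half:]

# Usage with synthetic data standing in for a real image pair.
left_in = np.zeros((4, 6, 3), dtype=np.uint8)
right_in = np.ones((4, 6, 3), dtype=np.uint8)
left_out, right_out = split_pair(np.concatenate([left_in, right_in], axis=1))
```

If the pairs are instead stored as separate files, the same idea reduces to loading two arrays and checking that their shapes match.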

The research builds on pix2pix, an established conditional generative adversarial network (cGAN) model, and optimizes it for this dataset. Pix2pix trains two networks against each other: a generator that produces a clear image from a foggy one, and a discriminator that tries to tell generated images from real clear images. The authors varied several of the model's hyperparameters to find the best-performing configuration, with the goal of converting foggy scenes into their clear counterparts with the highest possible fidelity.
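The interplay between generator and discriminator can be made concrete with a minimal NumPy sketch of the loss terms from the original pix2pix formulation: an adversarial term plus an L1 distance to the ground truth, weighted by a factor lambda (100 in the original pix2pix paper). This is not the authors' training code; the function names are hypothetical, and the discriminator outputs are assumed to be probabilities in (0, 1).

```python
import numpy as np

def bce(probs, target, eps=1e-7):
    """Binary cross-entropy between probabilities and a constant target (0 or 1)."""
    p = np.clip(probs, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))))

def generator_loss(d_fake, fake, real, lam=100.0):
    """Adversarial term (make D score fakes as real) plus lam-weighted L1 term."""
    adversarial = bce(d_fake, 1.0)
    l1 = float(np.mean(np.abs(fake - real)))
    return adversarial + lam * l1

def discriminator_loss(d_real, d_fake):
    """D should score real pairs as 1 and generated pairs as 0.

    The factor 0.5 slows D's learning relative to G, as in the pix2pix paper.
    """
    return 0.5 * (bce(d_real, 1.0) + bce(d_fake, 0.0))

# Usage with placeholder arrays; a real setup would use network outputs.
rng = np.random.default_rng(0)
real = rng.random((8, 8, 3))            # ground-truth clear image
fake = rng.random((8, 8, 3))            # generator output for the foggy input
d_fake = np.full((2, 2), 0.3)           # D's scores for (foggy, fake)
d_real = np.full((2, 2), 0.8)           # D's scores for (foggy, real)
g_loss = generator_loss(d_fake, fake, real)
d_loss = discriminator_loss(d_real, d_fake)
```

The weight `lam` is one example of a hyperparameter that can be tuned; which hyperparameters the authors actually varied is not stated in this summary.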

Evaluating the translated images required objective metrics, and the team used several, including the Complex Wavelet Structural Similarity (CW-SSIM) score. The optimized model achieved an average CW-SSIM score of 0.76, indicating a strong similarity between the ML-generated clear images and the actual clear images. The study nevertheless acknowledged that the dataset needs to grow in diversity, size, and range of weather conditions to make the defogging model more robust and accurate.
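To give a feel for what a structural similarity score measures, the sketch below computes a simplified, single-window SSIM over a whole grayscale image. This is a simpler relative of the metric reported above and not CW-SSIM itself: standard SSIM is normally averaged over local windows, and CW-SSIM applies a related comparison to complex wavelet coefficients, which makes it more tolerant of small shifts between the two images. The function name `global_ssim` is hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two same-shaped grayscale images.

    Uses the standard stabilizing constants C1 = (0.01 * L)^2 and
    C2 = (0.03 * L)^2, where L is the dynamic range of the pixel values.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

# Usage: an image compared with itself versus with its inverse.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
score_same = global_ssim(image, image)
score_inverted = global_ssim(image, 1.0 - image)
```

A score of 1 means the two images are identical under this measure, and lower values indicate weaker structural agreement.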

This research could have considerable implications for autonomous driving by helping vehicles perceive their environment more clearly in fog. Its practical applications also extend beyond the automotive industry to scenarios such as search and rescue missions, where visibility is crucial. The study's dataset, code, and supplemental materials have been shared publicly, inviting further research and collaboration in this area of ML and computer vision. As autonomous technology matures, contributions such as the STEREOFOG dataset and the associated defogging model can help make navigation safe regardless of visibility conditions.
