Zero-shot Image-to-Image Translation

Published 6 Feb 2023 in cs.CV, cs.GR, and cs.LG | (2302.03027v1)

Abstract: Large-scale text-to-image generative models have shown their remarkable ability to synthesize diverse and high-quality images. However, it is still challenging to directly apply these models for editing real images for two reasons. First, it is hard for users to come up with a perfect text prompt that accurately describes every visual detail in the input image. Second, while existing models can introduce desirable changes in certain regions, they often dramatically alter the input content and introduce unexpected changes in unwanted regions. In this work, we propose pix2pix-zero, an image-to-image translation method that can preserve the content of the original image without manual prompting. We first automatically discover editing directions that reflect desired edits in the text embedding space. To preserve the general content structure after editing, we further propose cross-attention guidance, which aims to retain the cross-attention maps of the input image throughout the diffusion process. In addition, our method does not need additional training for these edits and can directly use the existing pre-trained text-to-image diffusion model. We conduct extensive experiments and show that our method outperforms existing and concurrent works for both real and synthetic image editing.
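
The abstract names two ideas without giving formulas: an editing direction in text embedding space, and guidance that keeps cross-attention maps close to those of the input image. The sketch below illustrates one plausible reading, under loud assumptions: the direction is taken to be the difference between the mean embeddings of two sets of sentences, and the guidance term is taken to be a squared L2 distance between attention maps. The function names, the toy vectors, and both formulas are illustrative choices rather than details confirmed by the text above.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty collection of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def edit_direction(source: Sequence[Sequence[float]],
                   target: Sequence[Sequence[float]]) -> Vector:
    """ASSUMPTION: direction = mean(target embeddings) - mean(source embeddings)."""
    src_mean = mean_vector(source)
    tgt_mean = mean_vector(target)
    return [t - s for s, t in zip(src_mean, tgt_mean)]


def cross_attention_loss(reference: Sequence[Sequence[float]],
                         current: Sequence[Sequence[float]]) -> float:
    """ASSUMPTION: guidance penalty = squared L2 distance between the two maps."""
    return sum(
        (c - r) ** 2
        for ref_row, cur_row in zip(reference, current)
        for r, c in zip(ref_row, cur_row)
    )


# Toy 2-D "embeddings" standing in for real text-encoder outputs.
source_sentences = [[0.0, 1.0], [2.0, 3.0]]   # e.g. sentences about the source concept
target_sentences = [[1.0, 1.0], [3.0, 5.0]]   # e.g. sentences about the target concept

direction = edit_direction(source_sentences, target_sentences)

# Shift a prompt embedding along the direction to request the edit.
prompt_embedding = [0.5, -0.5]
edited_embedding = [p + d for p, d in zip(prompt_embedding, direction)]
```

In a real pipeline the embeddings would come from the text encoder of the pre-trained diffusion model, and the loss would be used to steer each denoising step so that the edited image keeps the structure of the input.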


Authors (6)

Citations (353)


Summary

The paper outlines essential formatting rules for ICCV submissions, including precise page layout, font usage, and margin settings.
It clarifies the double-blind review process by detailing anonymization practices and citation standards to ensure unbiased peer review.
The guidelines emphasize consistency in manuscript preparation and suggest future improvements, like automated tools, to enhance submission accuracy.

ICCV LaTeX Author Guidelines

The paper under examination provides a comprehensive guide on preparing manuscripts for submission to the IEEE International Conference on Computer Vision (ICCV) proceedings, with a focus on LaTeX document preparation. This document serves as a critical resource for authors aiming to ensure their papers meet the formal requirements and formatting standards of the ICCV conference.

Core Content of the Guidelines

The paper delineates several key points regarding manuscript preparation:

General Requirements: All submissions must be in English and follow explicit formatting and length rules. The guidelines state a strict policy: submissions exceeding the eight-page limit, excluding references, will not be reviewed.
Blind Review Process: The paper addresses common misconceptions about double-blind review, explaining how authors can cite their own previous work without revealing their identity. This involves avoiding first-person references to that work and anonymizing certain submission elements to comply with review policies.
Formatting Specifications: Detailed instructions are provided on the use of type-style, font, margins, and page layout. For instance, submissions should be in a two-column format, and the specific measurements for margin settings are outlined to ensure uniformity across all submissions.
Figures and Tables: Authors are directed on the appropriate styling for figures, tables, and their captions. Particular attention is given to ensure that figures and tables enhance readability and are visually consistent with the textual content.
Mathematical Notation and Citations: The guidelines emphasize the importance of numbering equations and sections to facilitate reference. Additionally, citation styles must adhere to ICCV standards, with references listed at the document’s end.
Additional Elements: Footnotes are discouraged; where possible, their content should be folded into the main text. If a footnote is necessary, it should be concise and placed at the bottom of the page on which it is referenced.
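
Two of the rules above are concrete enough to check mechanically: the eight-page limit excluding references, and the avoidance of first-person references to earlier work. The following is a minimal sketch of such a check, not a tool provided by the guidelines. The `Submission` type, the whole-page counting, and the phrase pattern are hypothetical simplifications introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Limit stated in the summary above (pages excluding references).
MAX_BODY_PAGES = 8

# HYPOTHETICAL pattern: a few phrasings that suggest a first-person self-citation.
FIRST_PERSON_CITATION = re.compile(
    r"\b(?:our|my)\s+(?:previous|prior|earlier)\s+work\b",
    re.IGNORECASE,
)


@dataclass
class Submission:
    total_pages: int       # pages in the PDF, including references
    reference_pages: int   # pages occupied only by references (simplification)
    text: str              # extracted body text


def check_submission(sub: Submission) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    body_pages = sub.total_pages - sub.reference_pages
    if body_pages > MAX_BODY_PAGES:
        problems.append(
            f"body is {body_pages} pages; limit is {MAX_BODY_PAGES} excluding references"
        )
    if FIRST_PERSON_CITATION.search(sub.text):
        problems.append(
            "first-person reference to earlier work may reveal author identity"
        )
    return problems


ok = Submission(total_pages=10, reference_pages=2,
                text="Smith et al. [1] showed that the method converges.")
bad = Submission(total_pages=11, reference_pages=2,
                 text="In our previous work [1], we showed that the method converges.")
```

Here `check_submission(ok)` returns an empty list, while `check_submission(bad)` reports both a page-limit problem and a possible identity leak. A pattern match is only a heuristic, so a real check would still need a human read.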

Implications and Future Directions

These guidelines have significant implications for authors submitting to ICCV. By establishing clear formatting and submission standards, the paper aids in maintaining a high-quality and consistent conference output. The detailed instructions ensure that the focus remains on the technical content of submissions, reducing the potential for formatting-related rejections.

For reviewers, adherence to these guidelines makes peer review more efficient by removing distractions caused by formatting discrepancies. For readers, consistent formatting improves the accessibility and readability of papers in both electronic and print formats.

Future developments could involve the evolution of these guidelines to include more interactive multimedia elements as digital dissemination methods advance. Additionally, the integration of automated tools to assist authors in complying with these standards may enhance the submission process, allowing authors to focus more on research quality rather than formatting details.

In summary, the ICCV LaTeX Author Guidelines paper offers an exhaustive resource for researchers aiming to submit their work to the conference. Its precise instructions are crucial in ensuring the clarity, consistency, and professional presentation of conference proceedings.
