With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations

Published 29 Apr 2021 in cs.CV | (2104.14548v2)

Abstract: Self-supervised learning algorithms based on instance discrimination train encoders to be invariant to pre-defined transformations of the same instance. While most methods treat different views of the same image as positives for a contrastive loss, we are interested in using positives from other instances in the dataset. Our method, Nearest-Neighbor Contrastive Learning of visual Representations (NNCLR), samples the nearest neighbors from the dataset in the latent space, and treats them as positives. This provides more semantic variations than pre-defined transformations. We find that using the nearest-neighbor as positive in contrastive losses improves performance significantly on ImageNet classification, from 71.7% to 75.6%, outperforming previous state-of-the-art methods. On semi-supervised learning benchmarks we improve performance significantly when only 1% ImageNet labels are available, from 53.8% to 56.5%. On transfer learning benchmarks our method outperforms state-of-the-art methods (including supervised learning with ImageNet) on 8 out of 12 downstream datasets. Furthermore, we demonstrate empirically that our method is less reliant on complex data augmentations. We see a relative reduction of only 2.1% ImageNet Top-1 accuracy when we train using only random crops.


Authors (5)

Citations (417)


Summary

The paper proposes NNCLR, a contrastive framework that uses nearest neighbors in the latent space as positives instead of relying only on augmented views of the same image.
Because the positives come from other instances in the dataset, they provide more semantic variation than pre-defined transformations.
Experiments report ImageNet top-1 accuracy improving from 71.7% to 75.6%, and accuracy with only 1% of ImageNet labels improving from 53.8% to 56.5%.
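
The nearest-neighbor positive described in the abstract can be illustrated with a small sketch. The code below is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the toy support set, and the temperature value of 0.1 are hypothetical choices, and details such as the encoder, how the support set is maintained, and any symmetrization of the loss are left out. Each embedding from the first view is swapped for its most similar entry in a support set, and that neighbor is then contrasted against the second-view embeddings of the whole batch with an InfoNCE-style loss.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nearest_neighbor(z, support_set):
    """Return the support-set embedding with the highest similarity to z."""
    return max(support_set, key=lambda q: dot(z, q))


def nn_contrastive_loss(view1, view2, support_set, temperature=0.1):
    """InfoNCE-style loss where each view1 embedding is replaced by its
    nearest neighbor from the support set before being contrasted against
    the view2 embeddings. The temperature is an assumed default."""
    z1 = [l2_normalize(v) for v in view1]
    z2 = [l2_normalize(v) for v in view2]
    support = [l2_normalize(q) for q in support_set]

    losses = []
    for i, z in enumerate(z1):
        neighbor = nearest_neighbor(z, support)
        # Similarity of the neighbor to every second-view embedding in the batch.
        logits = [dot(neighbor, other) / temperature for other in z2]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # The matching second view (index i) is the positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy usage: two "images", two views each, and a three-entry support set.
view1 = [[1.0, 0.0], [0.0, 1.0]]
view2 = [[0.9, 0.1], [0.1, 0.9]]
support = [[1.0, 0.1], [0.1, 1.0], [-1.0, 0.0]]
loss = nn_contrastive_loss(view1, view2, support)
```

In this toy setup the loss is small when each neighbor lines up with the matching second view, and it grows if the second views are shuffled so that the positives no longer correspond.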

Overview of LaTeX Submission Guidelines for ICCV Proceedings

This paper outlines the author guidelines for preparing manuscripts for submission to the International Conference on Computer Vision (ICCV) using LaTeX. Aimed at authors who intend to present their work at ICCV, it provides a typesetting framework that keeps manuscript presentation consistent with the conference's standards. The sections below summarize its main points and discuss their implications for submissions.

Structural Components and Formatting

The document gives detailed instructions on manuscript structure, calling for a two-column layout with specific dimensions for margins and text areas. All text must be set in Times New Roman or its closest alternative, with font sizes that vary by section to keep the page visually consistent. Titles, author names and affiliations, abstracts, and body text each have precise formatting requirements covering spacing, capitalization, and alignment, all intended to improve readability and consistency across submissions.

Specific Guidelines and Considerations

Paper Length and Composition

The guidelines limit papers to eight pages excluding references and state that non-compliant papers will be excluded from review. The limit is therefore mandatory rather than advisory, and authors must tighten their content to fit within it. The absence of excess page charges underlines that the limit is enforced for editorial rigor, not as a financial consideration.

The document elucidates the requirements for maintaining anonymity during the peer review process. Authors are reminded to use third-person language when citing their previous work to obscure their identity while enabling constructive review dialogues. This standard ensures the integrity of the blind review mechanism, allowing fair assessment of the submissions based on merit rather than prior recognition of the authors.

Implications and Challenges

These guidelines collectively ensure submissions meet technical and visual standards reflective of the ICCV's prominence. While the mandates for formatting might appear overly strict, they facilitate a uniform appearance and readability, thus aiding reviewers and attendees in navigating diverse presentations efficiently. Moreover, adhering to stringent guidelines signifies professionalism and reliability, attributes valued in scientific communication.

For authors, these standards require meticulous attention to detail in manuscript preparation, potentially demanding greater collaboration with peers familiar with LaTeX typesetting. For conference organizers, the guidelines reduce potential logistical issues related to publication and presentation, streamlining the conference workflow.

Potential Directions and Advancements

Given how quickly document preparation tools and practices evolve, a future iteration of these guidelines may add recommendations for emerging formats such as dynamic figures and interactive data visualizations. Such additions could improve the presentation and dissemination of research findings and help ICCV keep its leading position in accommodating new forms of scientific communication.

In conclusion, this document is a comprehensive guide establishing essential conventions for ICCV submissions, balancing traditional standards with contemporary scholarly presentation practices. Its meticulous design facilitates both authors’ compliance and systematic, effective review and dissemination of visual computing advancements.
