Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 56 tok/s
Gemini 2.5 Pro 39 tok/s Pro
GPT-5 Medium 15 tok/s Pro
GPT-5 High 16 tok/s Pro
GPT-4o 99 tok/s Pro
Kimi K2 155 tok/s Pro
GPT OSS 120B 476 tok/s Pro
Claude Sonnet 4 38 tok/s Pro
2000 character limit reached

HCNet: Hierarchical Context Network for Semantic Segmentation (2010.04962v2)

Published 10 Oct 2020 in cs.CV

Abstract: Global context information is vital in visual understanding problems, especially in pixel-level semantic segmentation. The mainstream methods adopt the self-attention mechanism to model global context information. However, pixels belonging to different classes usually have weak feature correlation. Modeling the global pixel-level correlation matrix indiscriminately is extremely redundant in the self-attention mechanism. In order to solve the above problem, we propose a hierarchical context network to differentially model homogeneous pixels with strong correlations and heterogeneous pixels with weak correlations. Specifically, we first propose a multi-scale guided pre-segmentation module to divide the entire feature map into different classed-based homogeneous regions. Within each homogeneous region, we design the pixel context module to capture pixel-level correlations. Subsequently, different from the self-attention mechanism that still models weak heterogeneous correlations in a dense pixel-level manner, the region context module is proposed to model sparse region-level dependencies using a unified representation of each region. Through aggregating fine-grained pixel context features and coarse-grained region context features, our proposed network can not only hierarchically model global context information but also harvest multi-granularity representations to more robustly identify multi-scale objects. We evaluate our approach on Cityscapes and the ISPRS Vaihingen dataset. Without Bells or Whistles, our approach realizes a mean IoU of 82.8% and overall accuracy of 91.4% on Cityscapes and ISPRS Vaihingen test set, achieving state-of-the-art results.

Citations (4)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.