A comprehensive framework for occluded human pose estimation (2401.00155v2)

Published 30 Dec 2023 in cs.CV

Abstract: Occlusion presents a significant challenge in human pose estimation. The challenges posed by occlusion can be attributed to the following factors: 1) Data: The collection and annotation of occluded human pose samples are relatively challenging. 2) Feature: Occlusion can cause feature confusion due to the high similarity between the target person and interfering individuals. 3) Inference: Robust inference becomes challenging due to the loss of complete body structural information. The existing methods designed for occluded human pose estimation usually focus on addressing only one of these factors. In this paper, we propose a comprehensive framework DAG (Data, Attention, Graph) to address the performance degradation caused by occlusion. Specifically, we introduce the mask joints with instance paste data augmentation technique to simulate occlusion scenarios. Additionally, an Adaptive Discriminative Attention Module (ADAM) is proposed to effectively enhance the features of target individuals. Furthermore, we present the Feature-Guided Multi-Hop GCN (FGMP-GCN) to fully explore the prior knowledge of body structure and improve pose estimation results. Through extensive experiments conducted on three benchmark datasets for occluded human pose estimation, we demonstrate that the proposed method outperforms existing methods. Code and data will be publicly available.
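The "multi-hop" idea named in FGMP-GCN can be illustrated with a small sketch. This is not the paper's implementation: the abstract gives no architectural details, so the example below only shows the general notion of grouping skeleton joints by hop distance, which multi-hop graph convolutions typically build on. The five-joint skeleton, the joint names, and the helper functions `hop_distances` and `k_hop_adjacency` are all hypothetical and chosen for illustration.

```python
from collections import deque

# Hypothetical toy skeleton (an assumption, not the joint set used in the paper).
JOINTS = ["shoulder", "elbow", "wrist", "hip", "knee"]
EDGES = [(0, 1), (1, 2), (0, 3), (3, 4)]  # undirected bone connections


def hop_distances(num_joints, edges):
    """Return a matrix of shortest hop counts between joints, via BFS from each joint."""
    neighbors = [[] for _ in range(num_joints)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    dist = [[-1] * num_joints for _ in range(num_joints)]  # -1 marks "unreachable"
    for src in range(num_joints):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if dist[src][nxt] == -1:
                    dist[src][nxt] = dist[src][cur] + 1
                    queue.append(nxt)
    return dist


def k_hop_adjacency(dist, k):
    """Binary matrix whose entry (i, j) is 1 when joint j is exactly k hops from joint i."""
    n = len(dist)
    return [[1 if dist[i][j] == k else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    dist = hop_distances(len(JOINTS), EDGES)
    two_hop = k_hop_adjacency(dist, 2)
    for name, row in zip(JOINTS, two_hop):
        reachable = [JOINTS[j] for j, flag in enumerate(row) if flag]
        print(f"{name}: 2-hop neighbors = {reachable}")
```

In a multi-hop GCN, each k-hop adjacency matrix would usually be paired with its own learnable weights, so that an occluded joint can draw on evidence from visible joints further along the skeleton. How the paper's feature guidance modulates this aggregation is not stated in the abstract and is left out of the sketch.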
