A Fixed-Point Approach to Unified Prompt-Based Counting (2403.10236v1)
Abstract: Existing class-agnostic counting models typically rely on a single type of prompt, e.g., box annotations. This paper aims to establish a comprehensive prompt-based counting framework capable of generating density maps for the objects of interest indicated by various prompt types, such as box, point, and text. To achieve this goal, we begin by converting prompts from different modalities into prompt masks without requiring training. These masks are then integrated into a class-agnostic counting methodology for predicting density maps. Furthermore, we introduce a fixed-point inference along with an associated loss function to improve counting accuracy, all without introducing new parameters. The effectiveness of this method is substantiated both theoretically and experimentally. Additionally, a contrastive training scheme is implemented to mitigate the dataset bias inherent in current class-agnostic counting datasets, a strategy whose effectiveness is confirmed by our ablation study. Our model performs strongly on prominent class-agnostic counting datasets and exhibits superior performance in cross-dataset adaptation tasks.
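The abstract does not spell out the fixed-point inference, so the following is only a generic sketch of the underlying idea: repeatedly applying a map until the output stops changing. It is not the paper's model. The function name `fixed_point`, the parameters `tol` and `max_iter`, and the scalar example map `math.cos` are all assumptions made for illustration; in the paper's setting the iterate would presumably be a density map rather than a single number.

```python
import math


def fixed_point(f, x0, tol=1e-10, max_iter=1000):
    """Iterate x <- f(x) until two successive iterates differ by less than tol.

    Hypothetical helper for illustration; returns the final iterate and the
    number of applications of f that were needed.
    """
    x = x0
    for i in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, i
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


# cos is a contraction near its fixed point, so the iteration converges
# to the unique x* satisfying x* = cos(x*).
x_star, n_iter = fixed_point(math.cos, 1.0)
print(x_star, n_iter)
```

The iteration reuses the same function at every step, which is consistent with the abstract's remark that the inference scheme adds no new parameters.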
- Wei Lin
- Antoni B. Chan