
Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images (2311.01064v1)

Published 2 Nov 2023 in cs.CV and cs.LG

Abstract: Due to deteriorating environmental conditions and increasing human activity, conservation efforts directed towards wildlife are crucial. Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe. Supervised learning techniques have been successfully deployed to analyze such imagery; however, training such techniques requires annotations from experts. Reducing the reliance on costly labelled data therefore has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor. In this work we propose WildMatch, a novel zero-shot species classification framework that leverages multimodal foundation models. In particular, we instruction tune vision-LLMs to generate detailed visual descriptions of camera trap images using similar terminology to experts. Then, we match the generated caption to an external knowledge base of descriptions in order to determine the species in a zero-shot manner. We investigate techniques to build instruction tuning datasets for detailed animal description generation and propose a novel knowledge augmentation technique to enhance caption quality. We demonstrate the performance of WildMatch on a new camera trap dataset collected in the Magdalena Medio region of Colombia.
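
The caption-to-knowledge-base matching step described in the abstract can be illustrated with a minimal sketch. It is not the WildMatch implementation: the abstract does not say how matching is performed, so a simple bag-of-words cosine similarity is swapped in here as a stand-in. The function names, the `TOY_KB` dictionary, and the species descriptions are all hypothetical and written only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_species(caption, knowledge_base):
    """Return the best-matching species and the score of every candidate.

    ASSUMPTION: bag-of-words cosine similarity is a placeholder for
    whatever matching procedure the paper actually uses.
    """
    caption_vec = Counter(tokenize(caption))
    scores = {
        species: cosine(caption_vec, Counter(tokenize(description)))
        for species, description in knowledge_base.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical knowledge base: invented descriptions, not taken from the paper.
TOY_KB = {
    "jaguar": "large cat with a tan coat covered in dark rosettes and a long tail",
    "capybara": "large rodent with a barrel shaped body, brown fur and no visible tail",
}

if __name__ == "__main__":
    # Hypothetical caption standing in for the output of a vision-language model.
    caption = "a large cat with dark rosettes on a tan coat"
    best, scores = match_species(caption, TOY_KB)
    print(best, scores)
```

No species-specific training happens at the matching stage in this sketch: adding a new species only requires adding a new description to the knowledge base, which is the sense in which such a pipeline can operate zero-shot.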

Authors (13)
  1. Zalan Fabian (14 papers)
  2. Zhongqi Miao (8 papers)
  3. Chunyuan Li (122 papers)
  4. Yuanhan Zhang (29 papers)
  5. Ziwei Liu (368 papers)
  6. Andrés Hernández (2 papers)
  7. Andrés Montes-Rojas (1 paper)
  8. Rafael Escucha (1 paper)
  9. Laura Siabatto (1 paper)
  10. Andrés Link (1 paper)
  11. Rahul Dodhia (33 papers)
  12. Juan Lavista Ferres (20 papers)
  13. Pablo Arbeláez (40 papers)
Citations (7)
