OVTrack: Open-Vocabulary Multiple Object Tracking (2304.08408v1)

Published 17 Apr 2023 in cs.CV

Abstract: The ability to recognize, localize and track dynamic objects in a scene is fundamental to many real-world applications, such as self-driving and robotic systems. Yet, traditional multiple object tracking (MOT) benchmarks rely only on a few object categories that hardly represent the multitude of possible objects that are encountered in the real world. This leaves contemporary MOT methods limited to a small set of pre-defined object categories. In this paper, we address this limitation by tackling a novel task, open-vocabulary MOT, that aims to evaluate tracking beyond pre-defined training categories. We further develop OVTrack, an open-vocabulary tracker that is capable of tracking arbitrary object classes. Its design is based on two key ingredients: first, leveraging vision-language models for both classification and association via knowledge distillation; second, a data hallucination strategy for robust appearance feature learning from denoising diffusion probabilistic models. The result is an extremely data-efficient open-vocabulary tracker that sets a new state-of-the-art on the large-scale, large-vocabulary TAO benchmark, while being trained solely on static images. Project page: https://www.vis.xyz/pub/ovtrack/

Authors (6)
  1. Siyuan Li (140 papers)
  2. Tobias Fischer (125 papers)
  3. Lei Ke (31 papers)
  4. Henghui Ding (87 papers)
  5. Martin Danelljan (96 papers)
  6. Fisher Yu (104 papers)
Citations (36)
