- The paper presents a Bayesian multi-view filter that integrates dynamic and measurement models to enable efficient real-time 3D track initiation, termination, and re-identification.
- The approach uses an adaptive birth model and occlusion handling based on bounding box overlaps to manage re-identification and maintain track continuity in cluttered environments.
- Experimental results on benchmark datasets show significant improvements in tracking accuracy and resilience against camera reconfigurations.
Track Initialization and Re-Identification for 3D Multi-View Multi-Object Tracking
This paper addresses the challenges of 3D multi-object tracking (MOT) using 2D detections from monocular cameras to automatically initiate, terminate, and re-identify tracks while handling occlusions. The proposed solution integrates multi-object dynamic and measurement models into a Bayesian filtering framework for practical real-world tracking applications.
Proposed Solution Overview
The authors propose a 3D multi-view MOT (MV-MOT) solution that follows the track-by-detection paradigm, using 2D detections from monocular cameras to track objects in 3D. At its core is a Bayesian multi-object framework that performs automatic track initiation and termination, track re-identification, occlusion handling, and data association in a single Bayes filtering recursion. Implementing such a filter exactly is computationally intractable because of its exponential complexity, so the authors present an approximation suitable for online MOT: incorporating object features and kinematics into the measurement model reduces the number of terms that must be computed.
Figure 1: Schematic of the proposed 3D MV-MOT solution. Multi-view detections (bounding boxes and visual features from all cameras) are supplied to the MV-MOT filter, integrating multi-object dynamic and measurement models to realize all MOT functionalities.
Bayesian Multi-View MOT Filter
The proposed filter combines geometric projections with adaptive models to handle occlusions and track re-identification. The multi-view Bayesian tracking framework employs numerical approximations of the GLMB filter so that state estimation and track management are feasible in practice. The filter's complexity is linear in the total number of detections across cameras, which permits efficient online operation, and cameras can be reconfigured without retraining the detector.
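The geometric projection step relates a 3D object position to each camera's image plane. The following is a minimal sketch under a standard pinhole camera model, not the paper's implementation: the function name `project_point` and the 3x4 matrix layout are assumptions made for illustration, and the paper's filter projects ellipsoidal object extents (see Figure 4), which this sketch does not attempt.

```python
from typing import Sequence, Tuple


def project_point(P: Sequence[Sequence[float]],
                  X: Tuple[float, float, float]) -> Tuple[float, float]:
    """Project a 3D world point X onto the image plane of one camera.

    P is an assumed 3x4 pinhole projection matrix (intrinsics times
    extrinsics). Returns pixel coordinates (u, v).
    """
    x, y, z = X
    # Homogeneous image coordinates: h = P @ [x, y, z, 1]
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:
        raise ValueError("point is behind the camera or on its principal plane")
    return (h[0] / h[2], h[1] / h[2])


# Hypothetical camera: focal length 1000 px, principal point (640, 360),
# located at the world origin and looking along +z.
P_example = [
    [1000.0, 0.0, 640.0, 0.0],
    [0.0, 1000.0, 360.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
u, v = project_point(P_example, (1.0, 0.5, 10.0))
```

Because each camera is described only by its own projection matrix, a reconfigured camera changes `P` but not the detector, which is consistent with the claim that no detector retraining is needed.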
Implementation and Adaptive Models
Occlusion Handling
The occlusion model accounts for partial and complete occlusions by evaluating the overlap of bounding boxes on each camera's image plane. Each object's detection probability is adjusted according to its occlusion score, which improves tracking in cluttered environments.
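One way to picture the overlap-based adjustment is the sketch below. It is an illustration under stated assumptions rather than the paper's model: the occlusion score is taken to be the largest fraction of an object's box covered by any object closer to the camera, and the mapping from score to detection probability is assumed to be linear between two hypothetical bounds `p_max` and `p_min`. The function names are also invented for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def overlap_fraction(target: Box, occluder: Box) -> float:
    """Fraction of the target box area that the occluder box covers."""
    x0 = max(target[0], occluder[0])
    y0 = max(target[1], occluder[1])
    x1 = min(target[2], occluder[2])
    y1 = min(target[3], occluder[3])
    intersection = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    area = (target[2] - target[0]) * (target[3] - target[1])
    return intersection / area if area > 0 else 0.0


def detection_probabilities(boxes: Sequence[Box],
                            depths: Sequence[float],
                            p_max: float = 0.95,
                            p_min: float = 0.05) -> List[float]:
    """Per-object detection probability in one camera view.

    boxes[i] is the projected bounding box of object i and depths[i] its
    distance from the camera. Only objects closer to the camera can occlude.
    """
    probs = []
    for i, box in enumerate(boxes):
        score = 0.0  # assumed occlusion score in [0, 1]
        for j, other in enumerate(boxes):
            if j != i and depths[j] < depths[i]:
                score = max(score, overlap_fraction(box, other))
        # Assumed linear mapping: unoccluded -> p_max, fully covered -> p_min
        probs.append(p_max - (p_max - p_min) * score)
    return probs


boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
depths = [2.0, 5.0, 3.0]
probs = detection_probabilities(boxes, depths)
```

In this example the second object is half covered by the first, which is closer to the camera, so its detection probability falls midway between the two bounds; the other two objects keep `p_max`.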
Figure 2: Schematic of the proposed multi-view MOT filter, showing the integration of Adaptive Birth Model and Occlusion Model for realizing MOT functionalities.
Track Initialization and Re-Identification
Track initialization and re-identification are achieved through an adaptive birth model that generates labels for newly appearing or reappearing objects by clustering sensor data. This approach not only initiates new tracks but also restores terminated ones based on visual feature similarity.
Figure 3: Illustration of detection probability differences correlating with track overlap and distance from the camera.
Adaptive Birth Model Parameters
The adaptive birth model is statistical, and the filter estimates and updates its parameters online. By comparing feature vectors for similarity, it initializes new tracks accurately and recalls tentatively terminated tracks.
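The label decision described above can be sketched as follows. This is a simplified illustration, not the paper's birth model: cosine similarity, the fixed `threshold`, and the names `assign_birth_label` and `terminated` are all assumptions chosen for the example, and the clustering of multi-view detections that precedes this step is omitted.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def assign_birth_label(feature: Sequence[float],
                       terminated: Dict[int, Sequence[float]],
                       next_label: int,
                       threshold: float = 0.8) -> Tuple[int, bool]:
    """Choose a label for a birth candidate.

    terminated maps labels of tentatively terminated tracks to their stored
    visual features. Returns (label, is_reidentified).
    """
    best_label, best_sim = None, threshold
    for label, stored in terminated.items():
        sim = cosine_similarity(feature, stored)
        if sim >= best_sim:
            best_label, best_sim = label, sim
    if best_label is not None:
        return best_label, True   # reappearing object: restore the old label
    return next_label, False      # no match: start a new track


terminated = {3: [1.0, 0.0, 0.0], 7: [0.0, 1.0, 0.0]}
reid = assign_birth_label([0.9, 0.1, 0.0], terminated, next_label=8)
fresh = assign_birth_label([0.0, 0.0, 1.0], terminated, next_label=8)
```

Here the first candidate closely matches the stored feature of track 3 and inherits that label, while the second matches nothing and receives the new label 8.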
Experimental Results
The proposed MV-MOT filter was evaluated on datasets such as WILDTRACK and Curtin multi-camera (CMC) to demonstrate its robustness in various tracking scenarios. Results indicated substantial accuracy improvements and resilience to camera reconfigurations compared to existing solutions.

Figure 4: 3D ellipsoid estimates from the proposed MV-GLMB-AB filter utilizing CSTrack detection inputs, with projections on respective camera planes.
Impact and Future Directions
This research advances the capability of current MOT systems, particularly for deploying real-time monitoring systems without extensive computational resources. Resolving the reappearance of previously tracked objects within the filter itself helps in complex tracking scenarios. Future developments may focus on enhancing feature extraction from monocular inputs, improving computational efficiency through better optimization algorithms, and incorporating other sensor modalities.
Conclusion
The paper presents an approach to real-time 3D MV-MOT that integrates multi-object dynamic and measurement models into a single filter and relies on approximation to keep computation tractable. The work opens avenues for practical object tracking applications that must sustain robust performance in dynamic environments with frequent sensor configuration changes.