Automated Behavioral Analysis Using Instance Segmentation (2312.07723v1)
Abstract: Animal behavior analysis plays a crucial role in fields such as life science and biomedical research. However, available data are scarce, and obtaining large labeled datasets is costly. In this research, we propose a novel approach that uses instance segmentation-based transfer learning to address these issues. By fine-tuning the classification head of the instance segmentation network, we enable the tracking of multiple animals and facilitate behavior analysis in laboratory-recorded videos. To demonstrate the effectiveness of our method, we conducted a series of experiments, which show that our approach achieves performance comparable to human capabilities across a diverse range of animal behavior analysis tasks. The method is also practical, as it requires only a small number of labeled images for training. To facilitate adoption and further development, we have developed an open-source implementation named Annolid (An annotation and instance segmentation-based multiple animal tracking and behavior analysis package). The codebase is publicly available on GitHub at https://github.com/cplab/annolid. It is intended as a resource for researchers and practitioners interested in advancing animal behavior analysis with state-of-the-art techniques.
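The abstract does not describe how per-frame instance detections are linked into tracks. As an illustration only, the sketch below assumes each frame yields one bounding box per detected animal and carries identities forward by greedy intersection-over-union (IoU) matching. The names `iou`, `link_instances`, and `min_iou` are hypothetical and are not taken from Annolid, whose actual tracking logic may differ.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_instances(
    prev: Dict[int, Box], detections: List[Box], min_iou: float = 0.3
) -> Dict[int, Box]:
    """Greedily carry track IDs from the previous frame to new detections.

    Assumption: min_iou = 0.3 is an arbitrary illustrative threshold.
    """
    # Score every (track, detection) pair and visit the best overlaps first.
    pairs = sorted(
        (
            (iou(box, det), tid, j)
            for tid, box in prev.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    assigned: Dict[int, Box] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same animal
        if tid in assigned or j in used:
            continue
        assigned[tid] = detections[j]
        used.add(j)
    # Detections that matched no existing track start new tracks.
    next_id = max(prev, default=-1) + 1
    for j, det in enumerate(detections):
        if j not in used:
            assigned[next_id] = det
            next_id += 1
    return assigned
```

Calling `link_instances({0: (10, 10, 50, 50), 1: (100, 100, 140, 140)}, [(102, 98, 142, 138), (12, 11, 52, 51)])` keeps ID 0 on the box near the top-left and ID 1 on the other, regardless of detection order. Greedy matching is the simplest choice here; it can mislabel animals that overlap heavily, which is one reason a real tracker may combine overlap with appearance or motion cues.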