An event-based implementation of saliency-based visual attention for rapid scene analysis (2401.05030v1)
Abstract: Selective attention is an essential mechanism that filters sensory input and selects only its most important components, allowing the capacity-limited cognitive structures of the brain to process them in detail. The saliency map model, originally developed to understand the process of selective attention in the primate visual system, has also been used extensively in computer vision. Because frame-based video is the dominant input modality, dynamic input from non-stationary scenes is commonly supplied to saliency models as a sequence of frames. However, the temporal structure of this input is very different from that of the primate visual system. Retinal input to the brain is massively parallel, local rather than frame-based, asynchronous rather than synchronous, and transmitted in the form of discrete events: neuronal action potentials (spikes). These features are captured by event-based cameras. We show that a computational saliency model can be obtained organically from such vision sensors, at minimal computational cost. We assess the performance of the model by comparing its predictions with the distribution of overt attention (fixations) of human observers, and we make available an event-based dataset that can be used as ground truth for future studies.
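To make the abstract's claim concrete, the sketch below shows one minimal way a saliency map could be accumulated directly from an asynchronous event stream and then compared against human fixations. This is an illustrative reading of the idea, not the paper's actual model: it assumes events arrive as (x, y, t, polarity) tuples, approximates saliency by an exponentially decaying per-pixel event count, and uses Pearson correlation as the fixation-comparison metric; the resolution, decay constant, and per-event full-map decay are simplifying assumptions chosen for clarity.

```python
import numpy as np

# Illustrative sensor resolution and decay constant (not taken from the paper).
WIDTH, HEIGHT = 304, 240   # e.g., an ATIS-like event-camera resolution
TAU = 0.1                  # decay time constant in seconds


class EventSaliency:
    """Leaky accumulation of events into a per-pixel saliency map."""

    def __init__(self, width=WIDTH, height=HEIGHT, tau=TAU):
        self.map = np.zeros((height, width), dtype=np.float64)
        self.tau = tau
        self.last_t = 0.0

    def update(self, x, y, t, polarity):
        # Decay the whole map by the time elapsed since the last event,
        # then add the new event's contribution at its pixel.
        # (A real implementation would decay lazily or per pixel for speed.)
        self.map *= np.exp(-(t - self.last_t) / self.tau)
        self.map[y, x] += 1.0
        self.last_t = t

    def saliency(self):
        # Normalize to a spatial probability density over the sensor array.
        total = self.map.sum()
        return self.map / total if total > 0 else self.map


def pearson_cc(saliency_map, fixation_density):
    """Pearson correlation between a model map and a human fixation density map."""
    s = saliency_map.ravel() - saliency_map.mean()
    f = fixation_density.ravel() - fixation_density.mean()
    return float(np.dot(s, f) / (np.linalg.norm(s) * np.linalg.norm(f) + 1e-12))


# Usage: feed events in temporal order, then compare against a fixation map.
model = EventSaliency()
for (x, y, t, p) in [(10, 20, 0.001, 1), (11, 20, 0.002, -1), (200, 100, 0.004, 1)]:
    model.update(x, y, t, p)
fixations = np.random.rand(HEIGHT, WIDTH)  # placeholder for a measured fixation density
print(pearson_cc(model.saliency(), fixations))
```

The key design point this sketch is meant to convey is that no frames are ever formed: each event updates the map locally and asynchronously, which is what keeps the computational cost minimal.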