Emergent Mind

Retina: Low-Power Eye Tracking with Event Camera and Spiking Hardware

(2312.00425)
Published Dec 1, 2023 in cs.CV and cs.NE

Abstract

This paper introduces a neuromorphic methodology for eye tracking, harnessing pure event data captured by a Dynamic Vision Sensor (DVS) camera. The framework integrates a directly trained Spiking Neuron Network (SNN) regression model and leverages a state-of-the-art low-power edge neuromorphic processor, Speck, collectively aiming to advance the precision and efficiency of eye-tracking systems. First, we introduce a representative event-based eye-tracking dataset, "Ini-30", which was collected with two glass-mounted DVS cameras from thirty volunteers. Then, an SNN model, based on Integrate And Fire (IAF) neurons, named "Retina", is described, featuring only 64k parameters (6.63x fewer than the latest) and achieving a pupil tracking error of only 3.24 pixels on a 64x64 DVS input. The continuous regression output is obtained by means of convolution using a non-spiking temporal 1D filter slid across the output spiking layer. Finally, we evaluate Retina on the neuromorphic processor, showing an end-to-end power of 2.89-4.8 mW and a latency of 5.57-8.01 ms, dependent on the time window. We also benchmark our model against the latest event-based eye-tracking method, "3ET", which was built upon event frames. Results show that Retina achieves superior precision with 1.24px less pupil centroid error and reduced computational complexity with 35 times fewer MAC operations. We hope this work will open avenues for further investigation of closed-loop neuromorphic solutions and true event-based training pursuing edge performance.

Overview

  • The paper introduces a novel low-power eye-tracking methodology using Dynamic Vision Sensors (DVS) and neuromorphic hardware, focusing on efficient data processing and reduced computational complexity.

  • A new event-based eye-tracking dataset called 'Ini-30' was collected from 30 volunteers; its recordings are sliced by event count rather than by fixed time windows, which accommodates variable event rates.

  • The 'Retina' model, deployed on the Synsense Speck neuromorphic processor, reaches a pupil tracking error of 3.24 pixels at an end-to-end power of 2.89-4.8 mW, paving the way for embedded and real-time applications.

Retina: Low-Power Eye Tracking with Event Camera and Spiking Hardware

The paper "Retina: Low-Power Eye Tracking with Event Camera and Spiking Hardware" introduces an eye-tracking methodology that processes event data captured by Dynamic Vision Sensors (DVS) on neuromorphic hardware. The approach uses a Spiking Neuron Network (SNN) regression model designed for low power consumption and reduced computational complexity. The paper makes three contributions: an event-based eye-tracking dataset called "Ini-30," a model named "Retina" designed for event data processing, and the deployment of this model on low-power neuromorphic hardware, the Synsense Speck.

Dataset: Ini-30

The "Ini-30" dataset is an event-based eye-tracking dataset collected with two DVS cameras mounted on a glasses frame. Recordings were collected from 30 volunteers and vary in event count and duration. Because the data is event-based, it carries temporal variability that is absent from the synthetic datasets and frame-based approaches predominantly found in prior work. Notably, data slicing is based on event counts rather than fixed time intervals, which contributes to domain adaptation between recordings and improves tracking precision by accounting for variable event flows.
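The event-count slicing described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `(t, x, y, polarity)` column layout, the function name `slice_by_event_count`, and the slice size of 300 events are all assumptions made for the example.

```python
import numpy as np

def slice_by_event_count(events, events_per_slice):
    """Split an event stream into consecutive slices of equal event count.

    `events` is an (N, 4) array with assumed columns (t, x, y, polarity).
    Trailing events that do not fill a whole slice are dropped.
    """
    n_slices = len(events) // events_per_slice
    usable = events[: n_slices * events_per_slice]
    return usable.reshape(n_slices, events_per_slice, events.shape[1])

# Synthetic stand-in for a 64x64 DVS recording (not Ini-30 data).
rng = np.random.default_rng(0)
n_events = 1050
events = np.column_stack([
    np.sort(rng.uniform(0, 1e6, n_events)),  # timestamps in microseconds
    rng.integers(0, 64, n_events),           # x coordinate
    rng.integers(0, 64, n_events),           # y coordinate
    rng.integers(0, 2, n_events),            # polarity (0 = OFF, 1 = ON)
])

slices = slice_by_event_count(events, events_per_slice=300)
# Every slice holds the same number of events, but spans a different duration.
durations_us = slices[:, -1, 0] - slices[:, 0, 0]
print(slices.shape)
print(durations_us)
```

With fixed-count slices, a fast eye movement (many events per second) yields short slices and a slow one yields long slices, so each slice carries a comparable amount of information regardless of the event rate.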

Model: Retina

The "Retina" model is a compact SNN based on Integrate-and-Fire (IAF) neurons. It contains only 64k parameters while achieving a pupil tracking error of 3.24 pixels on 64x64 DVS input. Its distinguishing feature is the temporal regression mechanism: a non-spiking temporal weighted-sum filter is slid across the output spiking layer and converts the spike trains into a continuous output. The architecture also avoids recurrent neurons and voltage decay factors, which simplifies deployment on neuromorphic hardware that does not support such features.
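The two ingredients named above, a leak-free IAF neuron and a non-spiking temporal filter applied to its spikes, can be illustrated with a small NumPy sketch. It is not the Retina architecture: the threshold of 1.0, the reset-by-subtraction rule, and the uniform 4-tap kernel are assumptions chosen so the behavior is easy to follow (in the paper the filter is part of the trained model).

```python
import numpy as np

def iaf_layer(input_current, threshold=1.0):
    """Integrate-and-fire without leak: v accumulates input and spikes at threshold.

    input_current: (T, N) array. Returns a (T, N) array of 0/1 spikes.
    Reset-by-subtraction is an assumption of this sketch.
    """
    v = np.zeros(input_current.shape[1])
    spikes = np.zeros_like(input_current)
    for t, current in enumerate(input_current):
        v = v + current                      # integrate, no voltage decay term
        fired = v >= threshold
        spikes[t] = fired
        v = np.where(fired, v - threshold, v)
    return spikes

def temporal_readout(spikes, kernel):
    """Slide a non-spiking 1D filter along time, independently per output neuron."""
    return np.stack(
        [np.convolve(spikes[:, n], kernel, mode="valid")
         for n in range(spikes.shape[1])],
        axis=1,
    )

T = 16
# Two output neurons driven by constant inputs of 0.25 and 0.5 per time step.
input_current = np.tile(np.array([0.25, 0.5]), (T, 1))
spikes = iaf_layer(input_current)
kernel = np.full(4, 0.25)                    # uniform weights, illustrative only
readout = temporal_readout(spikes, kernel)
print(spikes.sum(axis=0))
print(readout[0])
```

In this toy setup the first neuron fires every fourth step and the second every other step, so the filtered output settles at 0.25 and 0.5: a continuous value is recovered from binary spikes without any recurrent state or decay in the neurons themselves.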

The Retina model outperforms the latest event-based eye-tracking method, "3ET," in the reported benchmark, with 1.24 px less pupil centroid error and 35 times fewer Multiply and Accumulate (MAC) operations. These results suggest a feasible path towards energy-efficient, high-precision eye-tracking systems suitable for embedded applications.
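Pupil tracking error is reported in pixels. One common way to compute such a figure is the mean Euclidean distance between predicted and ground-truth pupil centers; the sketch below assumes that definition, which this summary does not confirm as the paper's exact metric. The function name and the sample coordinates are hypothetical.

```python
import numpy as np

def centroid_error_px(pred, target):
    """Mean Euclidean distance in pixels between (x, y) centers, shape (N, 2)."""
    return float(np.mean(np.linalg.norm(pred - target, axis=1)))

# Hypothetical predictions and labels on a 64x64 sensor grid.
pred = np.array([[30.0, 32.0], [33.0, 36.0]])
target = np.array([[33.0, 36.0], [33.0, 36.0]])
error = centroid_error_px(pred, target)
print(error)  # distances are 5.0 and 0.0, so the mean is 2.5
```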

Deployment on Neuromorphic Hardware: Synsense Speck

Deploying the Retina model on the Synsense Speck neuromorphic processor is a step towards low-power, real-time eye-tracking systems. The reported end-to-end power consumption is between 2.89 and 4.8 mW, and the end-to-end latency ranges from 5.57 to 8.01 ms, depending on the time window. Power in the milliwatt range favors prolonged operation on battery-powered devices and could broaden applicability to real-world scenarios such as portable or wearable eye tracking.
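To put these figures in perspective, a rough energy-per-inference range can be derived as power times latency. This is a back-of-the-envelope sketch, not a number reported by the paper: it assumes the stated power is drawn for the full latency interval, and since the summary does not say which power value goes with which latency, only the lower and upper bounds are computed.

```python
# Reported ranges: end-to-end power in mW and latency in ms.
power_mw = (2.89, 4.8)
latency_ms = (5.57, 8.01)

# mW * ms = microjoules. Bounds hold for any pairing of power and latency.
energy_low_uj = power_mw[0] * latency_ms[0]
energy_high_uj = power_mw[1] * latency_ms[1]
print(f"{energy_low_uj:.1f} to {energy_high_uj:.1f} uJ per inference")
```

Under these assumptions a single inference costs roughly 16 to 38 microjoules.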

Implications and Future Directions

The implications of the presented research are multifaceted. On a practical level, this work paves the way for energy-efficient, high-precision eye-tracking systems that are viable for real-time applications in unconstrained environments. Such advancements are pertinent for applications in augmented reality, human-computer interaction, and other domains requiring robust and efficient gaze tracking. The decrease in computational complexity and power consumption inherently supports the deployment on resource-constrained edge devices.

Theoretically, this research underscores the efficacy of event-based data combined with neuromorphic hardware, showcasing the potential to surpass traditional frame-based approaches both in terms of performance and efficiency. The integration of neuromorphic principles with real-world applications opens promising avenues for future explorations. Key areas for further investigation include adapting the architecture to handle continuous streams without neuron state resets, improving resilience to complex and dynamic lighting conditions, and exploring broader applications beyond eye tracking, such as gesture recognition and complex object tracking.

Conclusion

The paper "Retina: Low-Power Eye Tracking with Event Camera and Spiking Hardware" represents a noteworthy advancement in the field of event-based eye-tracking. By introducing the "Ini-30" dataset, developing the "Retina" model, and deploying it on the Synsense Speck neuromorphic processor, this research demonstrates a significant reduction in power consumption and computational complexity while maintaining high precision. Its implications extend to practical applications requiring low-power, high-speed eye tracking in real-world settings and provide a foundation for future advancements in neuromorphic computing and event-based data processing.
