Emergent Mind

Abstract

To efficiently process visual data at scale, researchers have proposed two techniques for lowering the computational overhead associated with the underlying deep learning models. The first approach consists of leveraging a specialized, lightweight model to directly answer the query. The second approach focuses on filtering irrelevant frames using a lightweight model and processing the filtered frames using a heavyweight model. These techniques suffer from two limitations. With the first approach, the specialized model is unable to provide accurate results for hard-to-detect events. With the second approach, the system is unable to accelerate queries focusing on frequently occurring events as the filter is unable to eliminate a significant fraction of frames in the video. In this paper, we present THIA, a video analytics system for tackling these limitations. The design of THIA is centered around three techniques. First, instead of using a cascade of models, it uses a single object detection model with multiple exit points for short-circuiting the inference. This early inference technique allows it to support a range of throughput-accuracy tradeoffs. Second, it adopts a fine-grained approach to planning and processes different chunks of the video using different exit points to meet the user's requirements. Lastly, it uses a lightweight technique for directly estimating the exit point for a chunk to lower the optimization time. We empirically show that these techniques enable THIA to outperform two state-of-the-art video analytics systems by up to 6.5X, while providing accurate results even on queries focusing on hard-to-detect events.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.