Abstract

We describe a content-based video retrieval (CBVR) software system for identifying specific locations of a human action within a full-length film and for retrieving similar video shots from a query. To this end, we introduce the concept of a trajectory point cloud for classifying unique actions, encoded in a spatio-temporal covariant eigenspace, where each point is characterized by its spatial location, local Frenet-Serret vector basis, time-averaged curvature and torsion, and the mean osculating hyperplane. Since each action can be distinguished by its unique trajectory within this space, the trajectory point cloud is used to define an adaptive distance metric for classifying queries against stored actions. Depending on the distance to other trajectories, the distance metric uses either the large-scale structure of the trajectory point cloud, such as the mean distance between cloud centroids or the difference in hyperplane orientation, or the small-scale structure, such as the time-averaged curvature and torsion, to classify individual points in a fuzzy KNN classifier. Our system can function in real time and has an accuracy greater than 93% for multiple action recognition within video repositories. We demonstrate the use of our CBVR system in two situations: locating specific frame positions of trained actions in two full-length feature films, and retrieving video shots from a database with a web search application.
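
To make the time-averaged curvature and torsion descriptors more concrete, here is a minimal sketch of how they could be estimated for a sampled trajectory. It is not the authors' implementation: it assumes a 3D trajectory sampled at a uniform time step (the eigenspace in the abstract may have more dimensions), uses central finite differences with the standard formulas κ = |r′ × r″| / |r′|³ and τ = (r′ × r″) · r‴ / |r′ × r″|², and the function name `mean_curvature_torsion` is hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def mean_curvature_torsion(points, dt):
    """Estimate time-averaged curvature and torsion of a 3D trajectory.

    Assumption (illustrative only): `points` is a list of (x, y, z)
    tuples sampled at a uniform time step `dt`. Derivatives are
    approximated with central finite differences, so the two samples
    at each end of the trajectory are skipped.
    """
    kappas, taus = [], []
    for i in range(2, len(points) - 2):
        pm2, pm1, p0, pp1, pp2 = points[i - 2:i + 3]
        # First, second, and third derivatives of position at sample i.
        d1 = tuple((b - a) / (2 * dt) for a, b in zip(pm1, pp1))
        d2 = tuple((a - 2 * c + b) / dt ** 2
                   for a, c, b in zip(pm1, p0, pp1))
        d3 = tuple((b2 - 2 * b1 + 2 * a1 - a2) / (2 * dt ** 3)
                   for a2, a1, b1, b2 in zip(pm2, pm1, pp1, pp2))
        c = cross(d1, d2)
        c_sq = dot(c, c)
        speed = math.sqrt(dot(d1, d1))
        if speed == 0.0 or c_sq == 0.0:
            continue  # curvature/torsion undefined at this sample
        kappas.append(math.sqrt(c_sq) / speed ** 3)
        taus.append(dot(c, d3) / c_sq)
    if not kappas:
        raise ValueError("trajectory too short or degenerate")
    return sum(kappas) / len(kappas), sum(taus) / len(taus)
```

A circular helix r(t) = (a cos t, a sin t, b t) is a convenient sanity check, since its curvature a / (a² + b²) and torsion b / (a² + b²) are constant along the curve, so the time averages should match those values up to discretization error.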
