KPCA Spatio-temporal trajectory point cloud classifier for recognizing human actions in a CBVR system (1403.6794v1)

Published 26 Mar 2014 in cs.IR and cs.CV

Abstract: We describe a content based video retrieval (CBVR) software system for identifying specific locations of a human action within a full length film, and retrieving similar video shots from a query. For this, we introduce the concept of a trajectory point cloud for classifying unique actions, encoded in a spatio-temporal covariant eigenspace, where each point is characterized by its spatial location, local Frenet-Serret vector basis, time averaged curvature and torsion and the mean osculating hyperplane. Since each action can be distinguished by their unique trajectories within this space, the trajectory point cloud is used to define an adaptive distance metric for classifying queries against stored actions. Depending upon the distance to other trajectories, the distance metric uses either large scale structure of the trajectory point cloud, such as the mean distance between cloud centroids or the difference in hyperplane orientation, or small structure such as the time averaged curvature and torsion, to classify individual points in a fuzzy-KNN. Our system can function in real-time and has an accuracy greater than 93% for multiple action recognition within video repositories. We demonstrate the use of our CBVR system in two situations: by locating specific frame positions of trained actions in two full featured films, and video shot retrieval from a database with a web search application.