Abstract

Unsupervised pretraining has been widely used to aid human action recognition. However, existing methods focus on reconstructing frames that are already present rather than generating frames that occur in the future. In this paper, we propose an improved Variational Autoencoder model that extracts features strongly connected to upcoming scenes, an approach also known as Predictive Learning. Our framework works as follows: two-stream 3D convolutional neural networks extract both spatial and temporal information as latent variables. A resampling method then creates new, normally distributed probabilistic latent variables, and finally a deconvolutional neural network uses these latent variables to generate the next frames. Through this process, we train the model to focus on generating the future, so that it extracts features highly connected to the future. Extensive experiments on the UT and UCF101 datasets show that future generation does improve prediction performance. Moreover, the Future Representation Learning Network achieves a higher score than other methods when only half of each video is observed. This suggests that Future Representation Learning is, to some extent, better than traditional Representation Learning and other state-of-the-art methods at solving the human action prediction problem.
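The abstract does not spell out how the resampling step works. A minimal sketch, assuming it follows the standard VAE reparameterization trick (sample z = mu + sigma * eps with eps drawn from a standard normal), might look like the following. The function names `reparameterize` and `kl_to_standard_normal` and the example values are hypothetical and are not taken from the paper; in the described framework, `mu` and `log_var` would come from the two-stream 3D convolutional encoder, and `z` would be passed to the deconvolutional decoder.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps elementwise, with eps ~ N(0, 1).

    Assumption: the encoder outputs a mean and a log-variance per latent
    dimension, as in a standard VAE. This is not confirmed by the abstract.
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, 1), summed over dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


# Hypothetical encoder outputs for a 3-dimensional latent space.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.5]

z = reparameterize(mu, log_var, rng)    # latent sample fed to the decoder
kl = kl_to_standard_normal(mu, log_var)  # regularization term of the VAE loss
```

Writing the sample as a deterministic function of `mu`, `log_var`, and external noise is what lets gradients flow back into the encoder in a real training setup; the KL term keeps the latent distribution close to a standard normal.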
