Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 63 tok/s
Gemini 2.5 Pro 49 tok/s Pro
GPT-5 Medium 14 tok/s Pro
GPT-5 High 19 tok/s Pro
GPT-4o 100 tok/s Pro
Kimi K2 174 tok/s Pro
GPT OSS 120B 472 tok/s Pro
Claude Sonnet 4 37 tok/s Pro
2000 character limit reached

A Pitfall of Unsupervised Pre-Training (1703.04332v4)

Published 13 Mar 2017 in cs.CV

Abstract: The point of this paper is to question typical assumptions in deep learning and suggest alternatives. A particular contribution is to prove that even if a Stacked Convolutional Auto-Encoder is good at reconstructing pictures, it is not necessarily good at discriminating their classes. When using Auto-Encoders, intuitively one assumes that features which are good for reconstruction will also lead to high classification accuracy. Indeed, it became research practice and is a suggested strategy by introductory books. However, we prove that this is not always the case. We thoroughly investigate the quality of features produced by Stacked Convolutional Auto-Encoders when trained to reconstruct their input. In particular, we analyze the relation between the reconstruction and classification capabilities of the network, if we were to use the same features for both tasks. Experimental results suggest that in fact, there is no correlation between the reconstruction score and the quality of features for a classification task. This means, more formally, that the sub-dimension representation space learned from the Stacked Convolutional Auto-Encoder (while being trained for input reconstruction) is not necessarily better separable than the initial input space. Furthermore, we show that the reconstruction error is not a good metric to assess the quality of features, because it is biased by the decoder quality. We do not question the usefulness of pre-training, but we conclude that aiming for the lowest reconstruction error is not necessarily a good idea if afterwards one performs a classification task.

Citations (3)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.