Revealing Secrets From Pre-trained Models (2207.09539v1)

Published 19 Jul 2022 in cs.CR and cs.LG

Abstract: As the cost of training deep learning models on large datasets grows, transfer learning has been widely adopted in many emerging deep learning algorithms. Transformer models such as BERT dominate natural language processing and use transfer learning as the de facto standard training method. A few large data companies release models pre-trained on a few popular datasets, and end users and researchers fine-tune these models on their own datasets. Transfer learning significantly reduces the time and effort of training models, but it also raises security concerns. In this paper, we report a new observation: pre-trained models and their fine-tuned counterparts have highly similar weight values. We also demonstrate that vendor-specific computing patterns exist even for the same model. Building on these findings, we propose a new model extraction attack that uses the vendor-specific computing patterns to reveal the model architecture and the pre-trained model used by a black-box victim model, and then estimates the entire set of model weights from the weight similarity between the fine-tuned and pre-trained models. We also show that the weight similarity can be leveraged to make model extraction more feasible through a novel weight extraction pruning.
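
The weight-similarity observation in the abstract can be illustrated with a minimal sketch. This is not the authors' measurement procedure: the two metrics (cosine similarity and the fraction of weights within a tolerance), the function names, the tolerance, and the synthetic "pre-trained" and "fine-tuned" vectors are all assumptions made for illustration, and no real checkpoint is loaded.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def fraction_within(a, b, tol):
    """Fraction of weights whose absolute difference is at most tol."""
    close = sum(1 for x, y in zip(a, b) if abs(x - y) <= tol)
    return close / len(a)


# Hypothetical stand-ins for real checkpoints: a "pre-trained" vector and a
# "fine-tuned" copy perturbed by small Gaussian noise. The noise scale is an
# assumption for illustration, not a figure taken from the paper.
rng = random.Random(0)
pretrained = [rng.gauss(0.0, 0.05) for _ in range(10_000)]
finetuned = [w + rng.gauss(0.0, 0.001) for w in pretrained]

cos = cosine_similarity(pretrained, finetuned)
frac = fraction_within(pretrained, finetuned, tol=0.005)
print(f"cosine similarity: {cos:.4f}")
print(f"fraction of weights within 0.005: {frac:.4f}")
```

With real models, the two lists would instead be the flattened weights of one layer taken from a published pre-trained checkpoint and from its fine-tuned counterpart. A score near 1 on either metric would be consistent with the similarity the abstract describes.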

Authors

Mujahid Al Rafi
Yuan Feng
Hyeran Jeon

