A Model Stealing Attack Against Multi-Exit Networks (2305.13584v2)
Abstract: Compared to traditional neural networks with a single output channel, a multi-exit network has multiple exits that allow early outputs from the model's intermediate layers, significantly improving computational efficiency while maintaining similar main-task accuracy. Existing model stealing attacks can only steal a model's utility; they fail to capture its output strategy, i.e., the set of thresholds that determines which exit produces the output. This causes a significant loss of computational efficiency in the extracted model, forfeiting the key advantage of multi-exit networks. In this paper, we propose the first model stealing attack against multi-exit networks that extracts both the model utility and the output strategy. We employ Kernel Density Estimation to analyze the target model's output strategy and use a performance loss and a strategy loss to guide the training of the extracted model. Furthermore, we design a novel output strategy search algorithm that maximizes the consistency between the output behaviors of the victim model and the extracted model. In experiments across multiple multi-exit networks and benchmark datasets, our method consistently achieves the accuracy and efficiency closest to those of the victim models.
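To make the threshold-based output strategy concrete, below is a minimal sketch (not the paper's code) of a two-exit network that emits an early prediction when the intermediate exit's softmax confidence clears a threshold, plus a KDE over observed exit confidences of the kind an attacker might fit to characterize the output strategy. The architecture, threshold value, and all names (`TwoExitNet`, `block1`, `exit1`, etc.) are illustrative assumptions.

```python
# Minimal sketch, assuming a confidence-threshold exit rule; the paper's
# actual architectures, thresholds, and losses may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.stats import gaussian_kde

class TwoExitNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.exit1 = nn.Linear(hidden, classes)   # early (intermediate) exit head
        self.block2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.exit2 = nn.Linear(hidden, classes)   # final exit head

    def forward(self, x, threshold=0.9):
        h = self.block1(x)
        logits1 = self.exit1(h)
        conf1 = F.softmax(logits1, dim=-1).max(dim=-1).values
        # Early exit: return the intermediate prediction when confidence
        # clears the exit threshold; otherwise run the remaining layers.
        if bool((conf1 >= threshold).all()):
            return logits1, 1
        logits2 = self.exit2(self.block2(h))
        return logits2, 2

# Fit a KDE over early-exit confidences observed on query inputs -- one way
# to estimate which confidence levels the target routes through each exit.
net = TwoExitNet().eval()
with torch.no_grad():
    confs = []
    for _ in range(200):
        x = torch.randn(1, 32)
        c = F.softmax(net.exit1(net.block1(x)), dim=-1).max().item()
        confs.append(c)
kde = gaussian_kde(confs)
print("estimated density at confidence 0.9:", float(kde(0.9)[0]))
```

In this sketch each query returns both a prediction and the index of the exit that produced it; an attacker can thus train the extracted model against both signals, matching the paper's pairing of a performance loss with a strategy loss.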