ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search (2003.01335v1)

Published 3 Mar 2020 in cs.NE and cs.LG

Abstract: How to discover and evaluate the true strength of models quickly and accurately is one of the key challenges in Neural Architecture Search (NAS). To cope with this problem, we propose an Architecture-Driven Weight Prediction (ADWP) approach for NAS. In our approach, we first design an architecture-intensive search space and then train a HyperNetwork by feeding it stochastically encoded architecture parameters. With the trained HyperNetwork, the weights of convolution kernels can be predicted well for neural architectures in the search space. Consequently, the target architectures can be evaluated efficiently without any finetuning, which enables us to search for the optimal architecture in the space of general networks (macro-search). In experiments, we evaluate the performance of the models discovered by the proposed ADWPNAS, and the results show that one search procedure can be completed in 4.0 GPU hours on CIFAR-10. Moreover, the discovered model obtains a test error of 2.41% with only 1.52M parameters, which is superior to the best existing models.
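The core idea in the abstract, predicting convolution kernel weights from an encoding of the architecture, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not describe the search space, the encoding, or the HyperNetwork design, so the operation list `OPS`, the one-hot `encode` function, and the single linear layer in `LinearHyperNetwork` are hypothetical stand-ins, and no training loop is shown.

```python
import random

# Hypothetical search space: each of NUM_LAYERS layers picks one operation.
# The paper's actual search space is not specified in the abstract.
OPS = ["conv3x3", "conv5x5", "skip"]
NUM_LAYERS = 4


def sample_architecture(rng):
    """Draw a random architecture: one operation index per layer."""
    return [rng.randrange(len(OPS)) for _ in range(NUM_LAYERS)]


def encode(arch):
    """One-hot encode the per-layer operation choices into a flat vector."""
    vec = []
    for op_idx in arch:
        vec.extend(1.0 if i == op_idx else 0.0 for i in range(len(OPS)))
    return vec


class LinearHyperNetwork:
    """Toy stand-in: a single linear map from encoding to a flat kernel."""

    def __init__(self, enc_dim, kernel_shape, rng):
        self.kernel_shape = kernel_shape
        out_dim = 1
        for d in kernel_shape:
            out_dim *= d
        # Parameters of the hypernetwork itself; in a real system these
        # would be trained while architectures are sampled at random.
        self.weight = [[rng.gauss(0.0, 0.1) for _ in range(enc_dim)]
                       for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def predict_kernel(self, encoding):
        """Return the predicted kernel weights as a flat list."""
        return [sum(w * x for w, x in zip(row, encoding)) + b
                for row, b in zip(self.weight, self.bias)]


rng = random.Random(0)
arch = sample_architecture(rng)
encoding = encode(arch)
# Kernel shape (out_channels, in_channels, height, width) is illustrative.
hyper = LinearHyperNetwork(len(encoding), (8, 3, 3, 3), rng)
kernel = hyper.predict_kernel(encoding)
```

In this sketch a candidate architecture is scored by plugging the predicted `kernel` into the corresponding layer and measuring validation accuracy directly, which is what lets a search skip per-candidate training or finetuning.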
