IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance (2312.12648v1)

Published 19 Dec 2023 in cs.LG and cs.CV

Abstract: Among existing Neural Architecture Search (NAS) methods, DARTS is known for its efficiency and simplicity. It applies a continuous relaxation of the network representation to construct a weight-sharing supernet, enabling the identification of strong subnets in just a few GPU days. However, performance collapse in DARTS yields deteriorated architectures filled with parameter-free operations and remains a great challenge to its robustness. To resolve this problem, we reveal through theoretical and experimental analysis that the fundamental cause is biased estimation of candidate importance in the search space, and we select operations more precisely via information-based measurements. Furthermore, we demonstrate that excessive focus on the supernet and inefficient use of data in bi-level optimization also account for suboptimal results. We adopt a more realistic objective focused on the performance of subnets and simplify it with the help of the information-based measurements. Finally, we explain theoretically why progressively shrinking the width of the supernet is necessary, and we reduce the approximation error of optimal weights in DARTS. Our proposed method, named IS-DARTS, comprehensively improves DARTS and resolves the aforementioned problems. Extensive experiments on NAS-Bench-201 and the DARTS-based search space demonstrate the effectiveness of IS-DARTS.
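For readers unfamiliar with DARTS, below is a minimal PyTorch sketch (not from this paper) of the continuous relaxation the abstract refers to: each supernet edge computes a softmax-weighted mixture of candidate operations, so the discrete architecture choice becomes differentiable in the architecture parameters α. The MixedOp class and the three-operation candidate set are illustrative assumptions, not IS-DARTS code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture over candidate operations on one supernet edge.

    This is the continuous relaxation at the heart of DARTS: the edge
    computes sum_o softmax(alpha)_o * o(x), making the choice of
    operation differentiable with respect to alpha.
    """

    def __init__(self, channels: int):
        super().__init__()
        # A toy candidate set; the real DARTS space has ~8 operations,
        # including the parameter-free ones (skip connection, pooling)
        # that accumulate when performance collapse occurs.
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.AvgPool2d(3, stride=1, padding=1),
            nn.Identity(),  # parameter-free skip connection
        ])
        # Architecture parameters alpha: one logit per candidate op.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))


if __name__ == "__main__":
    edge = MixedOp(channels=16)
    x = torch.randn(2, 16, 8, 8)
    print(edge(x).shape)  # torch.Size([2, 16, 8, 8])
```

Vanilla DARTS trains the supernet weights on training data and α on validation data in a bi-level loop, then discretizes by keeping the operation with the largest α on each edge. The paper's argument is that these α values are biased estimators of candidate importance, which is why IS-DARTS replaces this selection criterion with information-based measurements.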

Authors (4)
  1. Hongyi He (4 papers)
  2. Longjun Liu (4 papers)
  3. Haonan Zhang (51 papers)
  4. Nanning Zheng (146 papers)
Citations (4)
