Research Article
DOI: 10.1145/3583131.3590513

MPENAS: Multi-fidelity Predictor-guided Evolutionary Neural Architecture Search with Zero-cost Proxies

Published: 12 July 2023

Abstract

Neural architecture search (NAS) aims to automatically design suitable artificial neural network (ANN) architectures for a variety of situations. Recently, NAS methods based on zero-cost proxies have been able to predict the performance of ANNs at the cost of at most a single forward/backward propagation pass. While zero-cost proxies can speed up NAS by orders of magnitude, the gap between the predicted and actual performance of ANNs prevents them from reliably identifying top-performing architectures.
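For illustration, here is a minimal sketch of one well-known zero-cost proxy from the literature, the gradient-norm score: the summed gradient norms of a randomly initialized network after a single minibatch, so the cost is exactly one forward and one backward pass. This is background, not the MPENAS code; the toy network and random minibatch are our own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_norm_proxy(model: nn.Module, inputs: torch.Tensor, targets: torch.Tensor) -> float:
    """Score a randomly initialized model with one forward and one backward pass.

    Higher scores tend to correlate (imperfectly) with final trained accuracy;
    that imperfection is exactly the predicted-vs-actual gap noted above.
    """
    model.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)  # single forward pass
    loss.backward()                                 # single backward pass
    return sum(p.grad.norm().item() for p in model.parameters() if p.grad is not None)

# Toy usage: an untrained network and one random CIFAR-10-sized minibatch.
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
print(grad_norm_proxy(net, x, y))
```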
One solution is to treat zero-cost proxies as a low-fidelity evaluation method and switch to high-fidelity evaluation methods when the zero-cost proxies struggle to select architectures. Based on this idea, we propose Multi-fidelity Predictor-guided Evolutionary Neural Architecture Search (MPENAS), built on a novel surrogate-assisted evolutionary computation framework. Using a predictor, MPENAS combines architecture encodings, zero-cost proxies, learning curve extrapolations, and the performance of fully trained ANNs into one consistent fitness measure across different fidelities.
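To make the fidelity-switching idea concrete, the following is a hypothetical sketch of a predictor-guided evolutionary loop, not the authors' actual algorithm: candidates are ranked with the cheapest fidelity, and the search escalates to the next fidelity whenever the cheap ranking stops agreeing with a more expensive spot check. The names (`fidelities`, `mutate`, the `noisy` toy scorer) and the rank-correlation threshold `tau` are illustrative assumptions.

```python
import random
from scipy.stats import spearmanr

def multi_fidelity_ea(init_pop, fidelities, mutate, generations=50, check_every=10, tau=0.5):
    """Evolutionary loop that escalates from cheap to expensive fitness evaluation.

    `fidelities` is a list of scoring functions ordered from cheapest (e.g. a
    zero-cost proxy) to most expensive (e.g. full training); higher scores win.
    """
    level, pop = 0, list(init_pop)
    for gen in range(generations):
        # Periodic audit: if the current fidelity's ranking no longer agrees
        # with the next fidelity up on a small sample, escalate.
        if level + 1 < len(fidelities) and gen % check_every == 0:
            sample = random.sample(pop, min(8, len(pop)))
            rho, _ = spearmanr([fidelities[level](a) for a in sample],
                               [fidelities[level + 1](a) for a in sample])
            if rho < tau:
                level += 1
        pop.sort(key=fidelities[level], reverse=True)
        parents = pop[: len(pop) // 2]
        pop = parents + [mutate(p) for p in parents]
    return max(pop, key=fidelities[-1])

# Toy demo: "architectures" are floats, and cheaper fidelities are noisier
# views of the same underlying quality.
noisy = lambda sigma: (lambda a: a + random.gauss(0, sigma))
best = multi_fidelity_ea(
    init_pop=[random.random() for _ in range(16)],
    fidelities=[noisy(0.5), noisy(0.1), lambda a: a],  # proxy -> partial -> full
    mutate=lambda a: a + random.gauss(0, 0.05),
)
print(best)
```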
To our knowledge, MPENAS is the first work to integrate zero-cost proxies into a multi-fidelity optimization framework. MPENAS outperforms ten other methods on the NAS-Bench-201 search space in all cases. In addition, we demonstrate the generalizability of MPENAS on the TransNAS-Bench-101 search space.

Supplementary Material

PDF File (p1276-xu-suppl.pdf)
Supplemental material.

Published In

GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference
July 2023
1667 pages
ISBN: 9798400701191
DOI: 10.1145/3583131

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. neural architecture search
2. zero-cost proxy
3. multi-fidelity optimization
4. surrogate-assisted evolutionary computation

Qualifiers

• Research-article

Conference

GECCO '23

Acceptance Rates

Overall Acceptance Rate: 1,669 of 4,410 submissions (38%)

Bibliometrics & Citations

Article Metrics

• Total Citations: 0
• Total Downloads: 181
• Downloads (last 12 months): 61
• Downloads (last 6 weeks): 2

Reflects downloads up to 25 Jan 2025
