Research Article
DOI: 10.1145/3583131.3590513

MPENAS: Multi-fidelity Predictor-guided Evolutionary Neural Architecture Search with Zero-cost Proxies

Published: 12 July 2023

Abstract

Neural architecture search (NAS) aims to automatically design suitable artificial neural network (ANN) architectures for a variety of situations. Recently, NAS methods based on zero-cost proxies have been able to predict the performance of ANNs at the cost of at most a single forward/backward propagation pass. While zero-cost proxies can speed up NAS by orders of magnitude, the gap between the predicted and actual performance of ANNs prevents them from reliably identifying top-performing architectures.
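For illustration, here is a minimal sketch of one well-known zero-cost proxy from the literature, the gradient-norm score: the summed gradient norms of a randomly initialized network after a single minibatch, so the cost is exactly one forward and one backward pass. This is background, not the MPENAS code; the toy network and random minibatch are our own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_norm_proxy(model: nn.Module, inputs: torch.Tensor, targets: torch.Tensor) -> float:
    """Score a randomly initialized model with one forward and one backward pass.

    Higher scores tend to correlate (imperfectly) with final trained accuracy;
    that imperfection is exactly the predicted-vs-actual gap noted above.
    """
    model.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)  # single forward pass
    loss.backward()                                 # single backward pass
    return sum(p.grad.norm().item() for p in model.parameters() if p.grad is not None)

# Toy usage: an untrained network and one random CIFAR-10-sized minibatch.
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
print(grad_norm_proxy(net, x, y))
```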
One solution is to treat zero-cost proxies as a low-fidelity evaluation method and switch to high-fidelity evaluation methods when the zero-cost proxies struggle to select architectures. Based on this idea, we propose Multi-fidelity Predictor-guided Evolutionary Neural Architecture Search (MPENAS), built on a novel surrogate-assisted evolutionary computation framework. Using a predictor, MPENAS combines architecture encodings, zero-cost proxies, learning curve extrapolations, and the performance of fully trained ANNs into one consistent fitness measure across different fidelities.
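To make the fidelity-switching idea concrete, the following is a hypothetical sketch of a predictor-guided evolutionary loop, not the authors' actual algorithm: candidates are ranked with the cheapest fidelity, and the search escalates to the next fidelity whenever the cheap ranking stops agreeing with a more expensive spot check. The names (`fidelities`, `mutate`, the `noisy` toy scorer) and the rank-correlation threshold `tau` are illustrative assumptions.

```python
import random
from scipy.stats import spearmanr

def multi_fidelity_ea(init_pop, fidelities, mutate, generations=50, check_every=10, tau=0.5):
    """Evolutionary loop that escalates from cheap to expensive fitness evaluation.

    `fidelities` is a list of scoring functions ordered from cheapest (e.g. a
    zero-cost proxy) to most expensive (e.g. full training); higher scores win.
    """
    level, pop = 0, list(init_pop)
    for gen in range(generations):
        # Periodic audit: if the current fidelity's ranking no longer agrees
        # with the next fidelity up on a small sample, escalate.
        if level + 1 < len(fidelities) and gen % check_every == 0:
            sample = random.sample(pop, min(8, len(pop)))
            rho, _ = spearmanr([fidelities[level](a) for a in sample],
                               [fidelities[level + 1](a) for a in sample])
            if rho < tau:
                level += 1
        pop.sort(key=fidelities[level], reverse=True)
        parents = pop[: len(pop) // 2]
        pop = parents + [mutate(p) for p in parents]
    return max(pop, key=fidelities[-1])

# Toy demo: "architectures" are floats, and cheaper fidelities are noisier
# views of the same underlying quality.
noisy = lambda sigma: (lambda a: a + random.gauss(0, sigma))
best = multi_fidelity_ea(
    init_pop=[random.random() for _ in range(16)],
    fidelities=[noisy(0.5), noisy(0.1), lambda a: a],  # proxy -> partial -> full
    mutate=lambda a: a + random.gauss(0, 0.05),
)
print(best)
```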
To our knowledge, MPENAS is the first work to integrate zero-cost proxies into a multi-fidelity optimization framework. MPENAS outperforms ten other methods on the NAS-Bench-201 search space in all cases. In addition, we demonstrate the generalizability of MPENAS on the TransNAS-Bench-101 search space.

Supplementary Material

PDF File (p1276-xu-suppl.pdf)
Supplemental material.

Published In

GECCO '23: Proceedings of the Genetic and Evolutionary Computation Conference
July 2023
1667 pages
ISBN: 9798400701191
DOI: 10.1145/3583131

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. neural architecture search
2. zero-cost proxy
3. multi-fidelity optimization
4. surrogate-assisted evolutionary computation

Qualifiers

• Research-article

Conference

GECCO '23

Acceptance Rates

Overall Acceptance Rate: 1,669 of 4,410 submissions (38%)

Bibliometrics & Citations

Article Metrics

• Total Citations: 0
• Total Downloads: 181
• Downloads (last 12 months): 61
• Downloads (last 6 weeks): 2

Reflects downloads up to 25 Jan 2025
