EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search

Authors: Jiemin FANG; Yukang CHEN; Xinbang ZHANG; Qian ZHANG; Chang HUANG; Gaofeng MENG; Wenyu LIU; Xinggang WANG

Affiliations: Institute of Artificial Intelligence, Huazhong University of Science and Technology; School of Electronic Information and Communications, Huazhong University of Science and Technology; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

Publication: Science China (Information Sciences), English edition (中国科学:信息科学)

Year, volume, issue: 2021, Vol. 64, No. 9

Pages: 103-115

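Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 081104 [Engineering - Pattern Recognition and Intelligent Systems]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degree may be conferred)]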

Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61876212, 61976208, 61733007), Zhejiang Lab (Grant No. 2019NB0AB02), and the HUST-Horizon Computer Vision Research Center.

Topics: architecture transfer; neural architecture search; evolutionary algorithm; large-scale dataset

Abstract: Neural architecture search (NAS) methods have been proposed to relieve human experts from tedious architecture engineering. However, most current methods are constrained to small-scale search owing to their huge computational resource consumption. Meanwhile, directly applying architectures searched on small datasets to large datasets often carries no performance guarantee because of the discrepancy between datasets. This limitation impedes the wide use of NAS on large-scale tasks. To overcome this obstacle, we propose an elastic architecture transfer mechanism for accelerating large-scale NAS (EAT-NAS). In our implementation, architectures are first searched on a small dataset, e.g., CIFAR-10, and the best one is chosen as the basic architecture. The search process on a large dataset, e.g., ImageNet, is then initialized with the basic architecture as the seed, which accelerates the large-scale search. We propose not only a NAS method but also a mechanism for architecture-level transfer learning. In our experiments, we obtain two final models, EATNet-A and EATNet-B, which achieve competitive accuracies of 75.5% and 75.6%, respectively, on ImageNet. Both models also surpass the models searched from scratch on ImageNet under the same settings. As for computational cost, EAT-NAS takes fewer than 5 days using 8 TITAN X GPUs, which is significantly less than the computational consumption of state-of-the-art large-scale NAS methods.
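
The two-stage idea in the abstract (search on a small dataset, then seed the large-dataset search with the best architecture found) can be illustrated with a minimal sketch. This is not the authors' implementation: the list-of-integers encoding, the `mutate` and `evolve` helpers, the population settings, and the two toy fitness functions standing in for CIFAR-10 and ImageNet evaluation are all assumptions made for illustration.

```python
import random

# Hypothetical toy encoding: an architecture is a list of integer choices,
# one per block. This is NOT the search space used in the paper.
NUM_BLOCKS = 6
NUM_CHOICES = 4


def mutate(arch, rng, rate=0.3):
    """Return a copy of arch with each position resampled with probability rate."""
    return [rng.randrange(NUM_CHOICES) if rng.random() < rate else gene
            for gene in arch]


def evolve(fitness, init_population, rng, generations=20, sample_size=4):
    """Simple tournament-style evolutionary search; returns the best architecture seen."""
    population = list(init_population)
    best = max(population, key=fitness)
    for _ in range(generations):
        # The fittest member of a random sample becomes the parent.
        parent = max(rng.sample(population, sample_size), key=fitness)
        child = mutate(parent, rng)
        # Replace the weakest member so the population size stays fixed.
        weakest = min(range(len(population)), key=lambda i: fitness(population[i]))
        population[weakest] = child
        best = max(best, child, key=fitness)
    return best


def transfer_search(small_fitness, large_fitness, pop_size=8, seed=0):
    """Stage 1 searches on the small proxy; stage 2 is seeded with its result."""
    rng = random.Random(seed)
    # Stage 1: random initialization, scored on the small-dataset stand-in.
    random_pop = [[rng.randrange(NUM_CHOICES) for _ in range(NUM_BLOCKS)]
                  for _ in range(pop_size)]
    basic = evolve(small_fitness, random_pop, rng)
    # Stage 2: the population is the basic architecture plus mutated variants.
    seeded_pop = [basic] + [mutate(basic, rng) for _ in range(pop_size - 1)]
    final = evolve(large_fitness, seeded_pop, rng)
    return basic, final


def toy_small_fitness(arch):
    """Assumed stand-in for small-dataset accuracy: prefers choice 1 everywhere."""
    return -sum(abs(gene - 1) for gene in arch)


def toy_large_fitness(arch):
    """Assumed stand-in for large-dataset accuracy: prefers choice 2 everywhere."""
    return -sum(abs(gene - 2) for gene in arch)
```

Calling `transfer_search(toy_small_fitness, toy_large_fitness)` returns the basic architecture and the final one. Because the seeded population contains the basic architecture itself, the final result in this sketch never scores worse than the seed on the large-dataset stand-in.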
