Approximate Processing Element Design and Analysis for the Implementation of CNN Accelerators
Author affiliations: Department of Micro-Nano Electronics, Shanghai Jiao Tong University, Shanghai 200240, China; School of Integrated Circuits, Tsinghua University, Beijing 100084, China; Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
Publication: Journal of Computer Science & Technology (计算机科学技术学报, English edition)
Year, volume, issue: 2023, Vol. 38, No. 2
Pages: 309-327
Core indexing:
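Subject classification: 0711 [Science - Systems Science]; 07 [Science]; 0835 [Engineering - Software Engineering]; 0811 [Engineering - Control Science and Engineering]; 0701 [Science - Mathematics]; 070101 [Science - Fundamental Mathematics]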
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62104127 and the National Key Research and Development Program of China under Grant No. 2022YFB4500200.
Keywords: approximate computing; convolutional neural network (CNN); sum of products (SoP); data representation; multiplier
Abstract: As a primary computation unit, a processing element (PE) is key to the energy efficiency of a convolutional neural network (CNN) *** advantage of the inherent error tolerance of CNNs, approximate computing with high hardware efficiency has been considered for implementing the computation units of CNN ***, individual approximate designs such as multipliers and adders can only achieve limited accuracy and hardware *** this paper, an approximate PE is dedicatedly devised for CNN accelerators by synergistically considering the data representation, multiplication and *** approximate data format is defined for the weights using stochastic *** data format enables a simple implementation of multiplication by using small lookup tables, an adder and a *** approximate accumulators are further proposed for the product accumulation in the *** with the exact 8-bit fixed-point design, the proposed PE saves more than 29% and 20% in power-delay product for 3×3 and 5×5 sum of products, ***, compared with the PEs consisting of state-of-the-art approximate multipliers, the proposed design shows significantly smaller error bias with lower hardware ***, the application of the approximate PEs in CNN accelerators is analyzed by implementing a multi-task CNN for face detection and *** conclude that 1) an approximate PE is more effective for face detection than for alignment, 2) an approximate PE with high statistically-measured accuracy does not necessarily result in good quality in face detection, and 3) properly increasing the number of PEs in a CNN accelerator can improve its power and energy efficiency.
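Illustrative note: the abstract compares an exact 8-bit fixed-point sum of products (SoP) with approximate designs and reports "error bias" as an accuracy measure. The sketch below is only meant to make those two terms concrete. It is not the paper's PE: this record does not describe the proposed weight format, so `approx_weight` substitutes a hypothetical power-of-two rounding that turns each multiplication into a shift. All function names, the signed 8-bit operand ranges, and the trial count are assumptions made for the example.

```python
import random


def sop_exact(weights, acts):
    """Exact sum of products over integer operands."""
    return sum(w * a for w, a in zip(weights, acts))


def approx_weight(w):
    """Hypothetical stand-in, NOT the paper's format: round |w| to the
    nearest power of two (ties go up) and keep the sign. The result is
    not clamped, so 127 maps to 128."""
    if w == 0:
        return 0
    m = abs(w)
    lower = 1 << (m.bit_length() - 1)   # largest power of two <= m
    upper = lower << 1                  # next power of two
    p = lower if (m - lower) < (upper - m) else upper
    return p if w > 0 else -p


def sop_approx(weights, acts):
    """SoP in which every weight is replaced by its power-of-two stand-in,
    so each product could be realised as a shift in hardware."""
    return sum(approx_weight(w) * a for w, a in zip(weights, acts))


def error_stats(n=9, trials=10000, seed=0):
    """Return (mean error, mean absolute error) of the approximate SoP
    over random signed 8-bit operands. The mean error is the error bias;
    n=9 corresponds to a 3x3 kernel, n=25 to a 5x5 kernel."""
    rng = random.Random(seed)
    total_err = 0
    total_abs = 0
    for _ in range(trials):
        w = [rng.randint(-128, 127) for _ in range(n)]
        a = [rng.randint(-128, 127) for _ in range(n)]
        err = sop_approx(w, a) - sop_exact(w, a)
        total_err += err
        total_abs += abs(err)
    return total_err / trials, total_abs / trials


if __name__ == "__main__":
    bias, mae = error_stats(n=9)
    print(f"3x3 SoP: error bias = {bias:.2f}, mean |error| = {mae:.2f}")
```

A bias near zero means the positive and negative errors largely cancel when many products are accumulated, while the mean absolute error measures the size of the individual deviations. The abstract's second conclusion is that such statistical measures do not by themselves predict face-detection quality.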