Simultaneous Accelerator Parallelization and Point-to-Point Interconnect Insertion for Bus-Based Embedded SoCs

Authors: Daming Zhang, Yongpan Liu, Shuangchen Li, Tongda Wu, Huazhong Yang

Affiliations: Department of Electronic Engineering, Tsinghua University; Department of Electronic and Computer Engineering, University of California

Publication: Tsinghua Science and Technology (清华大学学报(自然科学版)(英文版))

Year, volume, issue: 2015, Vol. 20, No. 6

Pages: 644-660

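Subject classification: 080903 [Engineering - Microelectronics and Solid-State Electronics]; 0809 [Engineering - Electronic Science and Technology (degrees may be conferred in Engineering or Science)]; 08 [Engineering]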

Funding: Supported in part by the National Natural Science Foundation of China (No. 61271269), the National High-Tech Research and Development (863) Program (No. 2013AA01320), and the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (No. YETP0102).

Keywords: accelerator parallelization; point-to-point interco

Abstract: As performance requirements for bus-based embedded Systems-on-Chip (SoCs) increase, more and more on-chip application-specific hardware accelerators (e.g., filters, FFTs, JPEG encoders, GSMs, and AES encoders) are being integrated into their designs. These accelerators require system-level tradeoffs among performance, area, and scalability. Accelerator parallelization and Point-to-Point (P2P) interconnect insertion are two effective system-level adjustments. The former boosts computing performance at the cost of area, while the latter provides higher bandwidth at the cost of routability. Moreover, the two adjustments interact with each other. This paper proposes a design flow to optimize accelerator parallelization and P2P interconnect insertion simultaneously. To explore the huge optimization space, we develop an effective algorithm whose goal is to reduce total SoC latency under constraints on SoC area and total P2P wire length. Experimental results show that the performance difference between our proposed algorithm and the optimal results is only 2.33% on average, while the running time of the algorithm is less than 17 s.
