AI Acceleration Software and Hardware Design Based on a Domestic Intelligent Reconfigurable Platform
郭涛, 周海洋, 余裕鑫, 范晓畅, 王硕, 张彦龙, 陈雷
集成电路与嵌入式系统, 2025, Vol. 25, Issue 12: 8-17.
To meet the demand for intelligent equipment electronic systems, a neural network accelerator soft core and its companion quantization and compilation software were designed for the programmable logic of the "Hongxin" intelligent reconfigurable platform, enabling unified quantization, compilation, and accelerated execution of neural network models on the self-developed accelerator soft core. The "Hongtu" embedded real-time operating system was also extended to support hardware-accelerated neural network execution. Experimental results show that the accelerator soft core performs on par with the AMD Xilinx DPU soft core, and that ResNet18 and ResNet50 run four times faster under the "Hongtu" embedded real-time operating system than in the AMD Xilinx PetaLinux environment, strengthening the artificial intelligence capability of the "Hongxin" intelligent reconfigurable platform.
intelligent reconfigurable platform / neural network accelerator / neural network quantization and compilation software / embedded real-time operating system
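As a rough illustration of the quantization step that such a toolchain performs before mapping a model onto fixed-point accelerator hardware, the sketch below shows generic symmetric per-tensor INT8 weight quantization in NumPy. It is not the paper's compiler: the function names, the per-tensor scheme, and the 8-bit width are assumptions chosen only to make the idea concrete.

import numpy as np

def quantize_int8(weights: np.ndarray):
    # Symmetric per-tensor INT8 quantization: one scale factor for the whole
    # tensor, zero point fixed at 0, values clipped to [-127, 127].
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A tensor shaped like a bank of 3x3 convolution kernels (64 filters, 3 input channels).
    w = rng.normal(0.0, 0.05, size=(64, 3, 3, 3)).astype(np.float32)
    q, scale = quantize_int8(w)
    w_hat = dequantize_int8(q, scale)
    print("scale:", scale)
    print("max abs quantization error:", float(np.max(np.abs(w - w_hat))))

In an actual deployment flow, the quantized weights and their scales would then be packed into whatever instruction stream and memory layout the target accelerator soft core expects; those details are specific to the self-developed core and are not covered here.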