Abstract
This paper uses Xilinx Vitis AI to accelerate the semantic segmentation network U-Net through fixed-point quantization of the network, customization of the deep learning processing unit (DPU), and hardware/software co-optimization. A semantic segmentation accelerator is implemented on the Xilinx ZCU102 heterogeneous platform, reducing hardware resource consumption at a small loss in accuracy, and the complete hardware/software system for the U-Net network is developed. Experimental results show that the U-Net hardware accelerator reaches a processing frame rate of up to 42 fps, which demonstrates the effectiveness of the neural network acceleration scheme.
Key words
field-programmable gate array (FPGA) /
deep learning processing unit /
semantic segmentation /
Vitis AI /
convolutional neural networks
Funding
*Supported by the National Natural Science Foundation of China (61972180).