Research on Accelerating Triangular Matrix Back-Substitution on Heterogeneous Platforms

SHI Rui, ZUO Yunfan, YAN Hao

Integrated Circuits and Embedded Systems, 2024, Vol. 24, Issue 1: 13-18. DOI: 10.20193/j.ices2097-4191.2024.01.002
EDA Research Column

Research on accelerating triangular matrix backpropagation based on heterogeneous platforms


Abstract

Transient circuit simulation commonly builds a linear system model, and solving the resulting triangular systems sequentially for multiple right-hand sides is time-consuming. To speed up this triangular back-substitution, this paper proposes a parallel computing method based on a heterogeneous platform. The method exposes the parallelism in back-substitution by prioritizing the multiplications that involve entries of the solution vector. The architecture consists of an operation array built from multiple floating-point units and a control module driven by a two-level master-slave state machine. Linear system solution experiments were run on a Zynq UltraScale series FPGA (XCZU15EG) and compared against an Intel 24-core CPU platform using the MKL solver library. All test matrices are symmetric positive definite and diagonally dominant, with a density above 50%. The proposed architecture achieves an average speedup of 22x, with solution errors between 10^-17 and 10^-14. The experimental results show that the architecture improves matrix solution speed and is suitable for forward and backward substitution on higher-dimensional linear systems in transient circuit simulation.
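The abstract describes the scheduling idea only in outline, so the following is a minimal software sketch of one plausible reading, not the authors' FPGA design. It uses a column-oriented substitution: as soon as one solution entry is known, every product that depends on it is applied to the remaining rows and right-hand sides. Those updates are mutually independent, which is the kind of parallelism a floating-point operation array could exploit. The function names and the list-of-rows data layout are assumptions made for illustration.

```python
def forward_substitution_columnwise(L, B):
    """Solve L X = B for a lower-triangular L and several right-hand sides.

    L is an n x n list of rows; B is an n x m list of rows, one column per
    right-hand side. Returns X as an n x m list of rows.
    """
    n = len(L)
    m = len(B[0])
    X = [row[:] for row in B]  # working copy that is turned into the solution
    for j in range(n):
        # Finish unknown j for every right-hand side.
        for k in range(m):
            X[j][k] /= L[j][j]
        # Push the products of the new entry into all remaining rows.
        # These (n - j - 1) * m multiply-subtract updates do not depend on
        # each other, so a hardware array could issue them concurrently.
        for i in range(j + 1, n):
            lij = L[i][j]
            for k in range(m):
                X[i][k] -= lij * X[j][k]
    return X


def backward_substitution_columnwise(U, B):
    """Solve U X = B for an upper-triangular U, sweeping from the last row up."""
    n = len(U)
    m = len(B[0])
    X = [row[:] for row in B]
    for j in range(n - 1, -1, -1):
        for k in range(m):
            X[j][k] /= U[j][j]
        # Independent updates of all rows above row j.
        for i in range(j):
            uij = U[i][j]
            for k in range(m):
                X[i][k] -= uij * X[j][k]
    return X


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    B = [[2.0, 4.0],
         [10.0, 14.0],
         [49.0, 64.0]]
    print(forward_substitution_columnwise(L, B))
```

In this sketch the only sequential dependency is the outer loop over `j`; the inner update loops are where independent multiplications are available, and the number of right-hand sides `m` multiplies that available work.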

Key words

triangular matrix solution / hardware acceleration / FPGA / transient simulation

Cite this article

SHI Rui, ZUO Yunfan, YAN Hao. Research on accelerating triangular matrix backpropagation based on heterogeneous platforms[J]. Integrated Circuits and Embedded Systems, 2024, 24(1): 13-18. https://doi.org/10.20193/j.ices2097-4191.2024.01.002
CLC number: TP332.1 (logic components)

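Funding

Postgraduate Research & Practice Innovation Program of Jiangsu Province, "FPGA Acceleration of Matrix Back-Substitution for CCS Timing Models" (SJCX220052)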

Editor in charge: XUE Shiran