PDF(1108 KB)
Research on Accelerating Triangular Matrix Back-Substitution on a Heterogeneous Platform
Transient circuit simulation commonly builds linear system models, and sequentially solving triangular systems with multiple right-hand sides is time-consuming. To speed up this triangular back-substitution step, this paper proposes a parallel computing method on a heterogeneous platform for solving triangular matrices quickly. By computing the multiplications that involve the solution vector first, the method exposes the parallelism in the substitution calculation. The design consists of an arithmetic array built around multiple floating-point units and a control module with a two-level master-slave state machine. The architecture was implemented on an XCZU15EG FPGA from the Zynq UltraScale family and compared against a 24-core Intel CPU platform using the MKL solver library in linear system solution experiments. All test matrices are symmetric positive definite and diagonally dominant, with density above 50%. The proposed architecture achieves an average speedup of 22x, with solution errors between 10^-17 and 10^-14. The results indicate that the architecture improves matrix solution speed to a certain extent and is suited to forward and backward substitution for higher-dimensional linear systems.
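To make the ordering described in the abstract concrete, the following is a minimal software sketch of column-oriented substitution with multiple right-hand sides. It is an illustration under stated assumptions, not the paper's FPGA design: the function names `forward_substitution_column` and `back_substitution_column` are hypothetical, the matrices are plain Python lists, and the loops run sequentially. The point it shows is that once one row of the solution is known, all multiplications involving that row are independent of each other, which is the kind of work a floating-point array could perform in parallel.

```python
def forward_substitution_column(L, B):
    """Solve L X = B for lower-triangular L; B holds several right-hand sides.

    Column-oriented ordering (an assumed reading of "multiplications that
    involve the solution vector first"): as soon as row j of X is final,
    every product L[i][j] * X[j][k] with i > j is applied to the rows below.
    """
    n, m = len(L), len(B[0])
    X = [row[:] for row in B]              # working copy; B is left untouched
    for j in range(n):
        for k in range(m):
            X[j][k] /= L[j][j]             # row j of the solution is now final
        for i in range(j + 1, n):          # independent updates: parallelizable
            lij = L[i][j]
            for k in range(m):
                X[i][k] -= lij * X[j][k]
    return X


def back_substitution_column(U, B):
    """Solve U X = B for upper-triangular U, same ordering run bottom-up."""
    n, m = len(U), len(B[0])
    X = [row[:] for row in B]
    for j in range(n - 1, -1, -1):
        for k in range(m):
            X[j][k] /= U[j][j]
        for i in range(j):                 # independent updates: parallelizable
            uij = U[i][j]
            for k in range(m):
                X[i][k] -= uij * X[j][k]
    return X


if __name__ == "__main__":
    # Small hand-made example with two right-hand sides (columns of B).
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    B = [[2.0, 4.0],
         [10.0, 14.0],
         [49.0, 64.0]]
    print(forward_substitution_column(L, B))
```

For a symmetric positive definite matrix factored as A = L L^T, solving A X = B would apply the forward routine with L and then the backward routine with the transpose of L; the factorization itself is outside this sketch.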
triangular matrix solution / hardware acceleration / field-programmable gate array (FPGA) / transient simulation