Current Issue

  • Cover Article
    YAN Zhen, HUANG Yicheng, MA Shicheng, WANG Xueyan

    To achieve a high degree of parallelism, hardware acceleration systems for fully homomorphic encryption must instantiate a large number of cryptographic primitive operation units. Modular multiplication is the most crucial primitive operation in fully homomorphic encryption, so the area of its circuit implementation significantly impacts the overall area of the acceleration system. To address the excessive resource usage, limited parameter support, and dependency on macro core IPs of existing modular multiplier designs, this paper presents an efficient FPGA-based Montgomery modular multiplier. At the algorithmic level, the multiplier reduces the computational load through techniques such as exploiting the characteristics of NTT-friendly moduli, compression, and encoding. At the circuit level, it minimizes resource usage through methods such as time division and data integration. Furthermore, the multiplier supports parameter configuration to implement Montgomery modular multiplication for different bit widths. Experimental results demonstrate that, for a 32-bit width, the designed Montgomery modular multiplier operates at a clock frequency of 223 MHz with a latency of 26.9 ns, utilizing 1 313 LUTs and 213 FFs. Compared to the baseline, resource consumption is reduced by 32% on average and latency is improved by 16% on average, and the configurable design is more flexible and broadly applicable.
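    For readers unfamiliar with the underlying arithmetic, the following is a minimal software sketch of textbook Montgomery modular multiplication. It is not the paper's hardware design: the function names, the choice R = 2^k, and the example prime 2013265921 (15·2^27 + 1, a commonly used NTT-friendly modulus) are assumptions made for illustration only.

```python
def montgomery_params(n: int, k: int):
    """Precompute constants for an odd modulus n < 2**k, with R = 2**k."""
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("n must be odd and smaller than 2**k")
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime ≡ -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # exact division by R
    return u - n if u >= n else u      # single conditional subtraction


def mod_mul(a: int, b: int, n: int, k: int = 32) -> int:
    """Compute a * b mod n via Montgomery form (requires 0 <= a, b < n)."""
    n_prime, r2 = montgomery_params(n, k)
    a_m = redc(a * r2, n, k, n_prime)    # a * R mod n
    b_m = redc(b * r2, n, k, n_prime)    # b * R mod n
    p_m = redc(a_m * b_m, n, k, n_prime) # a * b * R mod n
    return redc(p_m, n, k, n_prime)      # leave Montgomery form


# Hypothetical example modulus (assumption, not taken from the paper).
Q = 2013265921
print(mod_mul(123456789, 987654321, Q) == (123456789 * 987654321) % Q)
```

    In a hardware accelerator the conversions into and out of Montgomery form are usually amortized over many multiplications, so only the `redc` step sits on the critical path.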

  • Research Paper
    WANG Yao, WEN Tiedun, CHEN Yaping, ZHANG Tianhong

    The electronic controller of an aero-engine is a complex circuit system built around numerous large-scale integrated circuits. Traditional contact-based fault injection and detection methods that rely on physical probes fail to meet the design-for-testability requirements of such complex circuits. This paper proposes a boundary-scan-based fault injection and detection method for the core circuit of the aero-engine electronic controller. Based on an analysis of the core circuit, a boundary scan daisy chain and a boundary scan controller are designed, which can perform fault injection and detection on both the interconnections between chips and the boundary scan cells inside the chips. The fault injection and detection functions of both methods are verified using the overspeed protection logic of the engine.

  • Research Paper
    LI Jiru, TANG Junlong, LI Zhentao, ZOU Wanghui, LIU Min

    Millimeter-wave radar is an important sensing technology, distinguished by its high precision and robust anti-interference capability, and is widely used in applications such as autonomous driving, intelligent transportation, security monitoring, and industrial inspection. Research on power integrity, signal integrity, and thermal stability has made significant progress worldwide. However, existing studies predominantly focus on individual aspects and lack a comprehensive consideration of the interactions between power noise, signal interference, and thermal effects. This study introduces a unified simulation approach that integrates power, signal, and thermal effects through multi-physics analysis to optimize the overall performance of millimeter-wave radar hardware systems. Additionally, a capacitor optimization strategy is proposed, which involves increasing capacitor configurations within critical frequency bands to enhance power integrity and ensure system stability. Simulation and experimental results demonstrate that the proposed method significantly improves the impedance characteristics of the Power Distribution Network (PDN), reduces power noise and signal interference, and optimizes the system's thermal management. The research thereby improves the overall performance of high-speed circuit systems along multiple dimensions and provides new optimization strategies and methodologies for the design of high-performance millimeter-wave radar hardware.

  • Research Paper
    SANG Xianzhen, LI Min, CHENG Hu, WEI Jinghe, ZHAO Wei, WANG Zhengxing

    This paper presents a quantization method for convolutional networks deployed on in-memory computing circuits. It addresses the network performance degradation typically caused by the statistical methods used to calculate analog-to-digital conversion coefficients. The method first quantizes the activation values and weight coefficients of the convolutional layer. Then, based on the characteristics of the single-Tile data stream in in-memory computing circuits, an analog-to-digital conversion coefficient quantization network is designed. Next, a method based on KL divergence is developed to calculate the analog-to-digital conversion coefficients. Finally, the analog-to-digital conversion coefficients are mapped to conductance values and fused with the activation and weight quantization coefficients of the convolutional layer. These values are then converted into shift and fixed-point multiplication forms so that inference for the convolutional network can be deployed on the in-memory computing circuit. The software simulation results show that, compared with other methods for calculating analog-to-digital conversion coefficients, the designed quantization method causes less performance degradation and is suitable for mixed quantization with multiple bit widths in convolutional networks. Because the software simulation fully models the data flow of in-memory computing circuits, the proposed method can be applied in engineering implementations on such circuits.
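    The conversion into "shift and fixed-point multiplication forms" can be illustrated with a short sketch. This shows one common scheme for replacing a fused real-valued scale with an integer multiplier and a right shift; it is an assumption about the general technique, not the paper's exact procedure, and the function names and the 15-bit multiplier width are hypothetical.

```python
import math


def to_fixed_point(scale: float, bits: int = 15):
    """Approximate a scale in (0, 1) as multiplier / 2**shift.

    The multiplier fits in `bits` bits; both return values are integers.
    """
    if not 0.0 < scale < 1.0:
        raise ValueError("this sketch only handles scales in (0, 1)")
    mantissa, exponent = math.frexp(scale)  # scale = mantissa * 2**exponent
    multiplier = round(mantissa * (1 << bits))
    shift = bits - exponent
    if multiplier == (1 << bits):           # mantissa rounded up to 1.0
        multiplier >>= 1
        shift -= 1
    return multiplier, shift


def requantize(acc: int, multiplier: int, shift: int) -> int:
    """Scale an integer accumulator using only multiply, add, and shift."""
    rounding = 1 << (shift - 1) if shift > 0 else 0
    return (acc * multiplier + rounding) >> shift  # round half up


# Hypothetical fused scale (assumption for illustration).
m, s = to_fixed_point(0.0123)
print(m, s, requantize(1000, m, s))
```

    The width of the multiplier trades accuracy against hardware cost: each additional bit roughly halves the relative error of the approximated scale.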

  • Research Paper
    LI Shaoxi, JIAO Xinquan, NIU Wanlin, SU Jiahao, JIA Xiaoxiao

    During the testing of special equipment, test data must be collected and stored, but low data transmission rates and limited transmission reliability are persistent problems. To address them, a data transmission system combining an FPGA and Gigabit Ethernet is designed. It uses the UDP protocol to increase the data transmission rate and adds a data retransmission mechanism and packet counting to improve the reliability of data transmission. The design is verified on a Xilinx (now part of AMD) FPGA board. The experimental results show that data transmission based on an FPGA and Gigabit Ethernet is feasible and effectively improves the data transmission rate. The system also exhibits good maintainability and stability, making it suitable for practical engineering applications.
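    How packet counting can support retransmission over UDP is sketched below on the receiving side. The abstract does not describe the actual frame format or protocol, so the 4-byte counter header, the `Receiver` class, and its behavior are all hypothetical; counter wrap-around and the network I/O itself are deliberately left out.

```python
import struct

# Hypothetical header: a 32-bit big-endian packet counter before the payload.
HEADER = struct.Struct("!I")


def make_packet(seq: int, payload: bytes) -> bytes:
    """Prefix the payload with its packet counter."""
    return HEADER.pack(seq) + payload


class Receiver:
    """Reorders packets by counter and reports gaps to be retransmitted."""

    def __init__(self):
        self.expected = 0    # next counter to deliver in order
        self.buffer = {}     # out-of-order packets waiting for a gap to fill
        self.delivered = []  # payloads released in counter order

    def receive(self, packet: bytes) -> list:
        """Accept one datagram; return the counters still missing."""
        (seq,) = HEADER.unpack_from(packet)
        if seq >= self.expected:  # ignore duplicates of delivered packets
            self.buffer.setdefault(seq, packet[HEADER.size:])
        while self.expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        if not self.buffer:
            return []
        highest = max(self.buffer)
        return [s for s in range(self.expected, highest) if s not in self.buffer]


rx = Receiver()
rx.receive(make_packet(0, b"a"))
print(rx.receive(make_packet(2, b"c")))  # packet 1 is missing at this point
```

    The returned list is what a receiver would send back to the transmitter as a retransmission request; a real system would also need timeouts for the case where the last packets of a burst are lost.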

  • Research Paper
    ZHAO Chenxu, QU Yingjie, WANG Haiting

    With the development of intelligent transportation systems, license plate recognition systems have transitioned from traditional PC platforms to portable embedded terminals, thereby imposing higher demands on the accuracy, speed, and security of existing license plate recognition systems. RISC-V is an instruction set architecture characterized by being open-source, streamlined, efficient, low-power, and modular, offering a high degree of flexibility. In this paper, a license plate recognition system based on the Hummingbird E203 RISC-V processor is designed, utilizing an improved eight-direction Sobel operator for high-precision edge detection. The system is implemented on the Da Vinci PRO development board. The experimental results show that the system has a recognition accuracy rate of 96%, with an average recognition time of around 45 ms. It demonstrates high recognition accuracy and real-time performance. Compared to traditional license plate recognition systems, this system offers better performance.
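    The abstract does not specify how the eight-direction Sobel operator is improved, so the sketch below only shows a generic eight-direction (compass) variant: the standard 3×3 Sobel kernel is rotated in 45° steps and the strongest response is kept. The function names and the pure-Python image representation (a list of rows) are assumptions for illustration.

```python
# Border cells of a 3x3 kernel in clockwise order, starting at the top-left.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# Standard horizontal-gradient Sobel kernel (the 0-degree direction).
BASE = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]


def compass_kernels():
    """Return eight kernels obtained by rotating BASE in 45-degree steps."""
    values = [BASE[r][c] for r, c in RING]
    kernels = []
    for step in range(8):
        k = [[0] * 3 for _ in range(3)]
        for i, (r, c) in enumerate(RING):
            k[r][c] = values[(i - step) % 8]  # shift border values clockwise
        kernels.append(k)
    return kernels


def edge_magnitude(image):
    """Maximum response over the eight kernels; border pixels stay 0."""
    h, w = len(image), len(image[0])
    kernels = compass_kernels()
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = max(
                sum(k[j][i] * image[y + j - 1][x + i - 1]
                    for j in range(3) for i in range(3))
                for k in kernels
            )
    return out


# Synthetic 5x5 image with a vertical step edge between columns 1 and 2.
img = [[0, 0, 10, 10, 10] for _ in range(5)]
for row in edge_magnitude(img):
    print(row)
```

    Using more than the two classic directions makes the detector less sensitive to edge orientation, which matters for slanted or skewed plates.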

  • Research Paper
    WANG Shuai, ZHANG Bo

    To address the problems of scarce hardware resources and low development and testing efficiency in airborne communication software development, a real-time simulation verification platform based on QEMU and Huawei's private cloud is designed and implemented. This platform simulates and runs a configurable embedded target machine environment and Rehua operating system, realizes TDMA protocol time slot simulation, and improves simulation real-time performance through the SCHED_RR priority strategy. It uses network namespaces and VPN technology to build a multi-node virtual network. This platform is used for real-time simulation and automated testing of embedded protocol software for airborne communication equipment, effectively improving the efficiency and quality of software development and testing.

  • Research Paper
    WANG Chao, HU Jinhan, ZHANG Zhifu, CHEN Wentao

    Based on the LMK04828 high-performance clock chip, this article addresses multi-channel JESD204B synchronous sampling with clocks cascaded across multiple boards. It analyzes how the division factor affects the phase determinism of the clock output from two perspectives: the divider and the phase-locked loop. On this basis, a cross-board cascaded clock synchronization verification system is designed, and the system is explained in terms of mode configuration precautions, the conditions on the second-stage phase-locked loop divider coefficients, and the timing constraints between the SYNC signal and SYSREF. A specific synchronization control process is provided. Finally, repeated power-up and resynchronization experiments, as well as experiments that repeatedly trigger the SYSREF pulse output after a single power-up, confirm that the phase relationship of the clock outputs of the cross-board cascaded clock chips remains unchanged, verifying the effectiveness of the synchronization scheme and its phase determinism.

  • Research Paper
    GUO Jing, HUANG Peng, SUI Xiaobo

    In complex electromagnetic environments, multiple target signals appear simultaneously, so a comprehensive testing system needs independent real-time analysis capabilities for simultaneously arriving signals. Because of the huge volume of broadband data, existing systems cannot seamlessly extract and process multiple target signals in real time. To meet the requirement of extracting the raw IQ data of high-density target signals within a broad band, this paper uses FPGA high-speed digital signal processing to implement multi-channel down-conversion at any frequency point within the analysis bandwidth, together with decimation filtering for a variable analysis bandwidth. Combined with framing and time-sharing scheduling of read control, and using DDR4 buffering and PCIe 3.0 x8 bus transmission, the system can simultaneously separate up to 32 channels of target signal raw IQ data, each with a maximum analysis bandwidth of 300 kHz, within an 80 MHz intermediate frequency bandwidth. The system adapts dynamically to transmit multi-channel IQ data at different rates while displaying the broadband signal spectrum, achieving seamless transmission of multi-channel IQ data. During measurement, it supports real-time modification of the number, frequency, and bandwidth of the target signals. This technology lays the foundation for real-time analysis of multi-target signal parameters within a broad band and improves the real-time analysis and processing capabilities of comprehensive testing systems for high-density signals.
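    The per-channel operation described here, mixing a chosen frequency point down to baseband and then reducing the sample rate, can be sketched in software as follows. This is a simplified model under stated assumptions, not the paper's FPGA implementation: the `ddc` function is hypothetical, and a plain block average stands in for the decimation filter, where a real design would typically use CIC or FIR stages.

```python
import cmath
import math


def ddc(samples, fs, f_center, decim):
    """Digital down-conversion of complex IQ samples for one channel.

    samples  : sequence of complex wideband IQ samples
    fs       : input sample rate in Hz
    f_center : frequency of the target signal in Hz
    decim    : integer decimation factor
    """
    # Mix with a complex oscillator so that f_center moves to 0 Hz.
    mixed = [
        s * cmath.exp(-2j * math.pi * f_center * n / fs)
        for n, s in enumerate(samples)
    ]
    # Crude low-pass filter plus decimation: average each block of samples.
    return [
        sum(mixed[i:i + decim]) / decim
        for i in range(0, len(mixed) - decim + 1, decim)
    ]


# Example values are assumptions: an 80 MHz input rate and a 10 MHz tone.
fs, f0 = 80e6, 10e6
tone = [cmath.exp(2j * math.pi * f0 * n / fs) for n in range(800)]
baseband = ddc(tone, fs, f0, decim=100)
print(len(baseband), abs(baseband[0]))
```

    Running one such channel per target signal, each with its own `f_center` and `decim`, mirrors the multi-channel structure described in the abstract, with the different output rates corresponding to the different analysis bandwidths.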

  • Research Paper
    WEN Zhixian, YANG Hongliang, CHU Fumiao

    This paper proposes a performance testing scheme for analog chips based on Huafeng's STS8200, using the UC3842 chip as an example. The paper investigates the testing methods and procedures for several important parameters of the chip, including reference voltage, load regulation, line regulation, oscillator frequency, and rise and fall times. Experimental verification showed that the results for each parameter were within the valid range. After testing 10 chips and performing a 100-loop test on the 10th chip, the test yield was 100%, demonstrating the validity and effectiveness of the testing scheme.