Current Issue

    Special Topic on IC Design Automation and High-reliability Design
  • Special Topic on IC Design Automation and High-reliability Design
    CHEN Kehao, LI Zepeng, LIN Ziqing, LIU Genggeng

    As integrated circuit feature sizes continue to shrink, the antenna effect increasingly impacts chip reliability. Layer assignment, a critical step in physical design, allocates 2D routing segments into a multi-layer 3D space. Improper assignment can cause wires to form excessively long antennas that accumulate charge and damage gates. However, existing research primarily focuses on delay and via optimization without adequately considering antenna effects. Moreover, the non-default-rule (NDR) wire technology widely adopted in advanced nodes exacerbates antenna effects because of its larger wire widths. This paper proposes an antenna-aware layer assignment algorithm for advanced technology nodes that comprises four core strategies: (1) an antenna-cost-aware dynamic programming strategy that reduces violations during initialization; (2) a high-layer-priority segment reassignment strategy that precisely controls antenna area growth; (3) a timing-aware NDR replacement strategy that fixes violations while limiting delay impact; and (4) a g-edge resource negotiation strategy that releases routing resources through cross-net coordination. Experimental results demonstrate that the proposed algorithm significantly reduces the number of antenna-violating nets and pins while maintaining excellent delay and via count performance.

  • Special Topic on IC Design Automation and High-reliability Design
    TIAN Chunsheng, ZHAO Xiangyu, WANG Shuo, WANG Zhuoli, CAO Yongzheng, ZHOU Jing, ZHANG Yaowei, CHEN Lei

    The widespread integration of Field Programmable Gate Arrays (FPGAs) in high-performance computing, AI inference, and 5G communications has led to an unprecedented escalation in design scale and timing constraint complexity. These trends impose stringent demands on the runtime efficiency of Static Timing Analysis (STA). Current FPGA STA tools, primarily anchored in single-core or multi-core CPU architectures, are increasingly hitting a performance wall: despite persistent algorithmic refinements, they struggle with computational bottlenecks and suboptimal memory throughput when confronted with large-scale designs. In recent years, Graphics Processing Units (GPUs), with their massive parallel computing capabilities, have provided new opportunities for improving FPGA STA performance. However, challenges in memory access patterns under heterogeneous GPU architectures, the optimization of timing graph loop detection, and heterogeneous parallel acceleration strategies continue to hinder the effectiveness of current GPU-accelerated methods in FPGA STA scenarios. To address these issues, we propose an FPGA STA algorithm accelerated by an efficient heterogeneous parallel strategy. First, targeting the problem of discontinuous memory access and field interleaving in traditional object-oriented data structures under CPU-GPU heterogeneous architectures, a structure-of-arrays (SoA)-based data layout strategy is presented. Combined with data reordering operations, this approach effectively reduces memory access latency and improves bandwidth utilization, providing a data foundation for high-performance FPGA STA computational engines. Second, to overcome the low efficiency and poor robustness of timing graph loop detection, a parallel loop detection optimization algorithm based on color propagation is designed, enabling efficient acceleration in the preprocessing stage of FPGA STA. Furthermore, a task decomposition and timing graph traversal method tailored for CPU-GPU heterogeneous architectures is proposed, achieving efficient acceleration of core STA operations such as delay calculation, levelization, and graph propagation. Finally, experimental results on both OpenCores and industrial-grade FPGA benchmarks demonstrate that, compared with traditional CPU implementations, the proposed method achieves a runtime speedup of 3.125× to 33.333×, with overall performance surpassing that of the OpenTimer tool. This research provides a practical and feasible approach for efficient timing verification in large-scale FPGA designs.
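
    As an illustration only, the levelization and graph propagation steps named in this abstract can be sketched on the CPU as follows. The sketch stores edges as parallel arrays (a simple structure-of-arrays layout) and groups nodes into levels so that every node in a level could be processed independently. The function names, the array names, and the use of Kahn's algorithm for loop detection are assumptions of this sketch; the paper's color-propagation loop detection and GPU kernels are not reproduced here.

```python
def levelize(num_nodes, edge_src, edge_dst):
    """Group the nodes of a timing graph into topological levels.

    Edges are given as two parallel arrays (structure-of-arrays style).
    Uses Kahn's algorithm, which also reveals loops: if some node is never
    released, the graph is not acyclic. (Assumption: this stands in for the
    paper's color-propagation loop detection, which is not shown.)
    """
    fanout = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for s, d in zip(edge_src, edge_dst):
        fanout[s].append(d)
        indegree[d] += 1

    frontier = [n for n in range(num_nodes) if indegree[n] == 0]
    levels = []
    visited = 0
    while frontier:
        levels.append(frontier)
        visited += len(frontier)
        next_frontier = []
        for n in frontier:
            for m in fanout[n]:
                indegree[m] -= 1
                if indegree[m] == 0:
                    next_frontier.append(m)
        frontier = next_frontier

    if visited != num_nodes:
        raise ValueError("timing graph contains a loop")
    return levels


def propagate_arrival(num_nodes, edge_src, edge_dst, edge_delay):
    """Latest arrival time at every node, computed level by level."""
    fanin = [[] for _ in range(num_nodes)]
    for e, d in enumerate(edge_dst):
        fanin[d].append(e)

    arrival = [0.0] * num_nodes
    for level in levelize(num_nodes, edge_src, edge_dst)[1:]:
        # Nodes within one level do not depend on each other, so this
        # inner loop is the part a GPU could run in parallel.
        for n in level:
            arrival[n] = max(arrival[edge_src[e]] + edge_delay[e]
                             for e in fanin[n])
    return arrival


# Hypothetical five-node graph used only for demonstration.
src = [0, 1, 2, 1, 3]
dst = [2, 2, 3, 3, 4]
delay = [1.0, 2.0, 0.5, 4.0, 1.0]
print(levelize(5, src, dst))
print(propagate_arrival(5, src, dst, delay))
```

    Keeping `edge_src`, `edge_dst`, and `edge_delay` as separate arrays means a traversal that needs only delays reads one contiguous array, which is the property an SoA layout is meant to provide.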

  • Special Topic on IC Design Automation and High-reliability Design
    MA Jingbo, ZHANG Guangda, WANG Huiquan, PEI Bingxi, FANG Jian, HUANG Chenglong, LUO Hui, JIANG Yande

    As SoC architectures evolve to meet the computational intensity of diverse AI applications, the pursuit of high-performance throughput must be balanced with uncompromising reliability. Consequently, parity check mechanisms have emerged as a cornerstone of modern circuit design, essential for safeguarding the integrity of massive data movement within the SoC fabric. However, in wide-bit-width data transmission scenarios, traditional parity check circuit designs face challenges such as high verification complexity and significant decoding latency, which in turn constrain the overall performance of SoCs, including system master clock frequency and data access bandwidth. To address this challenge, this paper proposes a multi-stage pipelined parity check circuit design method for the AXI bus in SoC memory. The design employs a pipelined architecture that performs the check in stages, significantly reducing the critical path delay in the data pathway. Experimental results demonstrate that, at a minimal cost of a 0.47% increase in total circuit area and a 0.24% rise in power consumption, the proposed design method optimizes the timing of the data read/write bus critical path, reducing the maximum delay of the AXI bus write and read data circuit paths by 18.62% and 25.60%, respectively, and effectively enhancing the overall performance and reliability of the SoC.
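
    The idea of splitting a wide parity check into stages can be illustrated with a small behavioral model. In this sketch, stage 1 computes one partial parity per chunk of the data word, a register holds the partial results, and stage 2 combines them one cycle later. The 256-bit width, the 32-bit chunk size, the two-stage split, and all function names are assumptions chosen for illustration; they are not taken from the paper's circuit.

```python
def parity(value):
    """Even-parity bit (XOR of all bits) of a non-negative integer."""
    return bin(value).count("1") & 1


def stage1_partial_parities(word, width, chunk_bits):
    """Stage 1: one short XOR tree per chunk of the wide data word."""
    mask = (1 << chunk_bits) - 1
    return [parity((word >> shift) & mask)
            for shift in range(0, width, chunk_bits)]


def stage2_combine(partials):
    """Stage 2: XOR the partial parities into the final parity bit."""
    result = 0
    for p in partials:
        result ^= p
    return result


def pipelined_parity(words, width=256, chunk_bits=32):
    """Cycle-level model of a two-stage pipeline.

    A register sits between the stages, so each result appears one cycle
    after its word enters, but a new word can be accepted every cycle.
    """
    stage_reg = None
    outputs = []
    for word in list(words) + [None]:      # one extra cycle flushes the pipe
        if stage_reg is not None:
            outputs.append(stage2_combine(stage_reg))
        stage_reg = (None if word is None
                     else stage1_partial_parities(word, width, chunk_bits))
    return outputs


# Hypothetical 256-bit data beats used only for demonstration.
words = [0, 1, (1 << 256) - 1, (1 << 255) | 0b11]
print(pipelined_parity(words))
```

    Each stage contains a shallower XOR tree than a single-cycle check over the full word, which is why pipelining can shorten the critical path at the cost of one extra register stage.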

  • Special Topic on IC Design Automation and High-reliability Design
    DAI Yunqiong, WU Yuzhang, WANG Sheng, YU Fuan, SUN Wanghong, ZHANG Yongqiang, WANG Shaowei

    Stochastic computing (SC), an unconventional computational paradigm, employs probabilities to represent numerical values. This representation enables complex arithmetic operations to be performed using simple logic gates. This work presents a fast unary median filtering circuit design. The proposed filter utilizes counters to generate stochastic numbers (SNs) and constructs fundamental sorting network units using stochastic correlation logic. A feedback loop, formed based on the output, dynamically terminates computations early without consuming additional hardware area, significantly reducing circuit latency. Experimental results demonstrate that the proposed median filter design outperforms existing implementations in both actual bitstream length and energy consumption. Specifically, the proposed 3×3 window median filter circuit achieves a 55.58% reduction in energy. Further validation using median filtering on images corrupted by salt-and-pepper noise confirms the accuracy of the proposed circuit. For a 16-input sorting network application, the proposed design exhibits lower consumption when inputs lie within [0, 0.5], achieving up to a 50% reduction in actual bitstream length and energy consumption.
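
    The reason simple gates can sort unary bitstreams is that, for maximally correlated (thermometer-coded) streams, a bitwise AND yields the minimum and a bitwise OR yields the maximum. The sketch below uses that property to build a compare-exchange unit and a 3×3 median. The stream length of 255, the odd-even transposition network, and the function names are assumptions of this sketch; the paper's counter-based generation and early-termination feedback loop are not modeled.

```python
def to_unary(level, length):
    """Thermometer code: `level` ones followed by zeros."""
    return [1] * level + [0] * (length - level)


def compare_exchange(a, b):
    """Sorting-network unit on correlated unary streams.

    Bitwise AND gives the smaller value, bitwise OR the larger one.
    """
    low = [x & y for x, y in zip(a, b)]
    high = [x | y for x, y in zip(a, b)]
    return low, high


def unary_median(pixels, length=255):
    """Median of an odd number of pixel values (0..length).

    Uses an odd-even transposition network (an assumption of this sketch;
    it is simple to verify, not minimal in gate count).
    """
    streams = [to_unary(p, length) for p in pixels]
    n = len(streams)
    for rnd in range(n):
        for i in range(rnd % 2, n - 1, 2):
            streams[i], streams[i + 1] = compare_exchange(streams[i],
                                                          streams[i + 1])
    return sum(streams[n // 2])   # count the ones to decode the value


# Hypothetical 3x3 window with two salt-and-pepper outliers (0 and 255).
window = [12, 200, 35, 255, 0, 90, 90, 17, 64]
print(unary_median(window))
```

    Every stream in the network stays thermometer-coded, so the median output never returns to 1 after its first 0. That monotonic behavior is one plausible hook for stopping a bit-serial computation early, although the paper's feedback mechanism may work differently.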

  • Special Topic on IC Design Automation and High-reliability Design
    WANG Qitao, FENG Haoran, LAO Junjie, YOU Jiaxin, LIN Zefan, LAI Liyang

    With the increasing complexity and integration levels of integrated circuits, Diagnosis-Driven Yield Analysis (DDYA) has become increasingly important in accelerating physical failure analysis and improving yield. However, the low diagnostic resolution of scan chain diagnosis based on scan testing remains a weak link in DDYA. This paper studies a scan chain diagnosis technique based on a hardware architecture improvement called sideway scan. The technique groups scan chains according to clock domain or layout constraints and introduces a cyclic-shift sideway transmission path between adjacent scan chains within each group. By transmitting data from the faulty chain to a normal chain, unloading it, and then analyzing it with the sideway diagnostic algorithm, the technique enables precise diagnosis of various fault scenarios. The architecture offers lower hardware overhead than two-dimensional scan and higher diagnostic resolution than bidirectional scan. Comparative experiments across multiple circuits demonstrate that, compared to software-based scan chain diagnosis, sideway scan achieves up to 41% improvement in single-fault diagnosis resolution, up to 80% in double-fault diagnosis, and up to 168% in triple-fault diagnosis. Meanwhile, across the fault scenarios tested, diagnosis time is reduced by over 90%, with the maximum reduction reaching 99%. The study demonstrates the feasibility, stability, time advantage, and diagnostic resolution advantage of sideway scan, providing a more efficient and precise solution for fault diagnosis in complex integrated circuits.

  • Special Topic on IC Design Automation and High-reliability Design
    ZHOU Shiqi, CAI Huayang, WANG Jingyi, LIU Genggeng

    Continuous-flow microfluidic biochips (CFMBs) are widely used in biochemical analysis due to their high precision and reliability. CFMBs consist of a flow layer and a control layer. To manage complex logic in the control layer with limited control pins, multiplexers are extensively employed. However, the physical design of multiplexers, specifically the co-optimization of valve placement and channel routing, remains underexplored. To address this, this paper proposes a co-optimization method based on Discrete Particle Swarm Optimization (DPSO). First, valve placement regions are constrained via preprocessing to ensure routing feasibility. Second, a DPSO framework encodes placement into particle positions and uses an embedded A* router to provide routing cost as fitness, establishing a closed-loop feedback mechanism between placement and routing. Third, X-architecture routing is introduced to expand the solution space and minimize wirelength. Experimental results demonstrate that the proposed method reduces the average control channel length by 8.27%. Notably, the X-architecture contributes a 5.01% improvement over traditional R-type routing, significantly enhancing both layout quality and routing efficiency.
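
    Why X-architecture routing can shorten channels is easy to see for a single obstacle-free connection: rectilinear routing must cover the horizontal and vertical offsets separately, whereas X-architecture routing can cover the shared part of the two offsets with one 45° segment. The sketch below compares the two lengths. It is a simplified illustration with assumed function names and ignores obstacles, design rules, and the A* search described in the abstract.

```python
import math


def rectilinear_length(dx, dy):
    """Shortest length using only horizontal and vertical segments."""
    return abs(dx) + abs(dy)


def x_architecture_length(dx, dy):
    """Shortest length when 45-degree diagonal segments are also allowed.

    The shared part of the two offsets is covered diagonally (length
    sqrt(2) per unit); the remainder is covered by a straight segment.
    """
    dx, dy = abs(dx), abs(dy)
    diagonal = min(dx, dy)
    straight = max(dx, dy) - diagonal
    return straight + math.sqrt(2) * diagonal


# Hypothetical valve-to-pin offset on a unit grid, for demonstration only.
dx, dy = 3, 4
r = rectilinear_length(dx, dy)
x = x_architecture_length(dx, dy)
print(r, round(x, 3), f"{100 * (r - x) / r:.1f}% shorter")
```

    The saving for one isolated connection is an upper bound on what diagonals offer; in a congested layout the achievable gain is smaller.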

  • Paper
  • Paper
    ZHANG Rong, ZHANG Mingjie, CHEN Xu, CHENG Xiangyu

    Weak thrust signals measured in a high-vacuum environment with strong electromagnetic interference are easily overwhelmed by noise, which leads to low measurement accuracy and poor dynamic response. To address this problem, this paper proposes an optimized signal denoising solution that combines Kalman filtering with electromagnetic shielding. The solution suppresses spatial radiation interference at the physical level by constructing a multi-layer composite electromagnetic shielding system. Simultaneously, it establishes a dynamic model of the thrust measurement system and applies the Kalman filter algorithm for optimal state estimation of the acquired signal, in order to separate the deterministic thrust signal from random process noise and measurement noise. Experimental results show that the proposed scheme can increase the signal-to-noise ratio (SNR) by nearly 30 dB. Compared to traditional low-pass filtering, it significantly improves the system's dynamic response characteristics while effectively suppressing noise, providing a reliable technical pathway for high-precision, high-dynamic-range micro-thrust measurement.
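
    The predict/update cycle of a Kalman filter can be shown with a minimal scalar example. This sketch assumes a random-walk model for the thrust (the state changes only by small process noise between samples), which is much simpler than the dynamic model of the measurement system described in the abstract. The noise variances `q` and `r`, the synthetic signal, and the function name are hypothetical values chosen for demonstration.

```python
import random


def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state.

    q: process-noise variance, r: measurement-noise variance,
    x0/p0: initial state estimate and its variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                 # predict: uncertainty grows by q
        k = p / (p + r)           # Kalman gain
        x = x + k * (z - x)       # update with the measurement residual
        p = (1.0 - k) * p         # posterior variance
        estimates.append(x)
    return estimates


# Synthetic data: a constant 5.0 (arbitrary units) buried in unit-variance noise.
random.seed(0)
true_thrust = 5.0
measurements = [true_thrust + random.gauss(0.0, 1.0) for _ in range(500)]
estimates = kalman_1d(measurements, q=1e-4, r=1.0)


def mse(values):
    return sum((v - true_thrust) ** 2 for v in values) / len(values)


# Compare the error after the filter has settled.
print(mse(measurements[100:]), mse(estimates[100:]))
```

    A small `q` relative to `r` makes the filter trust its model and smooth heavily; a larger `q` lets it follow fast thrust changes at the cost of more residual noise. That trade-off is what governs dynamic response.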

  • Paper
    ZHAO Jie, QUAN Longjie, ZHANG Yunfan, LIU Yu, ZHENG Yikang, FAN Shiquan, CHANG Ke, WANG Jinlei, ZHANG Guohe

    A two-stage operational amplifier with low input bias current, rail-to-rail input, high gain, and high bandwidth has been designed. The design combines a folded-cascode first stage with a class-AB output stage, incorporating a linear transconductance (Gm) loop and gain-boosting techniques. The first stage employs a folded-cascode architecture, achieving rail-to-rail input through parallel NMOS and PMOS input differential pairs. A dedicated current compensation scheme ensures a constant output impedance of the first stage. The second stage uses a class-AB output configuration, in which a translinear loop precisely sets the quiescent current of the output stage, resulting in improved drive capability and reduced power consumption. Gain-boosting techniques are applied to further enhance the output impedance of the cascode structure, thereby increasing the overall DC gain. The op-amp is fabricated in a SMIC 180 nm MS BCD CMOS process. After tape-out, a test platform is developed in-house by designing the test circuitry and fabricating a custom PCB. Key parameters of the amplifier, including input bias current, offset voltage, open-loop gain, small-signal bandwidth, slew rate, and noise, are measured using an oscilloscope, network analyzer, and spectrum analyzer. Test results demonstrate that with a 2 pF load capacitor, the amplifier achieves a low-frequency gain of 90 dB and a gain-bandwidth product (GBW) of 380 MHz.

  • Paper
    TANG Hao

    This paper presents the design of a high-speed, real-time signal processing radar digital front-end based on a Xilinx FPGA. The design makes full use of the FPGA's abundant resources, including logic, RAM, DSP, and high-speed interfaces, to implement functional modules such as 10-Gigabit Ethernet, MicroBlaze, and a high-speed cache. This enables the FPGA to perform control, preprocessing, and high-speed data transmission, resulting in a radar processing front-end with a simple hardware structure, high signal processing capability, and fast data transmission. To meet the data transmission capacity requirements, the high-speed data read-write timing in the software implementation is designed according to the radar waveform characteristics. The design has been successfully applied to real-time processing in surveillance radar projects, achieving excellent results.

  • Paper
    FU Zhaoyong, GUO Tianci, WANG Keling, DENG Yuemei, LIN Fangzhu, LAI Zhuoyuan

    An efficient detection system has been developed based on the colorimetric principle combined with multi-channel parallel detection. The instrument tracks microbial growth by measuring sample turbidity in 96-well plates. It integrates temperature control and shaking culture, which speed up the microbial response and enable online monitoring and cultivation of various microorganisms. At its core, the instrument determines the growth status of microorganisms from the intensity of the signal received by a photodiode. A one-to-eight optical fiber splits the light source so that 8 sample channels are detected independently. The analog signal is then converted to a digital signal by an analog-to-digital converter (ADC) and sent to the main control chip. The main control chip also drives the motor, heating plate, fan, and other loads, providing solution mixing and temperature control. The system ensures detection accuracy and significantly improves detection efficiency.
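
    Turbidity readings of this kind are commonly expressed as optical density, the negative base-10 logarithm of the ratio between the light transmitted through the sample and through a blank. The sketch below applies that standard relation to one column of 8 channels. Whether the instrument uses exactly this calculation is an assumption, and the ADC counts, the dark-offset correction, and the function names are hypothetical.

```python
import math


def optical_density(sample_counts, blank_counts, dark_counts=0.0):
    """Optical density from photodiode ADC readings.

    sample_counts: reading through the culture well
    blank_counts:  reading through a blank (medium only) well
    dark_counts:   reading with the light source off (assumed offset)
    """
    signal = sample_counts - dark_counts
    reference = blank_counts - dark_counts
    if signal <= 0 or reference <= 0:
        raise ValueError("readings must exceed the dark level")
    return -math.log10(signal / reference)


def column_optical_density(samples, blanks, dark_counts=0.0):
    """Optical density for each of the parallel channels in one column."""
    return [optical_density(s, b, dark_counts)
            for s, b in zip(samples, blanks)]


# Hypothetical ADC counts for 8 channels; lower counts mean a more turbid well.
blanks = [10000] * 8
samples = [10000, 9000, 8000, 6000, 4000, 2500, 1500, 1000]
print([round(od, 3) for od in column_optical_density(samples, blanks)])
```

    Plotting these values against time for each well gives the growth curve that the instrument is described as monitoring.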