Current Issue

  • Cover Article
    CHEN Changhao, XU Shimeng, LIN Pengrong

    The growing demand for massive data processing, driven by applications such as artificial intelligence, has greatly promoted the development of chiplet integration technology, which in turn imposes technical requirements on FC-BGA substrates, including large size, low warpage, good electrical performance and high reliability. The emerging glass core substrate has attracted extensive attention owing to its intrinsically low dielectric constant, high thermal stability and chemical inertness. However, glass core substrate technology remains in the initial stage of mass production and lacks comprehensive, reliable and standardized methods for production, application and testing. This article reviews the history, characteristics and present challenges of glass core substrates, and offers a summary and outlook on their future applications in chiplet integration.

  • Research Paper
    WEN Li, ZHANG Yueyang, ZHANG Shenghai

    As semiconductor technology advances towards deep sub-micron nodes, traditional synchronous circuits face increasing challenges from clock skew and high power consumption. The asynchronous architecture, which replaces the global clock signal with a local handshake protocol, is gradually becoming a new paradigm for high-performance computing chip design. Its modular design, inherent immunity to clock skew, and potential for low power consumption position it as a new frontier in chip design. This paper addresses the evolving demands of high-performance integrated circuit chips in emerging fields such as artificial intelligence and the Internet of Things. From the perspectives of the clock tree and the handshake protocol, it analyzes the limitations of synchronous circuits in large-scale integrated circuits and reviews the latest progress in asynchronous circuit technology. Furthermore, the paper discusses the advantages and disadvantages of the handshake protocol in terms of operating speed, energy efficiency, and interference immunity. Finally, key research directions for the future development of asynchronous circuits are proposed.

  • Research Paper
    ZHANG Rong, ZHANG Mingjie, CHENG Xiangyu

    To address the real-time performance degradation caused by frequent UART interrupts during high-volume data communication in embedded systems, a DMA driver optimization scheme for the FreeModbus protocol stack is proposed, based on the GD32E230 microcontroller. By restructuring the UART transmit/receive interrupt service routines and introducing DMA transfers, the scheme significantly reduces UART interrupt frequency and CPU occupancy. Experimental results demonstrate that at a baud rate of 115 200, the number of interrupts triggered when transmitting a 255-byte frame decreases from 256 to 2, with a 99% reduction in CPU occupancy time. This optimization substantially alleviates system load, providing a cost-effective communication enhancement for resource-constrained embedded devices.
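
    As a rough cross-check of the figures in this abstract, the following Python sketch computes the on-wire time of a 255-byte frame and the relative drop in interrupt count. The 8N1 framing (10 bits per byte) is an assumption, since the abstract does not state the frame format; the interrupt counts are taken directly from the abstract. The resulting percentage concerns interrupt count only and is not the CPU occupancy measurement reported by the authors.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
BAUD_RATE = 115_200        # from the abstract
FRAME_BYTES = 255          # from the abstract
BITS_PER_BYTE = 10         # ASSUMPTION: 8N1 framing (start + 8 data + stop)

IRQS_WITHOUT_DMA = 256     # from the abstract
IRQS_WITH_DMA = 2          # from the abstract


def frame_time_s(n_bytes, baud, bits_per_byte=BITS_PER_BYTE):
    """Time the frame occupies the serial line, in seconds."""
    return n_bytes * bits_per_byte / baud


def relative_reduction(before, after):
    """Fraction by which a count shrinks when going from `before` to `after`."""
    return (before - after) / before


frame_ms = frame_time_s(FRAME_BYTES, BAUD_RATE) * 1e3
irq_reduction = relative_reduction(IRQS_WITHOUT_DMA, IRQS_WITH_DMA)

print(f"frame time on the wire: {frame_ms:.2f} ms")
print(f"interrupt count reduction: {irq_reduction:.1%}")
```

    Under these assumptions the frame occupies the line for roughly 22 ms, so cutting the interrupts within that window from 256 to 2 removes nearly all interrupt entry and exit overhead for the transfer.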

  • Research Paper
    LIANG Nan, LI Sen, ZHANG Chunfei, LIU Pengfei

    To meet the needs of personalized talent training in Emerging Engineering Education, an experimental platform for moving object detection based on Python and an embedded development board is designed. The platform combines technologies such as edge computing, deep learning and image recognition in the design of its experiment modules. The platform is configured on the Linux system of the embedded development board. Moving target detection algorithms based on the frame difference method, background subtraction and optical flow are developed in Python. Deep learning algorithms are deployed with the Tencent ncnn computing framework to recognize and locate moving objects. The platform allows students to choose different experimental projects and methods according to their research interests, thereby improving the teaching effect.
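
    The frame difference method named in this abstract can be illustrated with a minimal Python sketch. It is not the platform's code: the function names, the threshold of 25 and the synthetic 32×32 frames are all illustrative assumptions, and plain nested lists stand in for the image arrays a real deployment would use.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose grey level changed by more than `threshold`.

    Frames are lists of rows of integer grey levels (0-255).
    The threshold value is an illustrative assumption.
    """
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of changed pixels, or None if none changed."""
    hits = [(r, c) for r, row in enumerate(mask) for c, moved in enumerate(row) if moved]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


def make_frame(height, width, top, left, size, value=200):
    """Build a synthetic black frame containing one bright square."""
    frame = [[0] * width for _ in range(height)]
    for r in range(top, top + size):
        for c in range(left, left + size):
            frame[r][c] = value
    return frame


# A 4x4 bright square moves three pixels to the right between two frames.
prev_frame = make_frame(32, 32, top=10, left=5, size=4)
curr_frame = make_frame(32, 32, top=10, left=8, size=4)

mask = frame_difference_mask(prev_frame, curr_frame)
box = bounding_box(mask)
print(box)  # (10, 5, 13, 11)
```

    The box spans both the old and the new position of the square, and the one column where the two positions overlap is not flagged because its grey level did not change. This is a known property of plain frame differencing and one reason it is usually taught alongside background subtraction and optical flow.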

  • Research Paper
    LI Cong, RAO Kemeng, LIN Yongmei, HUANG Futong, HAN Tuanjun

    To address the rehabilitation needs of patients with arm disabilities, this paper proposes a rehabilitation robot arm control system based on surface electromyography (sEMG) signals. Three wet electrode sensors collect the electromyographic signals, and the collected data is denoised using a wavelet transform together with software and hardware filtering. The filtered signal is classified using a combination of three time-domain identification methods. The identified action is sent wirelessly to the execution unit, where an actuator composed of an STM32 microcontroller and a servo performs the corresponding movement. In platform testing, the accuracy of action recognition reaches up to 90%, confirming that the system meets the requirements of medical rehabilitation equipment.
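
    The abstract does not name its three time-domain methods. Purely as an illustration of what time-domain sEMG analysis can look like, the Python sketch below computes three widely used features over a window of samples: mean absolute value, root mean square and zero-crossing count. The choice of these three features, the function names and the dead-band value are assumptions and should not be read as the authors' method.

```python
import math


def mean_absolute_value(window):
    """Average rectified amplitude of the window."""
    return sum(abs(x) for x in window) / len(window)


def root_mean_square(window):
    """RMS amplitude, related to signal power."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def zero_crossings(window, dead_band=0.01):
    """Count sign changes between neighbouring samples.

    `dead_band` (an assumed value) ignores tiny crossings caused by noise.
    """
    count = 0
    for a, b in zip(window, window[1:]):
        if a * b < 0 and abs(a - b) > dead_band:
            count += 1
    return count


def time_domain_features(window):
    """Bundle the three features into one tuple for a downstream classifier."""
    return (
        mean_absolute_value(window),
        root_mean_square(window),
        zero_crossings(window),
    )


# Synthetic windows: an active (contracting) segment and a resting segment.
active = [0.5, -0.5, 0.5, -0.5]
rest = [0.0, 0.0, 0.0, 0.0]

print(time_domain_features(active))  # (0.5, 0.5, 3)
print(time_domain_features(rest))    # (0.0, 0.0, 0)
```

    Feature tuples like these are typically compared against thresholds or fed to a small classifier to decide which action was performed.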

  • Research Paper
    WANG Yi, ZHANG Pingjuan, GUO Shijun

    To address the resource constraints and energy efficiency bottlenecks in FPGA deployment of super-resolution networks, a lightweight hardware acceleration solution for the ESPCN super-resolution network is proposed. At the algorithm level, the ESPCN network is simplified to significantly reduce its computational complexity, and 16-bit fixed-point quantization is employed to further enhance computing efficiency. In the hardware architecture design, targeted optimizations are implemented for the hardware realization of standard convolution, pointwise convolution, and sub-pixel convolution. Experimental results indicate that the accelerator, deployed on the ZYNQ7035 platform and operating at 210 MHz, reconstructs a 480×270 image to 1 920×1 080 resolution with a forward inference time of only 49.8 ms per image and a total on-chip power consumption of 4.17 W.
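
    The rearrangement step of sub-pixel convolution (often called pixel shuffle) can be sketched in a few lines of Python. This is a software illustration only, not the accelerator's hardware design: the channel ordering follows a common convention that the abstract does not specify, and the function name and the tiny test input are assumptions. The last lines derive the upscaling factor and frame rate implied by the figures in the abstract.

```python
def pixel_shuffle(feature_maps, r):
    """Rearrange C*r*r feature maps of size HxW into C maps of size (H*r)x(W*r).

    `feature_maps` is a list of 2-D lists. The channel ordering used here
    (input channel c*r*r + i*r + j feeds output offset (i, j)) is an
    ASSUMED convention, not taken from the abstract.
    """
    n_in = len(feature_maps)
    if n_in % (r * r) != 0:
        raise ValueError("number of input maps must be a multiple of r*r")
    n_out = n_in // (r * r)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    out = [[[0] * (width * r) for _ in range(height * r)] for _ in range(n_out)]
    for c in range(n_out):
        for i in range(r):
            for j in range(r):
                src = feature_maps[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Tiny example: four 1x2 maps with r = 2 become one 2x4 map.
maps = [[[k * 10 + w for w in range(2)]] for k in range(4)]
upscaled = pixel_shuffle(maps, 2)
print(upscaled[0])  # [[0, 10, 1, 11], [20, 30, 21, 31]]

# Figures from the abstract: 480x270 input, 1920x1080 output, 49.8 ms per image.
scale = 1920 // 480
frames_per_second = 1000 / 49.8
print(scale, f"{frames_per_second:.1f}")  # 4 20.1
```

    With an upscaling factor of 4, the final convolution must produce 16 feature maps per output channel before this rearrangement, and 49.8 ms per image corresponds to about 20 frames per second.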

  • Research Paper
    LIU Yu, ZHANG Jie, ZHOU Le

    This paper proposes a design for a compute-in-memory processing unit (CIMPU) tailored for high-performance computing processors. The CIMPU integrates a multi-precision arithmetic operator and on-chip storage, enabling computations to be performed locally without accessing external buses. A hardware pipeline is further designed around the CIMPU's architectural characteristics to optimize processing efficiency. Evaluation of the design shows a performance-to-power ratio of 2.47 TOPS/W at INT8 precision, which is significantly better than that of other similar processor architectures and makes the design suitable for large-scale deployment as a high-computing-power processor core.

  • Research Paper
    LIU Hailiang, CHEN Sikun, RONG Daohui

    Based on the current productization features of phase-change memory (PCM), this paper conducts targeted research on four application scenarios in the SSD controller chip: ROM patch loading, abnormal log saving, power-off data preservation, and storage of digital certificates with key update. Storing ROM patches saves the ECO cost of the chip and reduces the production cost. Storing abnormal logs backs up key log and register information when the CPU is in an abnormal state, greatly reducing the difficulty of locating problems. Storing digital certificates with key update functionality ensures that sensitive information remains within the main control chip, enhancing security and reliability. Supporting power-off data preservation greatly reduces the capacity required of the backup capacitor and lowers the cost of industrial products, making it possible for consumer-grade SSDs to support data preservation on abnormal power-off. Based on these application requirements, a PCM controller was designed and implemented. Simulation results show that the PCM controller functions as designed. With the continuous improvement of PCM's read-write performance and lifespan, PCM has broad application prospects in solid-state storage and other fields.

  • Research Paper
    HU Jinhui, YU Xiaodong

    To improve the clock synchronization performance of the CT detector, this study proposes a clock synchronization method utilizing a Time-to-Digital Converter (TDC) feedback mechanism and presents a complete prototype architecture of the synchronization system. For the first time, the TDC delay measurement technique based on carry chains is introduced into the clock synchronization scenario of CT detectors. A high-resolution delay measurement module is designed, incorporating multi-level comparators and high-precision delay elements to implement the timing scheduling logic. A finite state machine controls the synchronization process, forming a closed-loop feedback synchronization mechanism. Simulation experiments demonstrate that in a complex scenario with 512 channels and a maximum transmission distance of 1.28 m, the system's synchronization precision remains stable within 10 ps, an improvement of two orders of magnitude over conventional solutions. Furthermore, the system can apply dynamic compensation to maintain the 10 ps synchronization precision even in the presence of interference-induced jitter. These advances significantly improve the precision and reliability of clock synchronization in CT detector systems.