Just Accepted

Note: The articles listed below have been peer-reviewed and accepted for publication in this journal. These articles have not yet been scheduled for a specific issue; their content and layout may undergo minor changes in the final published version. Please refer to the final published version as the definitive one. This journal has assigned each of these articles a unique and persistent DOI. You may use the DOI to cite this article directly.
  • Accepted: 2026-03-27
    Abstract: To address the predictability of interrupt latency in real-time operating systems (RTOS) on multi-core heterogeneous architectures, this paper proposes a layered interrupt latency modeling method and a comprehensive benchmark testing framework, taking the domestically produced RK3588 chip and the SylixOS real-time operating system as research subjects. Theoretical modeling is used to analyze how the hardware architecture and the operating system's scheduling strategies affect interrupt response time, and a test scheme covering multiple scenarios, including single-core idle, mixed load, and full-core high pressure, is designed. The resulting 'modeling-testing-optimization' methodology provides a systematic reference for real-time evaluation and optimization of multi-core heterogeneous platforms.
  • Li Ziyi, Xiong Zhengye, Cai Fanglin
    Accepted: 2026-03-26
    To address the difficulty of assessing stability when the center of gravity changes after ship equipment is renewed, as well as the high cost, long duration, and limited practicality of traditional inclining tests on small and medium-sized fishing vessels, an automatic ship stability measurement system based on embedded technology is designed. The system takes an inertial measurement unit (IMU) as its core sensing module and achieves high-precision measurement of the ship's roll attitude angle and rapid calculation of the metacentric position through multi-source attitude data acquisition, embedded real-time processing, and wireless transmission. Built on a master-slave architecture around the STM32F401 microcontroller, the system integrates an accelerometer and an ultrasonic ranging module and transmits data to the host computer over a 2.4 GHz link, providing multi-dimensional perception of the ship's dynamic response under slight inclination. In an outdoor model-ship test environment, the automated process significantly simplifies the measurement procedure. Compared with traditional manual calculation and observation, the time for a single measurement is greatly reduced. The average relative error of the metacentric height measurement is less than 3%, which verifies the efficiency of the system in data acquisition and algorithm solving. The system provides an efficient, cost-effective, and field-operable solution for assessing the stability of fishing vessels after equipment renewal, meeting the demand for rapid on-site measurement in marine engineering.
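As background for the metacentric height calculation mentioned above, the following is a minimal sketch of the classical inclining-test relation GM = w·d / (Δ·tan θ), where a known weight w is shifted transversely by a distance d on a vessel of displacement Δ and the resulting heel angle θ is measured. The abstract does not state which formula or averaging scheme the system uses, so the function names, the averaging over several shifts, and the sample readings are illustrative assumptions.

```python
import math


def metacentric_height(weight_t, shift_m, displacement_t, heel_deg):
    """Classical inclining-test estimate: GM = w*d / (Delta * tan(theta)).

    weight_t and displacement_t share the same unit (e.g. tonnes), so the
    result is in the unit of shift_m (metres).
    """
    theta = math.radians(heel_deg)
    if abs(theta) < 1e-9:
        raise ValueError("heel angle too small to estimate GM")
    return (weight_t * shift_m) / (displacement_t * math.tan(theta))


def mean_gm(trials, displacement_t):
    """Average GM over several (weight, shift, heel angle) trials."""
    values = [metacentric_height(w, d, displacement_t, a) for w, d, a in trials]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical readings: (shifted weight [t], shift distance [m], heel [deg])
    trials = [(0.5, 2.0, 1.0), (0.5, 4.0, 2.0), (1.0, 2.0, 2.0)]
    print(f"GM = {mean_gm(trials, displacement_t=100.0):.3f} m")
```

In a system like the one described, the heel angle would come from the IMU's filtered roll estimate rather than from a manual pendulum reading; the formula itself is unchanged.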
  • Accepted: 2026-03-26
    The two-transistor capacitorless (2T0C) gain-cell embedded dynamic random access memory (eDRAM) offers long data retention and high potential for three-dimensional (3D) integration, making it a compelling candidate for high-density embedded storage applications. However, write-data uniformity in large-scale 2T0C arrays is susceptible to various degradation mechanisms, which drives the need for precise memory-channel modeling to ensure reliability. Moreover, the storage-node voltage (VSN) suffers from stage-dependent ambiguity due to nonlinear capacitance and coupling effects, which prevents it from serving as a unique state descriptor. To overcome this, we propose a unified Z-channel model centered on the stored charge (QSN) to accurately describe both write and hold operations. By shifting the core descriptor from VSN to QSN, the proposed framework eliminates representation ambiguity while enabling direct quantification of three major degradation mechanisms: the write-history-dependent effect, array parasitics, and leakage during retention. To validate its generality, comprehensive Monte Carlo simulations were conducted across 2T0C arrays fabricated in multiple technology nodes. The results show that scaling down amorphous-oxide-semiconductor field-effect transistors (AOSFETs) effectively suppresses the write-history dependency, improves write uniformity in large 2T0C arrays, and achieves a data retention time of 7500 seconds.
  • LI Quanliang, WANG Chao, WANG Ruilin, JIAO Yang, QIAO Chuan
    Accepted: 2026-03-24
    For long-life products, the FPGA bitstream stored in NOR Flash may experience bit flips due to floating-gate charge leakage, which leads to FPGA configuration failure. To address this issue, this paper proposes a Flash refresh method based on FPGA multiboot. The Flash is refreshed during annual product maintenance to restore the floating-gate charge. The multiboot mechanism of the Kintex-7 series FPGA was investigated, and the composition of the configuration data was restructured to make configuration more robust. The optimized configuration data comprises one header file, two identical copies of the bitstream, and three identical sets of auxiliary files. When a Flash refresh is launched, the FPGA executes a sequence of steps, including self-check, data refresh, and read-back verification, to ensure the reliability of the process. Test results verify that the refresh method is stable and reliable and has high practical engineering value.
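The three-step refresh flow described above (self-check, data refresh, read-back verification) can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: the abstract does not say how integrity is checked, so the CRC-32 comparison, the dictionary standing in for NOR Flash, and the region names `copy_a` and `copy_b` are all hypothetical.

```python
import zlib

COPIES = ("copy_a", "copy_b")  # hypothetical names for the two bitstream copies


def crc_ok(data: bytes, expected_crc: int) -> bool:
    """Integrity check (assumed here to be CRC-32 against a stored reference)."""
    return zlib.crc32(data) == expected_crc


def refresh_flash(flash: dict, expected_crc: int) -> bool:
    """Refresh both bitstream copies in a simulated Flash; True on success."""
    # 1. Self-check: find a copy whose CRC still matches the reference.
    good = None
    for name in COPIES:
        if crc_ok(bytes(flash[name]), expected_crc):
            good = bytes(flash[name])
            break
    if good is None:
        return False  # both copies damaged, nothing trustworthy to rewrite

    # 2. Data refresh: rewrite one copy at a time from the trusted image,
    #    so an intact copy remains available if a rewrite is interrupted.
    for name in COPIES:
        flash[name] = bytearray(good)
        # 3. Read-back verification of the copy just written.
        if not crc_ok(bytes(flash[name]), expected_crc):
            return False
    return True


if __name__ == "__main__":
    image = bytes(range(256)) * 4
    flash = {name: bytearray(image) for name in COPIES}
    flash["copy_a"][10] ^= 0x01  # simulate a single bit flip in one copy
    print("refresh succeeded:", refresh_flash(flash, zlib.crc32(image)))
```

Keeping two identical copies is what makes the self-check useful: a bit flip in one copy can be repaired from the other as long as both have not degraded at the same time.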
  • Accepted: 2026-03-23
    This paper designs and implements an FFT hardware accelerator and host computer system for bridge structural health monitoring. The hardware adopts a sequential radix-4 FFT architecture that reuses a single butterfly unit to perform the computations. This approach preserves full functionality while reducing resource overhead and power consumption, and supports configurable FFT sizes of 4, 16, 64, and 256 points. To enable data interaction and visualization, a host computer platform was also developed for parameter configuration, operational control, and real-time display and analysis of frequency-domain results. The hardware accelerator was verified in a 180 nm CMOS process and operates stably at 100 MHz. Applied to bridge vibration signal processing, the system accurately extracts the primary frequency components, meeting the combined requirements of real-time performance, precision, and energy efficiency for bridge structural health monitoring.
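To make the radix-4 decomposition concrete, here is a minimal software reference model of a radix-4 decimation-in-time FFT for the sizes listed above (4, 16, 64, and 256 points). It is a sketch for illustration, not the accelerator's architecture: the recursion, floating-point arithmetic, and the names `butterfly4` and `fft_radix4` are assumptions. The single `butterfly4` function that every stage calls loosely mirrors the idea of reusing one butterfly unit sequentially.

```python
import cmath


def butterfly4(a, b, c, d):
    """One radix-4 butterfly applied to four already-twiddled inputs."""
    return (
        a + b + c + d,
        a - 1j * b - c + 1j * d,
        a - b + c - d,
        a + 1j * b - c - 1j * d,
    )


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n < 1 or n % 4:
        raise ValueError("length must be a power of 4")
    q = n // 4
    # Four interleaved sub-sequences x[r], x[r+4], x[r+8], ... for r = 0..3
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(q):
        # Twiddle factors W_n^(r*k) for the four branches
        tw = [cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        y = butterfly4(*(tw[r] * subs[r][k] for r in range(4)))
        for m in range(4):
            out[k + m * q] = y[m]
    return out


if __name__ == "__main__":
    # A 16-point cosine with 2 cycles per frame; energy lands in bins 2 and 14.
    signal = [cmath.cos(2 * cmath.pi * 2 * t / 16).real for t in range(16)]
    spectrum = fft_radix4(signal)
    print([round(abs(v), 3) for v in spectrum])
```

A model like this is typically used as a golden reference when checking fixed-point hardware output, with a tolerance that accounts for quantization.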
  • Accepted: 2026-03-23
    With the continuous and rapid development of Artificial Intelligence (AI), machine vision and embedded control have progressively become foundational technologies for the intelligent manufacturing industry. To meet the urgent demand for teaching and experimental platforms amidst the reform of AI education in universities, this paper proposes and implements an intelligent sorting system based on the Robot Operating System 2 (ROS2) framework. The system utilizes a Raspberry Pi as the upper-computer platform for vision acquisition and inference. Real-time video streams are collected via a USB camera, and OpenCV is employed for preprocessing operations, including video frame decoding, color space conversion, and scaling. Subsequently, ONNX Runtime is utilized for the deployment and inference of deep learning models. At the execution level, the system employs an ESP32 microcontroller as the ROS2 lower-level node. It establishes stable communication with the Raspberry Pi over a Local Area Network (LAN) via micro-ROS, enabling precise control of the conveyor belt motor and the pusher mechanism. All nodes operate within the same Wireless LAN (WLAN) and utilize the DDS protocol for rapid node discovery and reliable message transmission. This paper provides a detailed introduction to the system's design, covering hardware structure, vision processing workflows, communication architecture, and actuator control. Furthermore, the stability, real-time performance, and scalability of the system are validated through multiple rounds of experiments. Finally, centering on the requirements of educational platform construction, this paper analyzes the value of the system in experimental teaching and discusses its future application prospects in intelligent manufacturing and university laboratory platforms.
  • Accepted: 2026-03-20
    In compute-in-memory (CIM) employing high-density 2T0C arrays, parasitic capacitances critically determine charge redistribution and bit line integration dynamics, directly impacting storage-node (SN) disturbance and computational linearity. However, the computational cost of conventional extraction methods escalates with array size, which obstructs efficient array-level modeling and system analysis. To address this, we propose a high-accuracy approximation method for extracting parasitics from the central cell of large-scale arrays by leveraging the attenuating coupling of long interconnects. The method constructs a nine-port aggregated equivalent network by bundling non-adjacent word/bit lines and derives a quantitative expression for the minimum truncation distance of key capacitances under a 1% relative-error bound, enabling rapid parameter extraction for the array-level model (AM). This facilitates high-accuracy models for the SN and bit line capacitances (CSN and CRBL) across operational phases, accurately capturing the near-linear scaling of CRBL with array size. Simulations under 10× geometric scaling show a 15% accuracy improvement over the device-level model (DM). Crucially, linearity analysis based on this precise model reveals that using the low-accuracy DM would overestimate the peak integral non-linearity (INL) by approximately 1.5 least significant bits (LSB).
  • Accepted: 2026-03-18
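    Against the dual backdrop of accelerating domestic technological self-reliance and surging demand for AI computing power, the domestic development of core low-level disk-control technology for storage remains insufficient. To address this problem, this paper designs and implements a SATA-standard NAS storage system based on a domestic hardware and software ecosystem. Its core hardware architecture uses a Fudan Microelectronics FPGA and a Loongson 3A3500 processor, and its software layer is built on the Kylin operating system. The FPGA bridges the PCIe and SATA protocols, with core functions that include high-speed PCIe data exchange, HBA controller logic, and SATA protocol processing. On this basis, RAID redundant storage technology is incorporated to design and implement a NAS storage management system that integrates volume group management and multi-protocol file sharing, providing a feasible technical path and practical solution for domestically developed storage.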
  • Accepted: 2026-03-17
    In response to the cybersecurity situation of the new era, China has proposed an active immune protection system based on trusted computing 3.0. In line with China's national standards for the Trusted Platform Control Module (TPCM), a TPCM solution based on a secure SSD and its implementation method are proposed. The security control chip's built-in SM2, SM3, and SM4 modules, random number generator, and OTP controller are used to implement the chip's power-on self-test, secure boot, data encryption and decryption, and key management functions. Based on the secure SSD, authentication software is designed to perform trusted boot measurement of, and response to, the physical environment, including the computer motherboard, BIOS, peripherals, and IP address. Compared with existing TPCM cards, the proposed solution offers high security, good compatibility, strong scalability, convenient deployment, and low cost.
  • Accepted: 2026-03-16
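    Real-time imaging processing for spaceborne synthetic aperture radar (SAR) must support multiple modes and heavy computation under strictly limited onboard resources and harsh space radiation conditions. Developing a dedicated SAR imaging processing system-on-chip (SoC) that accelerates the algorithms through dedicated compute-engine circuits can effectively improve the efficiency of real-time imaging processing. During the design of such a chip, its functional boundaries and abnormal scenarios must be verified, and improving the verification coverage of the dedicated algorithm-acceleration circuits is the core difficulty. This paper proposes a verification method for SAR imaging processing chips based on fine-grained circuit-matched modeling. An independent fine-grained reference model is built for each compute engine; in addition to implementing the arithmetic function, it models characteristics such as the circuit's arithmetic precision and execution priority, which solves the problem that traditional reference models cannot be used for data comparison with SAR compute engines. On this basis, a reusable and extensible verification environment is built with the Universal Verification Methodology (UVM), using a combined verification strategy of algorithm-function tests plus random configuration. Tests show that the proposed method improves compute-engine verification coverage by 14% to 30%, meeting the needs of real-time SAR imaging processing applications.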
  • Accepted: 2026-03-15
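    To meet the industrial sector's pressing demand for efficient, low-power embedded three-dimensional (3D) measurement systems, an FPGA-based 3D measurement system built on a half-period fringe binary coding algorithm architecture is designed. First, a camera control module captures structured-light images, which are buffered and filtered as preprocessing. Next, the parallel computing capability of the FPGA is used to hardware-accelerate the key algorithm modules, including wrapped-phase computation, phase unwrapping, and 3D reconstruction. Finally, a Gigabit Ethernet module transmits the reconstruction results to the host computer for visualization, forming a complete embedded 3D measurement pipeline. Experiments show that the proposed system completes a single reconstruction from one set of 720×540-resolution images in 4 ms, with a system power consumption of 1.687 W and a fitting error of 0.0532 mm when measuring a standard sphere. The system therefore combines low power consumption with good robustness and practicality, providing a feasible solution for implementing embedded 3D measurement systems.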
  • Accepted: 2026-03-13
    To meet the demanding requirements of modern new energy vehicle networks, such as low transmission latency and highly real-time data, a vehicle gateway controller based on CAN FD (Controller Area Network Flexible Data-rate) was designed using the Renesas automotive-grade microcontroller RH850/F1K. It features six bus channels, supports both classical CAN and CAN FD, and includes data storage functionality. The bus interface uses the next-generation CAN transceiver TJA1462 to enhance signal transmission performance. The software is built on the embedded operating system FreeRTOS, which enables multitasking and a structured, layered organization of functions. Additionally, a dedicated driver was developed for the distinctive transmit and receive mechanism of the CAN peripheral on the RH850/F1K microcontroller. The gateway's data transmission and storage capabilities were tested using a USB-to-CAN tool. The results demonstrate that the gateway controller operates stably even when the bus load approaches its usage limit, providing an efficient and reliable solution for vehicle gateway controllers.
  • Chen Zhihong, Du Yuan, Du Li
    Accepted: 2026-03-13
    Ensuring the structural integrity of bridge expansion joints is critical for maintaining traffic safety and prolonging infrastructure lifespan. However, conventional inspection methods are often labor-intensive, time-consuming, and disruptive to traffic flow. To overcome these limitations, this study proposes an edge–cloud collaborative acoustic monitoring system for the long-term health assessment of bridge expansion joints. Over an 18-month monitoring period spanning multiple bridges, approximately 21,000 acoustic signature samples of bridge expansion joints were collected. To ensure accuracy and reliability, all samples were meticulously annotated by experienced highway engineers. Based on this dataset, an adaptive edge–cloud collaborative classification framework was developed. Specifically, the edge device employs a cascade classification model based on Support Vector Machines (SVM) for real-time, lightweight inference, while the cloud leverages a deep learning model based on Gated Recurrent Units (GRU) to perform more complex analysis. The cascaded classification model deployed on the edge device achieved an accuracy of 96.7%, with a 95% confidence interval (CI) of (0.9632, 0.9668). In comparison, the GRU-based classifier running on the cloud attained a higher accuracy of 98.2%, with a 95% CI of (0.9783, 0.9823). Furthermore, the proposed adaptive two-stage classification strategy reduced data transmission to less than 5% of the total collected data. These experimental results demonstrate that the proposed system offers a reliable, efficient, and accurate solution for the acoustic health monitoring of bridge expansion joints.
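One common way to realize a two-stage edge-cloud strategy like the one summarized above is confidence gating: the edge classifier handles samples it is sure about and forwards only uncertain ones to the cloud. The abstract does not describe the actual routing rule, so the sketch below is an assumption throughout; the threshold value, the `(label, confidence)` interface, and the stub models in the usage example are hypothetical stand-ins for the SVM cascade and the GRU classifier.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces: the edge model returns (label, confidence in [0, 1]),
# the cloud model returns a label only.
EdgeModel = Callable[[Sequence[float]], Tuple[str, float]]
CloudModel = Callable[[Sequence[float]], str]


def classify(sample: Sequence[float], edge_model: EdgeModel,
             cloud_model: CloudModel, threshold: float = 0.9) -> Tuple[str, str]:
    """Return (label, source); the cloud is consulted only for uncertain samples."""
    label, confidence = edge_model(sample)
    if confidence >= threshold:
        return label, "edge"
    return cloud_model(sample), "cloud"


def upload_fraction(samples: List[Sequence[float]], edge_model: EdgeModel,
                    cloud_model: CloudModel, threshold: float = 0.9) -> float:
    """Fraction of samples that had to be transmitted to the cloud."""
    sources = [classify(s, edge_model, cloud_model, threshold)[1] for s in samples]
    return sources.count("cloud") / len(sources)


if __name__ == "__main__":
    # Stub models for illustration only: confidence grows with mean amplitude.
    def edge_stub(sample):
        mean = sum(sample) / len(sample)
        return ("damaged" if mean > 0.5 else "healthy", abs(mean - 0.5) * 2)

    def cloud_stub(sample):
        return "damaged" if max(sample) > 0.6 else "healthy"

    data = [[0.05, 0.02], [0.98, 0.99], [0.5, 0.55], [0.01, 0.03]]
    print("uploaded fraction:", upload_fraction(data, edge_stub, cloud_stub))
```

Under this scheme the threshold directly trades transmission volume against how often the more accurate cloud model is consulted.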
  • Accepted: 2026-03-09
    The Host Port Interface (HPI) of an independently developed Digital Signal Processor (DSP) is the core channel for large-scale data interaction between the host and the DSP, allowing the host to directly access the DSP's memory space through a parallel bus, significantly enhancing the efficiency of data exchange. The limitations of the HPI interface in terms of bandwidth, portability, and compatibility with the Enhanced Direct Memory Access (EDMA) interconnection protocol are addressed in this paper, and a complete 16-bit HPI interface based on the Advanced High-performance Bus (AHB) protocol is proposed and designed. The design adopts a modular approach and is implemented in Verilog, including the register configuration module, AHB slave interface module, read cache and control module, host port interface module, and write cache and control module. Functional verification and logic synthesis of the host's read and write access to the HPI were conducted. The synthesis report indicates that at a 40nm process and 200MHz operating frequency, the overall area of the HPI interface based on the AHB bus is 10344µm², with a bandwidth of 66MB/s and a dynamic power consumption of only 0.386mW. The verification results show that the HPI interface, while achieving data transmission, realizes burst data transmission between the AHB protocol and EDMA, providing a high-bandwidth, easy-to-implement host parallel communication interface with stronger portability and universality.
  • Accepted: 2026-03-06
    To address the limited on-chip resources of a self-developed digital signal processor (DSP) and its insufficient storage space for large-scale data and programs, a configurable External Memory Interface (EMIF) is proposed and designed. The design provides flexible access to three types of memory (asynchronous memory, SDRAM, and SBSRAM) through four configurable chip-select space registers, and achieves efficient data transfer between the CPU and external memory by using an enhanced direct memory access module. The EMIF was functionally verified by writing a testbench and checking that the data read back matched the data written. The experimental results show that the design supports burst read and write access for 8/16/32-bit data. In a 40 nm low-power process, the area is 25,119.36 μm² and the power consumption is 1.296 mW.
  • Accepted: 2026-02-25
    As electronic packaging technology advances towards high-density and highly integrated configurations, FC-CCGA packages are increasingly adopted in aerospace and other high-reliability applications due to their superior electrical and thermal performance and I/O density. These packages must withstand vibrational loads during launch and thermal cycling during in-orbit operation, with their core interconnect components susceptible to stress accumulation leading to thermomechanical fatigue failure, directly threatening the reliability and lifespan of the package devices. This study, based on the Anand model, constructs a comprehensive electro-mechanical 3D model encompassing chip bumps, solder columns, epoxy resin, and PCB substrate, systematically investigating the vibration and thermal fatigue reliability of FC-CCGA solder joints and bumps. The analysis includes impact response spectrum assessments pre- and post-electrical reinforcement, simulating the stress-strain response of critical solder joints and bumps under thermal cycling loads, and evaluating the plastic strain distribution and evolution over time. Additionally, the Coffin-Manson model is employed to predict thermomechanical fatigue life of the solder joints and bumps. Temperature cycling tests on packaged devices were conducted, observing morphological changes in solder joints at varying cycle counts; the experimental results align with simulation predictions, providing a theoretical basis and methodological support for optimizing solder joint structures and assessing their lifespan for high-reliability applications.
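For reference, the Coffin-Manson relation used for life prediction is commonly written as Δε_p / 2 = ε_f' · (2·N_f)^c, which can be solved for the number of cycles to failure N_f. The sketch below evaluates that closed form. The material constants in the usage example are illustrative placeholders, not values from the paper; in a study like this one, the plastic strain range would come from the finite-element simulation with the Anand constitutive model.

```python
def coffin_manson_cycles(plastic_strain_range: float,
                         fatigue_ductility_coeff: float,
                         fatigue_ductility_exp: float) -> float:
    """Solve  d_eps_p / 2 = eps_f' * (2 * N_f) ** c  for N_f.

    plastic_strain_range    : plastic strain range per cycle (d_eps_p)
    fatigue_ductility_coeff : eps_f', a positive material constant
    fatigue_ductility_exp   : c, a negative material constant
    """
    if plastic_strain_range <= 0 or fatigue_ductility_coeff <= 0:
        raise ValueError("strain range and ductility coefficient must be positive")
    if fatigue_ductility_exp >= 0:
        raise ValueError("fatigue ductility exponent must be negative")
    ratio = plastic_strain_range / (2.0 * fatigue_ductility_coeff)
    return 0.5 * ratio ** (1.0 / fatigue_ductility_exp)


if __name__ == "__main__":
    # Placeholder constants for illustration only.
    cycles = coffin_manson_cycles(0.01, fatigue_ductility_coeff=0.325,
                                  fatigue_ductility_exp=-0.5)
    print(f"estimated cycles to failure: {cycles:.1f}")
```

Because the exponent is negative, a larger plastic strain range per cycle always yields a shorter predicted life, which is why the highest-strain joint identified in simulation governs the package's fatigue life.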
  • Accepted: 2026-02-14
    To achieve independent controllability and enhance the communication reliability of the EtherCAT Slave Controller (ESC), this paper proposes an FPGA-based redundant communication interface. The design adopts a modular architecture with innovative integration of hardware modules, including link redundancy processing, multi-path frame forwarding, EtherCAT frame parsing, and configurable timing compensation. It supports dual-link redundancy backup and enables real-time detection and isolation of erroneous frames. The system performs automatic frame forwarding and achieves microsecond-level latency through parallel hardware processing. FPGA test results demonstrate that the interface maintains stable communication during single-link failures while consuming minimal hardware resources. The proposed solution effectively improves the fault tolerance and real-time performance of ESC communications, laying a solid technical foundation for the independent development of high-reliability industrial communication chips.
  • Accepted: 2026-02-12
    Short-wave infrared (SWIR) imaging technology, operating in the 1 μm–2.5 μm electromagnetic spectrum, does not rely on object thermal radiation, is less affected by ambient illumination, fills the gap in full-band imaging, and holds significant value across multiple fields. This design develops an uncooled SWIR camera core system based on the ZYNQ platform and a domestic InGaAs focal plane array (FPA) detector to realize image acquisition, preprocessing, and transmission: hardware encompasses a main control board, driver board, TEC temperature control, power supply, and peripheral circuits, with output channels established via Camera Link and Ethernet, while software achieves PL-PS data interaction, completes link synchronization and data reception based on the JESD204B protocol, and optimizes image quality through an improved preprocessing algorithm. The system achieves stable output of 640×512 resolution at 30 fps, boasting light weight, low power consumption, and low cost, which fully meets practical application requirements.
  • Accepted: 2026-02-11
    FPGAs are extensively employed as interface boards in embedded systems. The software running on these chips primarily interacts with peripheral devices. A significant challenge in verifying such FPGA software lies in developing configurable simulation models (supporting both normal and abnormal operations) based on peripheral chip specifications. Currently, these simulation models are predominantly implemented as static constructs using SystemVerilog, Verilog HDL, or VHDL language features. Like RTL models, they are structural in nature, which means that their configuration is static: they cannot be dynamically instantiated or torn down during simulation in response to runtime needs. This type of simulation model is therefore suitable for verification scenarios in which the chip's operating mode remains unchanged during runtime. For scenarios in which the operating mode must switch during runtime, a static simulation model requires a dedicated communication interface to the verification platform. Furthermore, since UVM verification platforms cannot dynamically instantiate such static simulation models, embedding them in a UVM environment prevents the use of UVM's powerful class library ecosystem. To address these issues, this paper uses the UVM standard class library, which builds on the object-oriented features of SystemVerilog, to construct a simulation model of peripheral chips. This simulation model is then embedded in the UVM verification platform. Experimental results demonstrate that the verification platform can dynamically configure the operating modes of the simulation model during runtime. In other words, the model can simulate scenarios that involve switching between multiple operating modes of peripheral chips, thereby enriching the verification scenarios for FPGA software.
    The chip simulation model proposed in this paper, based on the UVM standard class library and integrated with the UVM verification platform, reduces to some extent the complexity of simulating multi-mode switching scenarios for peripheral chips and eases the development and maintenance of simulation models for peripherals with multiple operating modes. This approach provides a new and feasible solution for developing simulation models in the verification of FPGA software related to interface timing control.
  • Accepted: 2026-02-11
    To address the problem of cross-scene object tracking, this paper proposes a scene transformation-based algorithm for stable object tracking. By integrating a feature enhancement module and an object refinement module into the Kernelized Correlation Filter (KCF) algorithm, the contour and detailed information of the target are highlighted. Meanwhile, a U-Net neural network is introduced to accurately detect scene boundaries, and a dynamic update and target recapture strategy is incorporated into the decision-making framework to ensure stable object tracking. Because the increased computational complexity prevents the proposed algorithm from meeting real-time requirements on its own, the Atlas 300I edge computing module is adopted to accelerate it. This approach balances algorithm complexity against tracking performance, improving tracking accuracy while satisfying the real-time constraints of practical engineering applications. Experimental results demonstrate that the proposed algorithm effectively resolves tracking drift in cross-scene object tracking scenarios, providing valuable insights for future research in this field.