Just Accepted

Note: The articles listed below have been peer-reviewed and accepted for publication in this journal. These articles have not yet been scheduled for a specific issue; their content and layout may undergo minor changes in the final published version. Please refer to the final published version as the definitive one. This journal has assigned each of these articles a unique and persistent DOI. You may use the DOI to cite this article directly.
  • Accepted: 2026-02-14
    To achieve independent controllability and enhance the communication reliability of the EtherCAT Slave Controller (ESC), this paper proposes an FPGA-based redundant communication interface. The design adopts a modular architecture with innovative integration of hardware modules, including link redundancy processing, multi-path frame forwarding, EtherCAT frame parsing, and configurable timing compensation. It supports dual-link redundancy backup and enables real-time detection and isolation of erroneous frames. The system performs automatic frame forwarding and achieves microsecond-level latency through parallel hardware processing. FPGA test results demonstrate that the interface maintains stable communication during single-link failures while consuming minimal hardware resources. The proposed solution effectively improves the fault tolerance and real-time performance of ESC communications, laying a solid technical foundation for the independent development of high-reliability industrial communication chips.
  • Accepted: 2026-02-12
    Short-wave infrared (SWIR) imaging, which operates in the 1 μm–2.5 μm band, does not rely on the thermal radiation of objects, is less affected by ambient illumination, fills a gap in full-band imaging, and is valuable across multiple fields. This work develops an uncooled SWIR camera core based on the ZYNQ platform and a domestic InGaAs focal plane array (FPA) detector to perform image acquisition, preprocessing, and transmission. The hardware comprises a main control board, a driver board, TEC temperature control, a power supply, and peripheral circuits, with output channels provided over Camera Link and Ethernet. The software handles PL-PS data interaction, performs link synchronization and data reception based on the JESD204B protocol, and optimizes image quality through an improved preprocessing algorithm. The system delivers stable 640×512 output at 30 fps and is lightweight, low-power, and low-cost, meeting practical application requirements.
  • Accepted: 2026-02-11
    FPGAs are extensively employed as interface boards in embedded systems. The software running on these chips primarily interacts with peripheral devices. A significant challenge in verifying such FPGA software lies in developing configurable simulation models (supporting both normal and abnormal operations) based on peripheral chip specifications. Currently, these simulation models are predominantly implemented as static constructs using SystemVerilog, Verilog HDL, or VHDL language features. Like RTL models, they are structural in nature: their configuration is static, and they cannot be dynamically instantiated or torn down during simulation according to runtime needs. This type of simulation model therefore suits verification scenarios in which the chip's operating mode remains unchanged during runtime. Where the operating mode must switch during runtime, a static simulation model requires a dedicated communication interface with the verification platform. Furthermore, since UVM verification platforms cannot dynamically instantiate such static simulation models, embedding them into a UVM environment prevents the use of UVM's powerful class library ecosystem. To address these issues, this paper employs the UVM standard class library, which leverages the object-oriented features of SystemVerilog, to construct a simulation model of peripheral chips. This simulation model is then embedded into the UVM verification platform. Experimental results demonstrate that the verification platform can dynamically configure the operating modes of the simulation model during runtime. In other words, the model can simulate scenarios that involve switching between multiple operating modes of peripheral chips, thereby enriching the verification scenarios for FPGA software.
The chip simulation model proposed in this paper, based on the UVM standard class library and integrated with the UVM verification platform, reduces to some extent the simulation complexity of peripheral chip multi-mode switching scenarios and alleviates the development and maintenance difficulties associated with simulation models for peripherals featuring multiple operating modes. This approach provides a new and feasible solution for developing simulation models in the verification of FPGA software related to interface timing control.
  • Accepted: 2026-02-11
    To address the problem of cross-scene object tracking, this paper proposes a scene transformation-based algorithm for stable object tracking. By integrating a feature enhancement module and an object refinement module into the Kernelized Correlation Filter (KCF) algorithm, the contour and detailed information of the target are highlighted. Meanwhile, the U-Net neural network is introduced to achieve accurate detection of scene boundaries, and a dynamic update and target recapture strategy is incorporated into the decision-making framework to ensure stable object tracking. Because the increased computational complexity prevents the proposed algorithm from meeting real-time requirements on its own, the Atlas 300I edge computing module is adopted for algorithm acceleration. This approach balances algorithm complexity against tracking performance, improving tracking accuracy while satisfying the real-time constraints of practical engineering applications. Experimental results demonstrate that the proposed algorithm can effectively resolve the issue of tracking drift in cross-scene object tracking scenarios, providing valuable insights for future research in this field.
  • Accepted: 2026-02-10
    To address the demand for miniaturized, low-power Synthetic Aperture Radar (SAR) systems in unmanned platforms, this study developed a high-speed imaging micro-system leveraging heterogeneous System-in-Package (SiP) technology. The system integrates an FPGA, a DSP, multiple DDR3 memory chips, and an LDO. Comprehensive electrical, thermal, and mechanical simulations validated the system’s signal integrity, thermal management, and structural robustness. Electrical simulations confirmed that key DDR3 signals at 1600 MT/s and SerDes signals at 5 GT/s meet design requirements in terms of insertion loss, return loss, and crosstalk. Thermal simulations verified that the junction temperature of chips remains controllable with the application of a cold plate. Mechanical analysis demonstrated that the stress under thermal cycling and vibration shock conditions remains below the material limits. Imaging experiments showed close agreement between point-target responses, real-world data, and MATLAB simulations, with key metrics such as Peak Side Lobe Ratio (PSLR) meeting performance targets. These results demonstrate the micro-system’s viability for enabling lightweight, high-performance SAR processing in unmanned platforms.
  • Accepted: 2026-01-28
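    To address the short endurance and frequent maintenance of cableless photovoltaic panel cleaning robots operating without mains power, this paper designs a low-power embedded system based on hardware-software co-optimization. Built around the ultra-low-power STM32L496 microcontroller, the system uses a tiered power management circuit comprising high-efficiency DC-DC converters and low-quiescent-current LDOs, provides independent software-controlled power switches for peripheral modules such as communication and sensors, and applies operating-condition-adaptive low-power optimization to the high-power servo system. An RTC-based timed wake-up mechanism and a dynamic power management strategy are implemented on FreeRTOS, allowing the system to enter a microampere-level deep sleep state during non-working periods. Theoretical modeling and analysis show that, compared with a scheme without low-power optimization, the average power consumption over a typical duty cycle is reduced by about 82.2%, and the theoretical endurance of a robot carrying a 10000 mAh lithium battery increases from 7.2 days to 40.5 days, significantly improving its sustained operating capability and the economics of operation and maintenance.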
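The endurance figures in this abstract can be cross-checked with simple arithmetic. The sketch below is not the authors' power model; it assumes that endurance is inversely proportional to average power (constant usable battery energy, conversion losses ignored) and that the full 10000 mAh is usable. The function names are illustrative.

```python
# Back-of-envelope check of the endurance figures quoted in the abstract.
# Assumption: endurance scales inversely with average power, and the whole
# battery capacity is usable. Neither assumption comes from the paper.

BATTERY_MAH = 10_000      # battery capacity quoted in the abstract
BASELINE_DAYS = 7.2       # endurance without low-power optimization
OPTIMIZED_DAYS = 40.5     # endurance with the proposed design


def implied_power_reduction(baseline_days: float, optimized_days: float) -> float:
    """Fractional drop in average power implied by two endurance figures."""
    return 1.0 - baseline_days / optimized_days


def average_current_ma(capacity_mah: float, days: float) -> float:
    """Average battery current (mA) that drains capacity_mah in the given days."""
    return capacity_mah / (days * 24.0)


reduction = implied_power_reduction(BASELINE_DAYS, OPTIMIZED_DAYS)
print(f"implied power reduction: {reduction:.1%}")
print(f"baseline average current: {average_current_ma(BATTERY_MAH, BASELINE_DAYS):.2f} mA")
print(f"optimized average current: {average_current_ma(BATTERY_MAH, OPTIMIZED_DAYS):.2f} mA")
```

Under these assumptions the implied reduction evaluates to about 82.2%, which is consistent with the figure stated in the abstract.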
  • Accepted: 2026-01-27
    Addressing the challenges of heavy maintenance tasks, difficult physical layer fault localization, and high communication latency associated with traditional "FPGA+ARM" dual-chip architectures in current Multifunctional Vehicle Bus (MVB) networks, this paper designs an MVB bus waveform acquisition and fault diagnosis system based on the ZYNQ heterogeneous SoC platform. First, leveraging the on-chip high-bandwidth interconnection characteristics of the PL (Programmable Logic) and PS (Processing System) in the ZYNQ architecture, a hardware-software co-design acquisition architecture is constructed, resolving the timing bottleneck of cross-chip transmission. Second, for common short-circuit, open-circuit, and impedance mismatch faults in the MVB physical link, a diagnosis method combining key time-domain feature extraction and an expert rule base is proposed. Decision thresholds are set based on the IEC 61375 standard and statistical criteria, replacing traditional single-threshold judgment. Finally, a semi-physical simulation experimental platform was built for system verification. Test results show that the system can accurately reconstruct high-speed MVB signal waveforms; in a laboratory environment, the end-to-end delay from acquisition to diagnosis is controlled at the millisecond level, and typical physical layer faults can be accurately diagnosed. This design achieves deep on-chip coupling of data acquisition and intelligent processing, providing a highly integrated engineering solution for train network status monitoring.
  • Accepted: 2026-01-27
    To address the data stream conversion problem between serial input and parallel output of four LCoS zones in 8K LCoS drivers for high-definition multimedia interfaces, and to meet the high-performance driving requirements of ultra-high-definition displays, this paper studies a frame buffer module based on DDR SDRAM. The module is developed in Verilog HDL, with control logic built on the AXI bus protocol and an FDMA IP core. A FIFO buffer handles data transfer across asynchronous clock domains. Address partitioning enables independent read/write operations across the four zones, address control implements vertical image flipping, and a synchronization signal module adapts to the output timing of the 8K LCoS. Results show that the module can stably buffer data at a resolution of 7680×4320, supports pre-storing at least 3 frames, and converts serial data into parallel output across four zones.
  • Accepted: 2026-01-22
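    To address the problem that weak thrust measurement signals in a vacuum environment with strong electromagnetic interference are easily swamped by noise, resulting in low measurement accuracy and poor dynamic response, this paper proposes a signal denoising scheme that combines Kalman filtering with electromagnetic shielding. The scheme builds a multi-layer composite electromagnetic shielding system to suppress radiated interference at the physical level. In parallel, it establishes a dynamic model of the force measurement system and applies the Kalman filter to obtain an optimal state estimate of the acquired signal, separating the deterministic thrust signal from random process noise and measurement noise. Experimental results show that the scheme improves the signal-to-noise ratio (SNR) by nearly 30 dB and, compared with conventional low-pass filtering, significantly improves the dynamic response of the system while effectively suppressing noise, providing a reliable technical approach for micro-thrust measurement with high accuracy and high dynamic range.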
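The Kalman filtering step in this abstract follows the usual predict and update cycle. The sketch below is a deliberately simplified scalar filter that models thrust as a slowly drifting level (a random walk); the paper instead derives its model from the dynamics of the force measurement system, which the abstract does not specify. The noise variances `q` and `r`, the thrust value, and the sample count are all hypothetical.

```python
import random


def kalman_1d(measurements, q=1e-5, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly varying level (random-walk model).

    q is the assumed process noise variance, r the measurement noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is carried over, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


random.seed(0)
TRUE_THRUST = 1.0   # hypothetical constant thrust, arbitrary units
NOISE_STD = 0.5     # hypothetical measurement noise standard deviation
raw = [TRUE_THRUST + random.gauss(0.0, NOISE_STD) for _ in range(500)]
est = kalman_1d(raw, r=NOISE_STD ** 2)


def mse(values, start=250):
    """Mean squared error against the true thrust over the settled samples."""
    tail = values[start:]
    return sum((v - TRUE_THRUST) ** 2 for v in tail) / len(tail)


print(f"raw MSE: {mse(raw):.4f}, filtered MSE: {mse(est):.4f}")
```

A smaller `q` trusts the model more and smooths harder, at the cost of slower tracking when the thrust actually changes; that trade-off is where a real dynamic model would help relative to a plain low-pass filter.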
  • Accepted: 2026-01-22
    This paper presents the design of a high-speed real-time signal processing radar digital front-end based on a Xilinx FPGA. The FPGA in this radar front-end fully utilizes its abundant resources, including logic, RAM, DSP, and high-speed interfaces, to implement functional modules such as 10-Gigabit Ethernet, MicroBlaze, and high-speed cache. This enables the FPGA to perform control, preprocessing, and high-speed data transmission, resulting in a radar processing front-end with a simple hardware structure, high signal processing capability, and fast data transmission speed. In the software implementation, the high-speed data read-write timing is designed according to radar waveform characteristics to meet the data transmission capacity requirements. The design has been successfully applied to real-time processing in surveillance radar projects.
  • Accepted: 2026-01-20
    A two-stage operational amplifier with low input bias current, rail-to-rail input, high gain, and high bandwidth has been designed by combining a folded-cascode first stage with a class-AB output stage, incorporating a linear transconductance (Gm) loop and gain-boosting techniques. The first stage employs a folded-cascode architecture, achieving rail-to-rail input through parallel NMOS and PMOS input differential pairs. A dedicated current compensation scheme ensures a constant output impedance of the first stage. The second stage utilizes a class-AB output configuration, where a translinear loop precisely sets the quiescent current of the output stage, resulting in improved drive capability and reduced power consumption. Gain-boosting techniques are applied to further enhance the output impedance of the cascode structure, thereby increasing the overall DC gain. The op-amp is fabricated in an SMIC 180 nm MS BCD CMOS process. After tape-out, a test platform was developed in-house by designing the test circuitry and fabricating a custom PCB. Key parameters of the amplifier, including input bias current, offset voltage, open-loop gain, small-signal bandwidth, slew rate, and noise, were measured using an oscilloscope, network analyzer, and spectrum analyzer. Test results demonstrate that with a 2 pF load capacitor, the amplifier achieves a low-frequency gain of 50 dB and a gain-bandwidth product (GBW) of 380 MHz.
  • Accepted: 2026-01-20
    The selection of aerospace components is a critical link in space missions. Traditional selection methods suffer from low efficiency and heavy reliance on professional expertise. This paper designs and implements an intelligent selection and recommendation system for aerospace components based on large language models (LLMs) and retrieval-augmented generation (RAG) technology. Adopting an agent-based architecture, the system performs end-to-end processing, converting users' natural language requirements into component recommendation schemes. A professional knowledge base is constructed, encompassing 468 aerospace-grade components and 60 system-level bills of materials (BOMs). A multi-strategy cascaded retrieval mechanism integrating exact matching and semantic understanding is designed, and a hallucination prevention and control mechanism is developed to meet aerospace safety requirements. The system handles application scenarios including single component recommendation and system-level scheme generation. In an evaluation experiment involving 65 test cases, the system achieves a macro-average F1-score of 0.829, representing a 10.4% improvement over manual keyword retrieval and a 69.5% improvement compared to pure LLM methods, verifying the system's effectiveness.
  • Yansong LI, Jiehong FANG, Xudong LU, Jingzhu WU, Wanang XIAO
    Accepted: 2026-01-20
    This paper presents the design and implementation of a highly efficient and stable electromagnetic wave energy harvesting circuit targeting ultra-high-frequency (UHF) RFID applications. The system consists of a rectifier, a reference circuit, and a voltage regulator, collectively providing a reliable power supply for other chip modules. Key achievements include: a differential rectifier achieving 67% power conversion efficiency (PCE) with 1.84 V output at -11 dBm input power; an ultra-low-power CMOS reference circuit with temperature coefficients of 34.9 ppm/°C (voltage) and 18.4 ppm/°C (current), and a power supply rejection ratio (PSRR) better than -93 dB at 100 Hz; and an LDO-based voltage regulator exhibiting a load regulation of 41.28 mV/mA under varying load conditions. Additionally, clock, reset, demodulation, and modulation circuits are integrated. The circuit is implemented in a 180 nm CMOS process, occupying a layout area of 341 μm × 346 μm. Post-layout simulations indicate a total power consumption of 6 μW. Silicon measurements confirm that all performance metrics meet the requirements for high conversion efficiency and power supply stability in UHF RFID chips.
  • He Xiang, Guan quansheng, Lin Jiaqun, Liao Shiwen
    Accepted: 2026-01-16
    Facing the processing performance bottleneck of current satellite communication terminals in high-throughput IP service scenarios, this paper proposes a high-speed IP service transmission method based on a CPU+FPGA SoC architecture. The core of this method lies in the separation and cooperation of the control plane and the data plane: functions required for high-speed data forwarding, such as IP access, route addressing, and link frame assembly/disassembly, are offloaded to the FPGA to form a high-performance data plane; control plane logic such as route maintenance and protocol interaction is handled by the CPU. Through key technologies like event-driven synchronization and hardware-level QoS scheduling, engineering challenges in collaborative design, such as entry synchronization and low-latency signaling guarantee, are overcome. This achieves a significant performance leap on existing hardware, providing an effective solution for the smooth performance upgrade of large-scale in-network satellite terminals and the design of compact, low-power terminals.
  • Accepted: 2026-01-13
    For the hot backup working mode of dual-channel redundant electronic controllers, a design scheme of a multi-functional channel management module based on CPLD is proposed to achieve efficient data exchange and synchronization between primary and backup channels. Serial Peripheral Interface (SPI) is used for communication and synchronization, with the management module serving as the SPI communication host to implement data broadcasting to both primary and backup channels and direct bridging between them, while also being able to monitor communication data. Simulation verification shows that the serial communication management functions correctly, and that the delay introduced by the sequential logic in SPI bridging and broadcasting is 30 ns, indicating good real-time performance. Tests on the SPI data transmission function of the designed dual-channel redundant electronic controller show that data integrity is good in both serial loopback and SPI data exchange tests, meeting the design requirements.
  • Accepted: 2026-01-12
    Traditional ultrasonic anemometers suffer from reduced acoustic intensity emitted by the transducer and reduced received-signal sensitivity under rain, snow, strong wind, dust, and long-term use, which leads to low accuracy and poor stability in wind speed and direction measurements. To address this, an ultrasonic anemometer based on adaptive adjustment of signal gain has been designed. The hardware mainly consists of the drive and receive circuits for ultrasonic signal transmission, a programmable signal gain amplification and adjustment circuit, a filter circuit, an A/D acquisition circuit, and an MCU control circuit. The software mainly employs the time difference method, the ring tone method, the cross-correlation method, and an adaptive gain control algorithm based on binary search to calculate wind speed and direction. Experimental results indicate that the designed anemometer yields accurate and stable wind speed and direction data in harsh environments: within the range of 0-15 m/s, the error is within ±0.5 m/s; for wind speeds of 15-40 m/s, the error does not exceed ±3%; and the absolute error of wind direction is less than 3°.
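The time difference method named in this abstract has a compact closed form: with transit times measured in both directions along one acoustic path, the wind component along the path follows from the difference of the reciprocal times, independent of the speed of sound. The sketch below illustrates only that textbook relation; the path length and test values are hypothetical, and the paper's gain adaptation, ring tone, and cross-correlation steps are not modeled.

```python
def wind_speed_along_path(path_m: float, t_down_s: float, t_up_s: float) -> float:
    """Wind component along the path from downwind and upwind transit times.

    t_down = L / (c + v) and t_up = L / (c - v), so
    v = (L / 2) * (1 / t_down - 1 / t_up).
    """
    return 0.5 * path_m * (1.0 / t_down_s - 1.0 / t_up_s)


def sound_speed(path_m: float, t_down_s: float, t_up_s: float) -> float:
    """Speed of sound recovered from the same pair of transit times."""
    return 0.5 * path_m * (1.0 / t_down_s + 1.0 / t_up_s)


# Hypothetical test case: 0.2 m path, 340 m/s sound speed, 10 m/s wind.
PATH_M = 0.2
TRUE_C = 340.0
TRUE_V = 10.0
t_down = PATH_M / (TRUE_C + TRUE_V)   # pulse travelling with the wind
t_up = PATH_M / (TRUE_C - TRUE_V)     # pulse travelling against the wind

v_est = wind_speed_along_path(PATH_M, t_down, t_up)
c_est = sound_speed(PATH_M, t_down, t_up)
print(f"estimated wind speed: {v_est:.3f} m/s, sound speed: {c_est:.3f} m/s")
```

Two such paths at right angles give two wind components, from which speed and direction follow; the accuracy then hinges on how precisely the transit times are detected, which is where received-signal amplitude and gain control matter.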
  • Accepted: 2026-01-12
    With the wide application of Field Programmable Gate Arrays (FPGAs) in high-performance computing, artificial intelligence inference, and 5G communications, the scale of circuit designs and the complexity of timing constraints continue to increase, placing higher demands on the runtime efficiency of Static Timing Analysis (STA). Existing FPGA STA tools predominantly rely on single-core or multi-core Central Processing Unit (CPU) architectures. Although continuous algorithmic optimizations have been made, they still face computational bottlenecks and insufficient memory access efficiency when handling large-scale FPGA designs. In recent years, Graphics Processing Units (GPUs), with their massive parallel computing capabilities, have provided new opportunities for improving FPGA STA performance. However, challenges in memory access patterns under heterogeneous GPU architectures, optimization for timing graph loop detection, and heterogeneous parallel acceleration strategies limit the effectiveness of current GPU-accelerated methods in FPGA STA scenarios. To address these issues, we propose an FPGA STA algorithm accelerated by an efficient heterogeneous parallel strategy. First, targeting the problem of discontinuous memory access and field interleaving in traditional object-oriented data structures under CPU-GPU heterogeneous architectures, a structure-of-arrays (SoA)-based data layout strategy is presented. Combined with data reordering operations, this approach effectively reduces memory access latency and improves bandwidth utilization, providing a data foundation for high-performance FPGA STA computational engines. Second, to overcome the limitations of low efficiency and poor robustness in timing graph loop detection, a parallel loop detection optimization algorithm based on color propagation is designed, enabling efficient acceleration in the preprocessing stage of FPGA STA. 
Furthermore, a task decomposition and timing graph traversal method tailored for CPU-GPU heterogeneous architectures is proposed, achieving efficient acceleration of core STA operations such as delay calculation, levelization, and graph propagation. Finally, experimental results on both the OpenCores and industrial-grade FPGA benchmarks demonstrate that, compared with traditional CPU implementations, the proposed method achieves a runtime speedup of 3.125× to 33.333×, with overall performance surpassing that of the OpenTimer tool. This research provides a practical and feasible approach for efficient timing verification in large-scale FPGA designs.
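Two of the ideas in this abstract, the structure-of-arrays layout and level-by-level graph propagation, can be illustrated on a toy timing graph. The sketch below is sequential Python rather than a CPU-GPU implementation, uses Kahn's algorithm for levelization and loop detection instead of the color propagation method the paper describes, and runs on a made-up four-node graph.

```python
from collections import deque

# Toy timing graph in structure-of-arrays form: one flat list per edge field,
# rather than a list of edge objects. The graph itself is hypothetical.
EDGE_SRC = [0, 0, 1, 2]
EDGE_DST = [1, 2, 3, 3]
EDGE_DELAY = [1.0, 2.0, 3.0, 1.0]
NUM_NODES = 4


def levelize(num_nodes, src, dst):
    """Give each node its longest-path depth from any source (Kahn's algorithm)."""
    indegree = [0] * num_nodes
    fanout = [[] for _ in range(num_nodes)]
    for e, (u, v) in enumerate(zip(src, dst)):
        indegree[v] += 1
        fanout[u].append(e)
    level = [0] * num_nodes
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for e in fanout[u]:
            v = dst[e]
            level[v] = max(level[v], level[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("timing graph contains a loop")
    return level


def propagate_arrival(num_nodes, src, dst, delay, level):
    """Forward-propagate worst-case arrival times in order of source level."""
    arrival = [0.0] * num_nodes
    # Every edge into a node has a source on a strictly lower level, so the
    # source's arrival time is final before any of its fan-out edges is used.
    for e in sorted(range(len(src)), key=lambda i: level[src[i]]):
        arrival[dst[e]] = max(arrival[dst[e]], arrival[src[e]] + delay[e])
    return arrival


levels = levelize(NUM_NODES, EDGE_SRC, EDGE_DST)
arrivals = propagate_arrival(NUM_NODES, EDGE_SRC, EDGE_DST, EDGE_DELAY, levels)
print("levels:", levels)
print("arrival times:", arrivals)
```

Nodes on the same level do not depend on one another, which is what makes the propagation step a natural fit for data-parallel hardware; flat per-field arrays are what let neighbouring threads read neighbouring memory.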
  • 王 少威
    Accepted: 2026-01-12
    Stochastic computing (SC), an unconventional computational paradigm, employs probabilities to represent numerical values. This representation enables complex arithmetic operations to be performed using simple logic gates. This work presents a fast unary median filtering circuit design. The proposed filter utilizes counters to generate stochastic numbers (SNs) and constructs fundamental sorting network units using stochastic correlation logic. A feedback loop driven by the output terminates computations early without consuming additional hardware area, significantly reducing circuit latency. Experimental results demonstrate that the proposed median filter design outperforms existing implementations in both actual bitstream length and energy consumption. Specifically, the proposed 3×3 window median filter circuit achieves a 55.58% reduction in energy. Further validation using median filtering on images corrupted by salt-and-pepper noise confirms the accuracy of the proposed circuit. For a 16-input sorting network application, the proposed design exhibits lower energy consumption when inputs lie within [0, 0.5], achieving up to a 50% reduction in actual bitstream length and energy consumption.
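The sorting units in this abstract rest on a known property of correlated bitstreams: when two unary streams are maximally correlated (all ones first, as produced by comparing values against a shared counter), a bitwise AND yields the minimum and a bitwise OR yields the maximum. The sketch below shows this in software for a three-input median; it is not the paper's circuit, and it omits the early-termination feedback loop. The stream length and input values are arbitrary.

```python
def unary_stream(value, length):
    """Counter-style unary encoding: `value` ones followed by zeros."""
    if not 0 <= value <= length:
        raise ValueError("value must lie between 0 and the stream length")
    return [1 if i < value else 0 for i in range(length)]


def sc_min(a, b):
    """Bitwise AND of two correlated unary streams encodes their minimum."""
    return [x & y for x, y in zip(a, b)]


def sc_max(a, b):
    """Bitwise OR of two correlated unary streams encodes their maximum."""
    return [x | y for x, y in zip(a, b)]


def median3(a, b, c):
    """Median of three streams built from compare-and-swap (min/max) units."""
    lo = sc_min(a, b)
    hi = sc_max(a, b)
    return sc_max(lo, sc_min(hi, c))


def decode(stream):
    """Recover the encoded integer by counting ones."""
    return sum(stream)


LENGTH = 8  # arbitrary stream length for the example
streams = [unary_stream(v, LENGTH) for v in (3, 7, 5)]
result = decode(median3(*streams))
print("median of 3, 7, 5:", result)
```

Because each compare-and-swap unit is just one AND gate and one OR gate, a full sorting network stays small in area; the cost moves into the number of clock cycles, which is the quantity an early-termination scheme targets.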
  • 林 晓会
    Accepted: 2026-01-09
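    The demanding application scenarios of CPLD devices impose high-reliability test requirements. To address the cumbersome procedure and low efficiency of testing CPLD devices through a host computer, this paper proposes a logic-analyzer-based method for generating CPLD configuration vectors. Taking a domestic CPLD as an example, the method uses a logic analyzer to capture JTAG download configuration data in real time, decodes the protocol, and analyzes and summarizes the decoded configuration data in depth. Configuration vectors are then written in the standard SVF statement format, followed by vector transcoding and online configuration test verification on ATE. The test results confirm the correctness of the generated configuration vectors and the feasibility of the method, which offers important guidance for subsequent automated mass-production testing of CPLD devices on ATE.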
  • Accepted: 2026-01-08
    Continuous-flow microfluidic biochips (CFMBs) are widely used in biochemical analysis due to their high precision and reliability. CFMBs consist of a flow layer and a control layer. To manage complex logic in the control layer with limited control pins, multiplexers are widely adopted. However, the physical design of multiplexers—specifically the co-optimization of valve placement and channel routing—remains largely unexplored. To address this, this paper proposes a co-optimization method based on Discrete Particle Swarm Optimization (DPSO). First, valve placement regions are constrained via preprocessing to ensure routing feasibility. Second, a DPSO framework encodes placement into particle positions and utilizes an embedded A* router to provide routing cost as fitness, establishing a closed-loop feedback mechanism between placement and routing. Third, X-architecture routing is introduced to expand the solution space and minimize wirelength. Experimental results demonstrate that the proposed method reduces the average control channel length by 8.27%. Notably, the X-architecture contributes a 5.01% improvement over traditional R-type routing, significantly enhancing both layout quality and routing efficiency.