Current Issue

  • Special Topic of Energy-efficient Dedicated Chips for Intelligent Robots
    LIU Bingqiang, SHEN Zixuan, WANG Jipeng, XIAO Jian, TAN Yulong, HE Zaisheng, XU Dengke, WANG Ke, QU Weixin, WANG Chao, SUN Lining

    Robots represent a revolutionary engine of new productive forces, reshaping human life and work. Simultaneous Localization And Mapping (SLAM) technology enables robots to navigate autonomously in unknown environments and construct maps of their surroundings, serving as the cornerstone for the intelligence of autonomous mobile robots. However, because SLAM algorithms are complex and computationally intensive, implementations based on general-purpose CPU chips suffer from long delays and high power consumption, and therefore fail to meet the real-time and power requirements of autonomous mobile robots, especially small, micro, and nano ones. Consequently, the design of specialized hardware accelerator chips for computation-intensive SLAM algorithms has received considerable attention from both the academic and industrial communities in recent years. This article starts with the basic concepts and application scenarios of SLAM technology and highlights the necessity of hardware acceleration for SLAM algorithms. It then reviews the current research status and development trends from the perspectives of algorithms and dedicated chip design, and discusses the technical challenges and solutions related to SLAM-dedicated chips, providing recommendations for future development.

  • Special Topic of Energy-efficient Dedicated Chips for Intelligent Robots
    CHEN Zhuoyu, AN Fengwei

    With the rapid development of the robotics industry, robotic technology has emerged as a new driving force for enhancing productivity, particularly highlighting the importance of technologies such as 3D reconstruction and obstacle avoidance navigation. However, active 3D imaging technologies based on Time of Flight (ToF) and structured light suffer from limitations such as low resolution, lack of original color information, and susceptibility to ambient light interference, leading to suboptimal performance. Therefore, passive binocular stereo vision sensors, which can output dense depth and color information (RGB-D) in real time, have been widely applied in fields such as autonomous robots, automobiles, and drones. Nonetheless, binocular stereo vision technology, which obtains depth information by computing disparity in a way that mimics human binocular vision, is computationally intensive and reliant on general-purpose computing platforms. This results in high energy consumption and latency for binocular stereo vision processors, limiting the technology's application in high-speed scenarios, small robots, and edge computing. In recent years, binocular stereo vision processors integrated with hardware accelerators for stereo vision algorithms have gained significant attention in both academia and industry. This article first explains the theoretical foundation of binocular 3D stereo vision and gives examples of its application in robotic stereo vision. It then introduces the structural components of binocular stereo vision processors, including core parts such as image acquisition, camera calibration and correction, and stereo matching. For the convenience of stereo vision hardware developers, this article reviews the basic concepts, research status, challenges, and future trends of the core components of the binocular stereo vision system, with a special focus on comparing new hardware computing architectures.
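
    The disparity-to-depth relationship behind stereo matching can be sketched as follows. This is a minimal illustration and not any processor reviewed in the article: it assumes a rectified image pair, a sum-of-absolute-differences (SAD) cost evaluated on a single scanline, and hypothetical names (`sad_disparity`, `depth_from_disparity`, `focal_px`, `baseline_m`).

```python
def sad_disparity(left_row, right_row, x, half_window, max_disparity):
    """Pick the disparity d whose right-image window best matches the
    left-image window centred at x (lowest sum of absolute differences).
    Assumes rectified rows, so the match lies on the same scanline."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break  # candidate window would leave the image
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulation for a rectified pair: Z = f * B / d.
    Zero disparity corresponds to a point at infinity."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Toy scanlines: the right view sees the same pattern shifted by 2 pixels.
left = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
right = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
d = sad_disparity(left, right, x=6, half_window=1, max_disparity=4)
z = depth_from_disparity(d, focal_px=700.0, baseline_m=0.1)
```

    Because the inner loop runs for every pixel and every candidate disparity, the cost grows with image size times disparity range, which is the workload that dedicated stereo hardware parallelizes.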

  • Special Topic of Energy-efficient Dedicated Chips for Intelligent Robots
    MO Xiaorui, ZHANG Weiyi, NIAN Cheng, GUO Yushi, NIU Liting, ZHANG Baiwen, ZHANG Chun

    In Visual Simultaneous Localization and Mapping (V-SLAM) systems, Bundle Adjustment (BA) plays a crucial role in optimizing camera parameters and the positions of 3D points. However, due to the high computational complexity and real-time requirements of BA, traditional computing platforms struggle to meet efficiency demands. Recently, the introduction of dedicated hardware accelerators has provided new solutions for BA optimization. This paper reviews the current research status and development trends of chips dedicated to BA optimization. It covers the application scenarios, definitions, and basic principles of BA algorithms; the acceleration of BA on Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), and Graphics Processing Units (GPUs); and the development trends of these accelerators. Furthermore, this paper discusses the technical challenges in implementing BA accelerators and anticipates future development directions. By summarizing current research advancements, this review aims to provide guidance and insights for future studies on BA optimization-specific chips.
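
    The quantity that BA minimizes, the total squared reprojection error over all observations, can be sketched as below. This is a simplified illustration under stated assumptions and not the formulation of any reviewed accelerator: a pinhole camera with identity rotation, a camera described only by its centre, and hypothetical names (`project`, `reprojection_cost`, `camera_center`).

```python
def project(point_world, camera_center, fx, fy, cx, cy):
    """Pinhole projection of a 3D point into pixel coordinates.
    Assumption: the camera has identity rotation, so the point in the
    camera frame is simply point_world - camera_center."""
    x = point_world[0] - camera_center[0]
    y = point_world[1] - camera_center[1]
    z = point_world[2] - camera_center[2]
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_cost(observations, points, camera_center, fx, fy, cx, cy):
    """Sum of squared pixel residuals between observed and projected points.
    observations: list of (point_index, (u_observed, v_observed))."""
    total = 0.0
    for idx, (u_obs, v_obs) in observations:
        u, v = project(points[idx], camera_center, fx, fy, cx, cy)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total


points = [(1.0, 2.0, 10.0)]
camera = (0.0, 0.0, 0.0)
exact = reprojection_cost([(0, (60.0, 60.0))], points, camera, 100.0, 100.0, 50.0, 40.0)
noisy = reprojection_cost([(0, (61.0, 58.0))], points, camera, 100.0, 100.0, 50.0, 40.0)
```

    A full BA solver would adjust both `points` and the camera parameters to reduce this cost, typically with a nonlinear least-squares method; the repeated evaluation of residuals and their derivatives is where the computational load arises.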

  • Special Topic of Energy-efficient Dedicated Chips for Intelligent Robots
    WU Lizhou, ZHU Haozhe, CHEN Chixiao

    Neural Radiance Fields (NeRF) is an emerging method for reconstructing 3D scenes, garnering significant attention for its potential applications in the field of robotics. NeRF uses Multi-Layer Perceptrons (MLPs) to learn 3D scene features, achieving high-fidelity image rendering and providing a foundation for navigation, localization, and perception in complex environments. Its core processes, including ray sampling, feature extraction, and volumetric rendering, are computationally intensive and involve irregular memory access patterns, which limits deployment on existing hardware platforms, especially edge devices. To advance the practical application of NeRF technology, new hardware architectures and hardware-software co-optimization solutions are necessary. This review systematically elucidates the principles and evolution of NeRF technology and explores the performance bottlenecks encountered during its hardware execution. It then surveys classic NeRF hardware accelerators in detail, summarizing three main optimization directions (image similarity optimization, spatial sparsity optimization, and memory access optimization) and analyzing the commonalities and differences among the various techniques. Additionally, considering applications such as SLAM and AIGC, the review examines the technical limitations and challenges of current NeRF accelerators in handling open scene tasks, particularly in terms of scalability and storage constraints. Finally, the review offers suggestions for future development to inspire further applications and optimization of NeRF hardware accelerators.
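
    The volumetric rendering step can be sketched as the standard alpha-compositing sum along one ray. This is a minimal illustration, not an accelerator design from the review: the per-sample densities and colors, which a NeRF MLP would normally predict, are passed in directly, and the name `composite_ray` is hypothetical.

```python
import math


def composite_ray(sigmas, colors, deltas):
    """Composite samples along one ray, front to back.
    sigmas: volume density at each sample
    colors: (r, g, b) at each sample
    deltas: distance between consecutive samples
    Returns the rendered color and the remaining transmittance."""
    transmittance = 1.0  # fraction of light not yet absorbed
    rgb = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            rgb[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return rgb, transmittance


# An empty sample followed by a fully opaque green sample.
rgb, remaining = composite_ray(
    sigmas=[0.0, 1e9],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[1.0, 1.0],
)
```

    Samples with near-zero density contribute nothing to the sum, which is one intuition for why skipping empty space is attractive as an optimization.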

  • Special Topic of Energy-efficient Dedicated Chips for Intelligent Robots
    QI Xiuyuan, LIU Ye, HAO Shuang, ZHOU Jun

    With the continuous iteration and development of computer vision technology, intelligent applications and devices centered on computer vision are playing an increasingly crucial role in daily life and work. Among these, visual Simultaneous Localization and Mapping (SLAM) technology finds extensive applications in fields such as robotics, drones, and autonomous driving. These fields critically rely on visual SLAM to provide accurate localization information for precise mapping and autonomous navigation. However, because visual SLAM algorithms inherently involve high computational complexity and significant data dependency, traditional hardware platforms (CPU or GPU) struggle to meet the real-time and low-power requirements of edge applications. This limitation has become a key obstacle to the widespread adoption of visual SLAM. To address this issue, this paper proposes a high-efficiency domain-specific accelerator for ORB feature extraction in SLAM, designed through a co-optimization strategy of algorithms and hardware. Various hardware design techniques are employed to enhance computational performance and energy efficiency, including multi-level parallel computing based on decoupling data dependencies, data storage technology based on multi-size buckets, and pixel-level symmetric lightweight descriptor generation and direction calculation strategies. The proposed visual SLAM accelerator was tested and verified on the Xilinx ZCU104. Its accuracy is within 5% of that of the ORB-SLAM2 algorithm, and its frame rate reaches 108 fps. Compared to other hardware accelerators of the same period, the lookup table (LUT) usage is reduced by 32.7% and the flip-flop (FF) usage is reduced by 41.17%, while the frame rate is increased by 1.4x and 0.74x.

  • Special Topic of Energy-efficient Dedicated Chips for Intelligent Robots
    GAO Jinyang, FAN Zhendong, BAO Minjie, WANG Ke, LI Ruifeng, KANG Peng

    The combination of robots and artificial intelligence will lead the transformation of new intelligent technologies. As a crucial component of artificial intelligence, neural networks demonstrate immense potential in robotic perception. However, the increasing complexity of AI algorithms and the prominent energy efficiency bottleneck of general-purpose processors such as CPUs pose significant challenges. Traditional processing chips fail to effectively accommodate the inference computing tasks of large-scale neural networks. In recent years, robotic AI chips have emerged as an ideal choice for deploying neural networks in robot systems due to their high computing performance and low power consumption, attracting widespread attention. Focusing on robotic applications, this article surveys the current status of AI algorithms, reviews the latest advances in AI chip design technology, outlines technical difficulties and feasible technical routes, and discusses the technical trends and challenges in the design of robotic AI chips.

  • Research Paper
    WANG Biao, ZHENG Hongbo, ZHANG Haoping, HE Jie, YANG Fan, TU Buhua

    This paper presents a design scheme for an efficient instrument management system centered on the AT697, a 32-bit SPARC V8 architecture processor from ATMEL (acquired by Microchip), with interface expansion provided by an FPGA. The design optimizes the monitoring and control management of complex satellite payload systems, particularly in the areas of satellite communication, standalone equipment control, payload thermal management, time calibration, command processing, and interface control. The system features rapid command response, high-reliability communication capabilities, and efficient parallel data processing, thereby enhancing overall system stability, reliability, and operational efficiency.

  • Research Paper
    LIU Hailiang, LYU Hui, YANG Wanyun

    SoC chips whose JTAG/cJTAG interfaces are left enabled during the mass production stage are exposed to the risk of malicious attacks. Conversely, when the JTAG/cJTAG interfaces are simply and permanently disabled through OTP/eFuse, it becomes difficult to locate problems during mass production, and debugging means are limited when the CPU pointer runs away. This article designs an SoC security chip debug system based on permission management. Compared to traditional debugging methods, the design makes two modifications. For JTAG/cJTAG debugging, a permission control bit design, a verification password design, and a permission comparison design have been added while retaining traditional debugging methods. For UART debugging, a UART access register bus design has been added while retaining traditional debugging methods, and the UART access register function can be disabled through OTP/eFuse. The system not only provides problem analysis methods for CPU hangs and runaway pointers in SoC chips, but also provides secure and convenient JTAG/cJTAG/UART debugging for the SoC chip mass production stage.