Optimizations of TFLite-micro Memory Management and Allocation Policy

Xu Peng, Song Yan

Integrated Circuits and Embedded Systems ›› 2022, Vol. 22 ›› Issue (10): 11-15.

Special Topic: Embedded Artificial Intelligence

Abstract

TFLite-micro (TFLm) is a popular neural-network inference framework for microcontrollers. This paper analyzes TFLm's memory management mechanism and allocation strategy during model inference, along with their limitations. Currently, TFLm supports only a single block of memory (the Tensor Arena) for the intermediate results that model inference requires. We extend TFLm's memory management to support multiple discontiguous memory blocks with widely differing access performance, and additionally assign the same memory to tensors whose lifetimes allow them to overlap. These improvements both steer more data traffic into fast on-chip memory and reduce peak memory usage. Experiments on the i.MX RT1170 show that, on microcontrollers with fast on-chip RAM (typified by DTCM), the proposed strategy greatly improves the utilization of fast on-chip RAM, significantly alleviates the memory-bandwidth bottleneck, and cuts inference time by more than half.
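The multi-bank, lifetime-aware placement the abstract describes can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation and not TFLm's actual API: `Tensor`, `Bank`, and `place_tensors` are invented names. Banks are tried in speed order (fast DTCM first), and tensors whose lifetimes never overlap may share the same offset, which lowers peak usage:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a tensor has a byte size and a [first_use, last_use]
// lifetime expressed in operator indices of the inference graph.
struct Tensor {
    std::size_t size;
    int first_use;
    int last_use;
    int bank = -1;           // which memory bank it was placed in
    std::size_t offset = 0;  // byte offset inside that bank
};

struct Bank {
    std::size_t capacity;  // list banks fastest first, e.g. DTCM then SDRAM
};

// Two tensors conflict only if they are live at the same time.
static bool lifetimes_overlap(const Tensor& a, const Tensor& b) {
    return a.first_use <= b.last_use && b.first_use <= a.last_use;
}

// Greedy placement: for each tensor, try banks in order of speed; within a
// bank, stack the tensor just above the tensors it actually conflicts with,
// so non-overlapping tensors reuse the same bytes.
bool place_tensors(std::vector<Tensor>& tensors, const std::vector<Bank>& banks) {
    for (auto& t : tensors) {
        bool placed = false;
        for (int b = 0; b < static_cast<int>(banks.size()) && !placed; ++b) {
            std::size_t top = 0;  // lowest offset free of live conflicts
            for (const auto& other : tensors) {
                if (&other == &t || other.bank != b) continue;
                if (lifetimes_overlap(t, other))
                    top = std::max(top, other.offset + other.size);
            }
            if (top + t.size <= banks[b].capacity) {
                t.bank = b;
                t.offset = top;
                placed = true;
            }
        }
        if (!placed) return false;  // no bank can hold this tensor
    }
    return true;
}
```

With a small fast bank and a larger slow one, two tensors that are never live simultaneously land at the same offset in the fast bank, while a third tensor that overlaps both spills to the slower bank.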

Key words

TFLite-micro / TFLm / TinyML / Tensor Arena / i.MX RT1170 / DTCM

Cite this article

Xu Peng, Song Yan. Optimizations of TFLite-micro Memory Management and Allocation Policy[J]. Integrated Circuits and Embedded Systems, 2022, 22(10): 11-15.
CLC number: TP872

