Abstract
This paper briefly introduces tiny machine learning: its definition, its advantages, and its current problems. It then discusses open problems and surveys the state of research in three areas: dedicated and general-purpose deployment methods for tiny machine learning, microprocessor design based on ARM Cortex-M or RISC-V, and deployment algorithms based on neural architecture search. Looking ahead, the paper argues that a full-featured tiny machine learning deployment framework is needed, that hardware research will increasingly build microprocessors from RISC-V cores combined with hardware neural network acceleration units, and that neural architecture search must become more efficient and less time-consuming. Finally, building on this discussion, it offers some thoughts on how to improve and develop the tiny machine learning ecosystem.
Key words
tiny machine learning /
microcontroller /
neural architecture search
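Funding
*2021 Jiangmen Innovation Practice Postdoctoral Research Funding Project (JMBSH2021B04); Key-Area Research and Development Program of Guangdong Province (2020B0101030002).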