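This paper first introduces the current development of object detection and classifies the existing approaches. It then describes the YOLO series of algorithms, in particular the core mechanisms of YOLO, such as the loss function, network structure, optimization strategy, k-means clustering, and batch normalization. Next, it introduces application scenarios of YOLO, such as pedestrian detection, industry, and medicine. Finally, it summarizes the characteristics of the YOLO series and directions for future improvement. This paper offers some guidance for research on deep-learning-based object detection systems.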
Abstract
This paper first introduces the development and classification of single-stage object detection. It then presents the YOLO series of algorithms, especially the core mechanisms of YOLO, such as the loss function, network structure, optimization strategy, k-means clustering, and batch normalization. This is followed by an introduction to YOLO's application scenarios, such as pedestrian detection, industry, and medicine. Finally, the characteristics of the YOLO series are summarized and directions for future improvement are identified. This paper provides some guidance for the study of deep-learning-based object detection.
Key words
deep learning /
object detection /
YOLO algorithm /
convolutional neural network
Funding
*Chongqing Postgraduate Joint Training Base Project (JDLHPYJD2018003).