A Review and Outlook of Research on DETR-Based Object Detection Algorithms

Integrated Circuits and Embedded Systems, 2023, Vol. 23, Issue 5: 40-42.

Technical Perspectives

Review of Target Detection Algorithm Research Based on DETR

Li Xiaojun, Liu Ying

Abstract

Object detection is the foundation and prerequisite of many computer vision tasks and a core problem in computer vision research. Before the Transformer, most object detection algorithms were based on convolutional neural networks. Following the Transformer's great success in natural language processing, researchers began applying it to object detection, producing many algorithms, led by DETR, that have achieved good results. This paper first introduces the Transformer and its applications in computer vision, then describes the DETR algorithm and its improved variants, and finally discusses the outlook for DETR in object detection.

Cite this article

Li Xiaojun, Liu Ying. Review of Target Detection Algorithm Research Based on DETR[J]. Integrated Circuits and Embedded Systems. 2023, 23(5): 40-42

CLC number: TP391.4
