UAV Image Object Detection Based on Attention Mechanism and Dilated Convolution
Research Article
Open Access
CC BY

Shijie Lyu 1*
1 Georgia Institute of Technology, North Avenue, Atlanta, GA, 30332, United States of America
*Corresponding author: slyu41@gatech.edu
Published on 4 July 2025
ACE Vol.173
ISSN (Print): 2755-2721
ISSN (Online): 2755-273X
ISBN (Print): 978-1-80590-231-7
ISBN (Online): 978-1-80590-232-4

Abstract

Existing algorithms for object detection in unmanned aerial vehicle (UAV) images often suffer from low detection accuracy on small objects and missed detections of multi-scale objects. To address these issues, this paper proposes a UAV image object detection algorithm that combines a channel attention mechanism with parallel-structured dilated convolution feature fusion. To strengthen feature representation along the channel dimension and to enlarge the receptive field, the ResNet50 backbone is redesigned to incorporate the Squeeze-and-Excitation Network (SENet) and a Parallel-Structured Dilated Convolution Feature Fusion Network (PSDCFFN). In addition, Region of Interest (ROI) Align is employed, and the anchor sizes of the Region Proposal Network (RPN) are optimized with K-Means clustering, to reduce coordinate deviations during bounding-box regression. Experimental results demonstrate that the proposed algorithm significantly improves object detection accuracy in UAV images: the mean Average Precision (mAP) reaches 92.52% on the RSOD-Dataset and 98.07% on a custom UAV image dataset.
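
The K-Means anchor optimization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric, the number of clusters, or the initialization, so everything here is an assumption: the sketch uses 1 - IoU between (width, height) pairs as the distance, k = 3, a deterministic area-quantile initialization, and hypothetical box sizes that are not taken from the paper's datasets. The function names are illustrative only.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def init_centers(boxes, k):
    """Assumed initialization: pick k boxes at evenly spaced area quantiles."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    n = len(ordered)
    return [ordered[int((i + 0.5) * n / k)] for i in range(k)]


def kmeans_anchors(boxes, k, iters=100):
    """Cluster (width, height) pairs into k anchor shapes.

    Distance is 1 - IoU (assumed), so each box joins the center
    with which it has the highest IoU.
    """
    centers = init_centers(boxes, k)
    for _ in range(iters):
        # Assignment step: group boxes by their best-overlapping center.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centers[i]))
            clusters[best].append(box)
        # Update step: mean width and height per cluster.
        # An empty cluster keeps its previous center.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centers.append((w, h))
            else:
                new_centers.append(center)
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    # Return anchors ordered from smallest to largest area.
    return sorted(centers, key=lambda c: c[0] * c[1])


# Hypothetical ground-truth box sizes in pixels (illustration only).
boxes = [(12, 10), (14, 11), (13, 12),
         (48, 50), (52, 47), (50, 55),
         (120, 60), (130, 64), (125, 58)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

With these hypothetical sizes, the sketch settles on three anchor shapes of roughly 13 x 11, 50 x 50.7, and 125 x 60.7 pixels, one per size group. In a detector such as Faster R-CNN, shapes obtained this way would replace the default RPN anchor scales and aspect ratios so that the anchors start closer to the objects actually present in the training data.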

Keywords:

UAV Image, Faster R-CNN, Attention Mechanism, Dilated Convolution, Feature Fusion, Object Detection

Lyu, S. (2025). UAV Image Object Detection Based on Attention Mechanism and Dilated Convolution. Applied and Computational Engineering, 173, 15-21.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 7th International Conference on Computing and Data Science

ISBN: 978-1-80590-231-7 (Print) / 978-1-80590-232-4 (Online)
Editor: Marwan Omar
Conference website: https://2025.confcds.org/
Conference date: 25 September 2025
Series: Applied and Computational Engineering
Volume number: Vol.173
ISSN: 2755-2721 (Print) / 2755-273X (Online)