Feng et al. (2025) Field-deployable lightweight YOLOv8n for real-time detection and counting of Maize seedlings using UAV RGB imagery
Identification
- Journal: Frontiers in Plant Science
- Year: 2025
- Date: 2025-09-08
- Authors: Pengbo Feng, Zhigang Nie, Guang Li
- DOI: 10.3389/fpls.2025.1639533
Research Groups
- College of Information Science and Technology, Gansu Agricultural University, Lanzhou, China
- State Key Laboratory of Aridland Crop Science, Gansu Agricultural University, Lanzhou, Gansu, China
- Hexi University, Zhangye, Gansu, China
Short Summary
This study proposes YOLOv8-FLY, a lightweight deep learning model for real-time detection and counting of maize seedlings in UAV RGB imagery. The model achieves 96.5% mAP@0.5 while substantially reducing model size, parameter count, and computational cost, making it suitable for resource-constrained edge devices.
Objective
- To develop a lightweight, real-time YOLOv8n-based maize seedling detection algorithm (YOLOv8-FLY) for UAV RGB imagery, addressing issues of large model size, high computational cost, and limited real-time performance in existing models, while maintaining high detection accuracy.
Study Configuration
- Spatial Scale: Experimental site at Huarui Ranch, Minle County, Gansu Province, China (38°44′3.32″N, 100°42′5.03″E; 1,683 m above sea level). UAV flight altitude of 3 m, resulting in a ground sampling distance (GSD) of approximately 0.07 cm/pixel. Images have a resolution of 5280 × 2970 pixels, with target maize seedlings typically ranging from 106 to 370 pixels in width and 59 to 297 pixels in height.
- Temporal Scale: Data collected on 3 and 4 May 2024, between 9:00 and 14:00, targeting maize seedlings at the two-leaf-one-heart stage (approximately 20 days after sowing). Data collection for the experimental plot took approximately 45–60 minutes.
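The reported GSD can be sanity-checked from the flight altitude with the standard pinhole relation. The sensor width (~17.3 mm) and focal length (~12.29 mm) below are assumed values for the DJI Mavic 3 Classic's 4/3 CMOS camera, not figures taken from the paper:

```python
# Rough ground-sampling-distance (GSD) check for the reported 3 m flight
# altitude. Sensor width and focal length are assumed camera specs for the
# DJI Mavic 3 Classic, not values stated in the summary above.
def gsd_cm_per_pixel(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """GSD = (sensor width x altitude) / (focal length x image width)."""
    gsd_mm = (sensor_width_mm * altitude_m * 1000.0) / (focal_length_mm * image_width_px)
    return gsd_mm / 10.0  # mm -> cm

gsd = gsd_cm_per_pixel(altitude_m=3, sensor_width_mm=17.3,
                       focal_length_mm=12.29, image_width_px=5280)
print(f"{gsd:.3f} cm/pixel")  # ~0.08, in line with the ~0.07 cm/pixel reported
```

The small gap between ~0.08 and the reported 0.07 cm/pixel is consistent with rounding and the assumed sensor parameters.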
Methodology and Data
- Models used:
- Base model: YOLOv8n
- Proposed model: YOLOv8-FLY, incorporating:
- Rep_HGBlock: A lightweight multi-scale backbone module designed by fusing RepConv with HGNetV2.
- BiFPN (Bidirectional Feature Pyramid Network): Introduced into the neck network for enhanced multi-scale information fusion.
- TDADH (Task Dynamically Aligned Detection Head): A lightweight detection head based on GroupNorm, shared convolution, and task-decoupled interaction mechanisms.
- Visualization technique: Grad-CAM++ for model interpretability.
- Comparison models: YOLOv3, YOLOv5s, YOLOv6, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x, YOLOv9s, YOLOv10s, Faster R-CNN, SSD.
- Data sources:
- Custom-built maize seedling dataset collected using a DJI Mavic 3 Classic drone with an RGB camera.
- 1,213 original images (5280 × 2970 pixels) collected from a drip-irrigated field, with 993 high-quality images selected.
- 21,974 manually annotated instances using LabelImg 1.8.6 in YOLO format.
- Dataset split: 70% for training, 20% for validation, and 10% for testing.
- Data augmentation techniques: Horizontal flip, vertical flip, 90-degree rotation, Gaussian noise, salt and pepper noise, and brightness adjustment, generating 65,600 augmented images (one representative image retained per original ID).
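The six augmentations listed above can be sketched with plain NumPy operations on uint8 H×W×3 arrays. The noise and brightness parameters here are placeholders, since the paper's exact settings are not given in this summary; for detection data the bounding boxes would also need to be transformed under the geometric augmentations, which is omitted here:

```python
import numpy as np

# Sketch of the six listed augmentations; parameter values are assumptions.
def h_flip(img):   # horizontal flip
    return img[:, ::-1]

def v_flip(img):   # vertical flip
    return img[::-1, :]

def rot90(img):    # 90-degree rotation (swaps height and width)
    return np.rot90(img)

def gaussian_noise(img, sigma=10.0, seed=0):
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def salt_pepper(img, amount=0.01, seed=0):
    rng = np.random.default_rng(seed)
    out = img.copy()
    mask = rng.random(img.shape[:2])      # one draw per pixel
    out[mask < amount / 2] = 0            # pepper
    out[mask > 1 - amount / 2] = 255      # salt
    return out

def brightness(img, factor=1.2):
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

img = np.zeros((59, 106, 3), dtype=np.uint8)  # smallest seedling size reported
augmented = [f(img) for f in (h_flip, v_flip, rot90,
                              gaussian_noise, salt_pepper, brightness)]
```

Each function returns a new array, so the original image is left untouched and one original can fan out into several augmented copies.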
Main Results
- The YOLOv8-FLY model achieved a detection accuracy (mAP@0.5) of 96.5%.
- Model weight size was reduced to 3.5 MB, a 43% reduction compared to the original YOLOv8n (6.3 MB).
- The number of parameters was reduced to 1.58 M, a 47% reduction compared to the original YOLOv8n (3.01 M).
- Computational FLOPs were reduced to 7.4 G, an 8.6% reduction compared to the original YOLOv8n (8.1 G).
- The inference speed (FPS) was 146.3, a minor reduction of 2.4% from the original YOLOv8n (149.98 FPS).
- Grad-CAM++ visualization demonstrated that YOLOv8-FLY exhibited improved feature fusion and more accurate attention focusing on target objects compared to the original model.
- The developed real-time detection system, integrating YOLOv8-FLY, supports image and video stream processing, displaying real-time plant positions, target counts, and frame rates, and is compatible with live UAV camera input.
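The reported reductions relative to YOLOv8n can be verified with one line of arithmetic per metric; the weight-size figure computes to roughly 44%, close to the 43% quoted above:

```python
# Check the reported reductions relative to the YOLOv8n baseline.
def reduction_pct(baseline, new):
    return 100.0 * (baseline - new) / baseline

print(f"weights: {reduction_pct(6.3, 3.5):.1f}%")     # ~44.4% (reported 43%)
print(f"params:  {reduction_pct(3.01, 1.58):.1f}%")   # ~47.5% (reported 47%)
print(f"FLOPs:   {reduction_pct(8.1, 7.4):.1f}%")     # ~8.6%
print(f"FPS:     {reduction_pct(149.98, 146.3):.1f}%")  # ~2.5% (reported 2.4%)
```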
Contributions
- Proposed YOLOv8-FLY, a novel lightweight object detection model specifically tailored for real-time maize seedling monitoring using UAV RGB imagery in complex field environments.
- Designed Rep_HGBlock, a lightweight multi-scale backbone module, by fusing RepConv with HGNetV2 to enhance feature representation for small targets while reducing network complexity.
- Integrated BiFPN into the neck layer to improve multi-scale information fusion and reduce computational burden, enhancing robustness in UAV images with occlusions and lighting changes.
- Developed TDADH, a lightweight detection head based on GroupNorm, shared convolution, and task-decoupled interaction mechanisms, to achieve high detection capability with fewer parameters, facilitating edge device deployment.
- Constructed a comprehensive maize seedling dataset using UAV-acquired RGB imagery from a drip-irrigated field, specifically for small and densely planted targets.
- Developed and validated a practical real-time detection and counting system, demonstrating the model's deployability and efficiency for early-stage crop management and precision agriculture.
- Achieved a superior balance of detection accuracy, model size, computational cost, and inference speed compared to various mainstream lightweight detectors.
Funding
- 2024 Central Guided Local Science and Technology Development Funds Project (Grant No. 24ZYQA023)
- Gansu Provincial Industry Support Programme Project (2025CYZC-042)
- Gansu Provincial Key Science and Technology Special Project (24ZD13NA019)
Citation
@article{Feng2025Fielddeployable,
author = {Feng, Pengbo and Nie, Zhigang and Li, Guang},
title = {Field-deployable lightweight YOLOv8n for real-time detection and counting of Maize seedlings using UAV RGB imagery},
journal = {Frontiers in Plant Science},
year = {2025},
doi = {10.3389/fpls.2025.1639533},
url = {https://doi.org/10.3389/fpls.2025.1639533}
}
Original Source: https://doi.org/10.3389/fpls.2025.1639533