Abstract:
Objectives YOLOv8 is widely used for object detection because of its high computational efficiency and strong detection performance. When applied to ship detection in complex maritime environments, however, it has several limitations: insufficient robustness to sea clutter and wake interference, inadequate multi-scale feature extraction, and a relatively large parameter count that restricts deployment on resource-constrained devices. These problems are most pronounced in scenes with small targets, dense target distributions, and complex backgrounds. To address them, this paper proposes SAP-YOLOv8, a lightweight, feature-enhanced ship detection algorithm that aims to improve detection accuracy, robustness, and efficiency under complex sea conditions while balancing performance against computational cost.
Methods First, to strengthen feature representation, spatial depthwise convolution and dilated convolution are integrated into standard convolution to form the SDGD module. By combining spatial channel decomposition with an expanded receptive field, the SDGD module suppresses sea clutter interference while capturing richer contextual information, which improves multi-scale feature extraction, especially for small and weak targets, and helps the network adapt to variations in target scale and background complexity. Second, the AIFI module from the RT-DETR framework replaces the original SPPF module. Through attention-based intra-scale feature interaction, it strengthens global context modeling and improves the network's ability to capture long-range dependencies and complex spatial relationships in challenging maritime scenes. Finally, to improve computational efficiency, a new C3k2_PCCA module is designed based on partial convolution (PConv) and coordinate attention (CA). By reducing redundant feature computation and embedding spatial coordinate information into channel attention, this module lowers parameter redundancy and computational complexity while preserving discriminative feature representation, making the model lighter and faster at inference.
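As a rough illustration of why these building blocks affect receptive field and model size, the following Python sketch computes the effective kernel size of a dilated convolution and the weight counts of standard, depthwise, and partial convolutions. It is not the SDGD or C3k2_PCCA implementation, whose internal layout is not specified here; the function names, the channel width of 256, and the PConv channel ratio of 1/4 are assumptions chosen only for illustration.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Effective side length of a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a 2D convolution (bias terms ignored)."""
    return (c_in // groups) * c_out * k * k


def pconv_params(channels: int, k: int, ratio: float = 0.25) -> int:
    """Weight count of a partial convolution that convolves only a
    fraction `ratio` of the channels and passes the rest through unchanged.
    The default ratio of 1/4 is an assumption for this example."""
    c_p = int(channels * ratio)
    return conv_params(c_p, c_p, k)


# Hypothetical layer width and kernel size, chosen for illustration only.
C, K = 256, 3

standard = conv_params(C, C, K)             # dense 3x3 convolution
depthwise = conv_params(C, C, K, groups=C)  # one 3x3 filter per channel
partial = pconv_params(C, K)                # 3x3 convolution on 64 of 256 channels

print("effective kernel, dilation 2:", effective_kernel(K, 2))
print("standard / depthwise / partial weights:", standard, depthwise, partial)
print("partial-to-standard ratio:", partial / standard)
```

Under these assumptions, dilation widens the area a 3x3 kernel covers without adding weights, while depthwise and partial convolutions cut the weight count relative to a dense convolution of the same width.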
Results Experiments on the public HRSID dataset show that the proposed SAP-YOLOv8 keeps the parameter count nearly the same as that of the baseline YOLOv8 model while improving precision, recall, and mean average precision (mAP) by 1.5%, 0.7%, and 1.6%, respectively. Compared with several representative classical detection algorithms, the proposed method also gives more stable detections in complex maritime scenes, particularly under strong background interference and large scale variation, indicating improved robustness and generalization. It achieves these gains while maintaining competitive efficiency, which makes it suitable for practical deployment.
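For reference, precision and recall are computed from the counts of true positives, false positives, and false negatives at a chosen confidence and IoU threshold. The sketch below uses made-up counts, not HRSID results, and the function name is hypothetical.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of predicted ships that are correct.
    recall    = TP / (TP + FN): share of ground-truth ships that are found.
    Both are defined as 0.0 when their denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only: 90 correct detections, 10 false alarms, 30 misses.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.3f}, recall={r:.3f}")
```

Mean average precision builds on the same quantities by averaging the area under the precision-recall curve over classes; it is not computed in this sketch.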
Conclusions The proposed SAP-YOLOv8 algorithm improves detection accuracy and computational efficiency and shows stronger robustness and adaptability in complex sea environments. These advantages indicate promising practical value for real-world ship detection, especially where computational resources are limited and environmental complexity is high.