Emanuel Di Nardo works at the University of Napoli, Parthenope.

Posts

G-Litter Marine Litter Dataset Augmentation with Diffusion Models and Large Language Models on GPU Acceleration

Marine litter detection is crucial for environmental monitoring, yet class imbalance in existing datasets limits how accurately models identify different types of waste. This paper presents an efficient data augmentation pipeline that combines generative diffusion models (e.g., Stable Diffusion) and Large Language Models (LLMs) to expand G-Litter, a marine litter dataset designed for autonomous detection in heterogeneous environments. Using scalable diffusion models for image generation and Alpaca LLMs for diverse prompt generation, our approach augments underrepresented classes with over 200 additional images per class, significantly improving the dataset’s balance. Training YOLOv8 for object detection on the augmented G-Litter dataset increased detection performance, improving recall by 7.82% and mAP50 by 3.87% compared with the baseline. The models were also tested on real videos captured during simulated missions, where they showed a superior ability to detect submerged objects in dynamic scenarios. This study highlights the potential of combining generative AI with HPC resources to automate data augmentation on large-scale, unstructured datasets and to improve dataset quality and detection performance, particularly in edge computing contexts, laying the foundation for further expansion in real-time marine monitoring.
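
As a rough illustration of the planning step such a pipeline implies, the sketch below selects underrepresented classes and builds a varied set of text prompts for each one. It is not the authors' implementation: the class names, counts, threshold, scene and condition phrases, and both function names are hypothetical, and a simple template stands in for the LLM-based prompt generation. The diffusion model call that would turn each prompt into an image is omitted.

```python
from itertools import cycle, islice, product


def underrepresented(class_counts, threshold):
    """Return the classes with fewer than `threshold` images (hypothetical rule)."""
    return [name for name, count in class_counts.items() if count < threshold]


def build_prompts(class_name, n, scenes, conditions):
    """Build n prompts by cycling through every scene/condition combination.

    A template stands in for the LLM here; a real pipeline would ask the LLM
    for diverse descriptions instead.
    """
    combos = cycle(product(scenes, conditions))
    return [
        f"a photo of a {class_name} {scene}, {condition}"
        for scene, condition in islice(combos, n)
    ]


# Hypothetical per-class image counts; not taken from G-Litter.
counts = {"plastic bottle": 900, "fishing net": 120, "tire": 80}
scenes = ["floating on the sea surface", "lying on the seabed"]
conditions = ["in clear water", "in murky water", "at dusk"]

# Plan 200 prompts (one generated image each) per underrepresented class.
plan = {
    name: build_prompts(name, 200, scenes, conditions)
    for name in underrepresented(counts, threshold=300)
}
```

Each prompt in `plan` would then be passed to the image generator, and the resulting images added to the corresponding class before retraining the detector.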