ICTACT Journals

VIDEO FRAME OBJECT DETECTION IN MULTIMEDIA APPLICATIONS USING GENERATIVE ADVERSARIAL NETWORK

ICTACT Journal on Image and Video Processing (Volume: 15, Issue: 3)

Abstract

Multimedia applications, particularly video analytics, demand robust and accurate object detection to manage the ever-increasing volume and complexity of video data. Existing object detection methods often suffer from performance bottlenecks when processing high-resolution video frames, which limits their accuracy, processing speed, and scalability. To address these limitations, this research proposes a Generative Adversarial Network (GAN)-driven optimization framework designed to enhance object detection in video frames for multimedia applications. The proposed method uses the generative capability of GANs to produce high-quality synthetic video frames that augment the training dataset, addressing data imbalance and improving detection robustness. The detection module is a refined YOLOv5 model optimized using the GAN-synthesized data. The framework is further fine-tuned by integrating an attention mechanism that improves detection accuracy for small and occluded objects, significantly reducing false negatives. Experimental results demonstrate that the proposed GAN-driven approach achieves an average precision (AP) of 92.6% on the COCO dataset and 94.3% on the custom video dataset, surpassing baseline methods such as Faster R-CNN and SSD by 5.2% and 4.1%, respectively. Additionally, the framework reduces inference time per frame to 27 milliseconds, making it suitable for real-time applications. The synthetic data augmentation increases the diversity of training data by 38%, enhancing the detection of underrepresented object classes. These results highlight the potential of GAN-driven optimization to advance object detection in multimedia applications by achieving higher accuracy, scalability, and efficiency.
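
The real-time claim can be checked with simple arithmetic: a per-frame inference time of 27 milliseconds corresponds to roughly 37 frames per second. The Python sketch below illustrates this conversion. The function names and the target frame rates of 24, 30, and 60 fps are assumptions made for illustration and do not come from the paper, and the calculation covers inference time only, not video decoding or pre- and post-processing.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def meets_frame_budget(latency_ms: float, target_fps: float) -> bool:
    """Return True if one frame fits in the time budget of the target rate."""
    budget_ms = 1000.0 / target_fps  # time available per frame at target_fps
    return latency_ms <= budget_ms


if __name__ == "__main__":
    latency_ms = 27.0  # per-frame inference time reported in the abstract
    print(f"{frames_per_second(latency_ms):.1f} fps")
    # Assumed target frame rates, chosen for illustration (not from the paper).
    for target_fps in (24.0, 30.0, 60.0):
        print(target_fps, meets_frame_budget(latency_ms, target_fps))
```

Under these assumptions, 27 ms per frame fits within the 33.3 ms budget of a 30 fps stream but not within the 16.7 ms budget of a 60 fps stream.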

Authors

H.C. Kantharaju¹, Vatsala Anand²
Vemana Institute of Technology, India¹, Chitkara University, India²

Keywords

GAN-driven Optimization, Object Detection, Video Analytics, Multimedia Applications, YOLOv5

Published By

ICTACT

Published In

ICTACT Journal on Image and Video Processing (Volume: 15, Issue: 3)

Date of Publication

February 2025

Pages

3483 - 3488

DOI

10.21917/ijivp.2025.0493
