VIDEO FRAME OBJECT DETECTION IN MULTIMEDIA APPLICATIONS USING GENERATIVE ADVERSARIAL NETWORK
Abstract
Multimedia applications, particularly video analytics, demand robust and accurate object detection mechanisms to manage the ever-increasing volume and complexity of video data. Existing object detection methods often suffer from performance bottlenecks when processing high-resolution video frames, leading to challenges in accuracy, processing time, and scalability. Addressing these limitations, this research proposes a Generative Adversarial Network (GAN)-driven optimization framework designed to enhance object detection in video frames for multimedia applications. The proposed method leverages the generative capability of GANs to generate high-quality synthetic video frames, which augment the training dataset, addressing data imbalance and improving detection robustness. A detection module powered by a refined YOLOv5 model is incorporated, optimized using GAN-synthesized data. The framework is further fine-tuned by integrating an attention mechanism to improve the detection accuracy of smaller and occluded objects, reducing false negatives significantly. Experimental results demonstrate that the proposed GAN-driven approach achieves an average precision (AP) of 92.6% on the COCO dataset and 94.3% on the custom video dataset, surpassing baseline methods like Faster R-CNN and SSD by 5.2% and 4.1%, respectively. Additionally, the framework reduces inference time per frame to 27 milliseconds, making it suitable for real-time applications. The synthetic data augmentation increases the diversity of training data by 38%, enhancing the detection of underrepresented object classes. These results highlight the potential of GAN-driven optimization to revolutionize object detection in multimedia applications by achieving higher accuracy, scalability, and efficiency.
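
The abstract reports its results as average precision (AP) but does not say which AP variant was used. As a reading aid only, the following is a minimal sketch of one common definition: all-point interpolated AP for a single class, computed from detections that have already been matched to ground truth at some IoU threshold. The function name `average_precision`, its parameters, and the toy inputs are hypothetical and are not taken from the paper; COCO-style evaluation additionally averages over several IoU thresholds and over classes, which this sketch does not do.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class (illustrative sketch).

    scores           -- confidence score of each detection
    is_true_positive -- True if that detection matched a ground-truth box
    num_ground_truth -- number of ground-truth boxes for the class
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Walk down the ranking, tracking precision and recall at each cut-off.
    tp = 0
    fp = 0
    precisions = []
    recalls = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Replace each precision with the best precision at any higher recall,
    # so the curve is non-increasing in recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Area under the stepwise precision-recall curve.
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Toy example: three detections, two ground-truth objects.
    print(average_precision([0.9, 0.8, 0.7], [True, False, True], 2))
```

In the toy call, the second-ranked detection is a false positive, so the AP falls below 1.0 even though both objects are eventually found.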

Authors
H.C. Kantharaju (1), Vatsala Anand (2)
(1) Vemana Institute of Technology, India; (2) Chitkara University, India

Keywords
GAN-driven Optimization, Object Detection, Video Analytics, Multimedia Applications, YOLOv5
Published By
ICTACT
Published In
ICTACT Journal on Image and Video Processing (Volume: 15, Issue: 3, Pages: 3483-3488)
Date of Publication
February 2025

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.