VIDEO FRAME OBJECT DETECTION IN MULTIMEDIA APPLICATIONS USING GENERATIVE ADVERSARIAL NETWORK

ICTACT Journal on Image and Video Processing (Volume: 15, Issue: 3)

Abstract

Multimedia applications, particularly video analytics, demand robust and accurate object detection to manage the ever-increasing volume and complexity of video data. Existing object detection methods often suffer from performance bottlenecks when processing high-resolution video frames, which limits their accuracy, processing speed, and scalability. To address these limitations, this research proposes a Generative Adversarial Network (GAN)-driven optimization framework designed to enhance object detection in video frames for multimedia applications. The proposed method uses the generative capability of GANs to produce high-quality synthetic video frames that augment the training dataset, addressing data imbalance and improving detection robustness. The detection module is a refined YOLOv5 model trained on the GAN-augmented data. The framework further integrates an attention mechanism to improve detection accuracy for small and occluded objects, significantly reducing false negatives. Experimental results show that the proposed GAN-driven approach achieves an average precision (AP) of 92.6% on the COCO dataset and 94.3% on the custom video dataset, surpassing the Faster R-CNN and SSD baselines by 5.2% and 4.1%, respectively. Additionally, the framework reduces inference time to 27 milliseconds per frame, making it suitable for real-time applications. The synthetic data augmentation increases the diversity of the training data by 38%, improving the detection of underrepresented object classes. These results highlight the potential of GAN-driven optimization to improve object detection in multimedia applications through higher accuracy, scalability, and efficiency.
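As a rough illustration of what the reported 27 ms per-frame inference time implies for real-time use, the following minimal Python sketch converts latency into throughput and checks it against a few frame rates. It assumes strictly sequential, single-stream processing with no batching, and it assumes the 27 ms figure covers the whole per-frame cost; the abstract does not state whether decoding or pre- and post-processing are included. The function names and the example frame rates (24, 30, 60) are illustrative choices, not taken from the paper.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert per-frame latency in milliseconds to sequential throughput in frames/s."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def fits_frame_budget(latency_ms: float, target_fps: float) -> bool:
    """Return True if one frame is processed within the time slot of a target_fps stream."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return latency_ms <= 1000.0 / target_fps


# Per-frame inference time quoted in the abstract.
REPORTED_LATENCY_MS = 27.0

if __name__ == "__main__":
    print(f"throughput: {frames_per_second(REPORTED_LATENCY_MS):.1f} frames/s")
    # Assumed example stream rates (hypothetical, not from the paper).
    for fps in (24, 30, 60):
        ok = fits_frame_budget(REPORTED_LATENCY_MS, fps)
        print(f"{fps} fps stream: {'within budget' if ok else 'over budget'}")
```

Under these assumptions, 27 ms per frame corresponds to roughly 37 frames per second, which fits within the per-frame budget of a 24 or 30 fps stream but not a 60 fps stream.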

Authors

H.C. Kantharaju¹, Vatsala Anand²
¹Vemana Institute of Technology, India; ²Chitkara University, India

Keywords

GAN-driven Optimization, Object Detection, Video Analytics, Multimedia Applications, YOLOv5

Published By
ICTACT
Published In
ICTACT Journal on Image and Video Processing (Volume: 15, Issue: 3)
Date of Publication
February 2025
Pages
3483 - 3488
