Multimedia applications, particularly video analytics, demand robust
and accurate object detection mechanisms to manage the
ever-increasing volume and complexity of video data. Existing object
detection methods often suffer from performance bottlenecks when
processing high-resolution video frames, leading to challenges in
accuracy, processing time, and scalability. Addressing these
limitations, this research proposes a Generative Adversarial Network
(GAN)-driven optimization framework designed to enhance object
detection in video frames for multimedia applications. The proposed
method uses GANs to synthesize high-quality video frames that augment
the training dataset, mitigating class imbalance and improving
detection robustness. The detection module is a refined YOLOv5 model
trained on the GAN-augmented data. An attention mechanism is
integrated into the framework to improve detection of small and
occluded objects, significantly reducing false negatives.
Experimental results demonstrate that the proposed
GAN-driven approach achieves an average precision (AP) of 92.6% on
the COCO dataset and 94.3% on the custom video dataset, surpassing
baseline methods like Faster R-CNN and SSD by 5.2% and 4.1%,
respectively. Additionally, the framework reduces inference time per
frame to 27 milliseconds, making it suitable for real-time applications.
The synthetic data augmentation increases the diversity of training data
by 38%, enhancing the detection of underrepresented object classes.
These results highlight the potential of GAN-driven optimization to
improve the accuracy, scalability, and efficiency of object detection
in multimedia applications.
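
The abstract does not say how many synthetic frames are generated per class. As a minimal sketch of the class-balancing idea only, the snippet below assumes each class is topped up to the size of the largest class; the function name `synthetic_frames_needed` and the label counts are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Iterable


def synthetic_frames_needed(labels: Iterable[str]) -> Dict[str, int]:
    """Return how many synthetic samples each class needs to match the largest class.

    Assumption (not from the paper): the balancing target is the count of the
    most frequent class.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical object labels from a training set (illustrative values only).
labels = ["car"] * 5 + ["person"] * 3 + ["bicycle"] * 1
plan = synthetic_frames_needed(labels)
print(plan)
```

Under this assumption, the resulting per-class counts could serve as generation quotas for a class-conditional GAN; other targets, such as the median class size, would work the same way.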
H.C. Kantharaju (Vemana Institute of Technology, India), Vatsala Anand (Chitkara University, India)
GAN-driven Optimization, Object Detection, Video Analytics, Multimedia Applications, YOLOv5
Published By: ICTACT
Published In: ICTACT Journal on Image and Video Processing (Volume: 15, Issue: 3, Pages: 3483-3488)
Date of Publication: February 2025