FEATURE EXTRACTION USING I-VECTOR AND X-VECTOR METHODS FOR SPEAKER DIARIZATION

ICTACT Journal on Soft Computing (Volume: 15, Issue: 4)

Abstract

Speaker diarization is the process of identifying who is speaking at different times in an audio recording. It is important in applications such as recording meetings, monitoring calls in call centers, and analyzing media. In this paper, we examine how well different methods for speaker diarization perform in real-life scenarios. We focus on two modern techniques: I-vectors and X-vectors. I-vectors are effective for automatic speaker recognition because they create compact and efficient representations of speakers using statistical models. However, they struggle in situations involving overlapping voices or background noise. X-vectors, on the other hand, address these limitations: they use deep neural networks to create more complex and reliable representations, making them better suited for challenging conditions. To evaluate the two approaches, we used standard datasets, specifically the AMI Meeting Corpus and VoxCeleb, and measured performance using two metrics: Diarization Error Rate (DER) and Jaccard Error Rate (JER). The results show that while I-vectors are less resource-intensive and work well in ideal conditions, X-vectors perform better in real-world settings where noise and overlapping speech are present. This study provides guidance for practitioners in choosing the right approach based on their needs, considering factors such as accuracy, computational cost, and reliability.
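To make the DER metric named in the abstract concrete, the following is a minimal sketch of a frame-level computation. It is not the authors' evaluation code. It assumes each frame carries at most one speaker label (so overlapping speech is not modeled), uses no forgiveness collar, and finds the speaker mapping by brute force, which is only practical for a handful of speakers. The function name `frame_der` and the label format are hypothetical choices made for illustration.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Frame-level DER for one-label-per-frame sequences (None = non-speech).

    DER = (missed speech + false alarm + speaker confusion) / reference speech.
    Illustrative assumption: no overlap handling and no collar.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    total_speech = sum(r is not None for r in reference)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")

    pairs = list(zip(reference, hypothesis))
    # Reference speech that the system labeled as non-speech.
    missed = sum(r is not None and h is None for r, h in pairs)
    # System speech where the reference has non-speech.
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    # Frames where both sides contain speech; confusion is counted here.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]

    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_speakers = sorted({h for h in hypothesis if h is not None})

    # Pad the reference speaker list so that extra hypothesis speakers
    # can be mapped to "no reference speaker" (None).
    n = max(len(ref_speakers), len(hyp_speakers))
    padded_ref = ref_speakers + [None] * (n - len(ref_speakers))

    # Brute-force search for the one-to-one mapping with the most matches.
    best_correct = 0
    for perm in permutations(padded_ref, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, perm))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical toy labels: 5 reference speech frames, 1 non-speech frame.
reference = ["A", "A", "A", "B", "B", None]
hypothesis = ["x", "x", "y", "y", "y", "y"]
print(frame_der(reference, hypothesis))
```

In this toy input, one frame is a false alarm and one frame is attributed to the wrong speaker, so two of the five reference speech frames are in error. Because DER is normalized by reference speech only, it can exceed 1 when false alarms are frequent.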

Authors

Vinod K. Pande (1), Vijay K. Kale (2), Sangramsing N. Kayte (3)
(1, 2) Dr G.Y. Pathrikar College of Computer Science and Information Technology, India; (3) University of Copenhagen, Denmark

Keywords

Speaker Diarization, I-Vector, X-Vector, MFCC, Speech Recognition

Published By: ICTACT
Published In: ICTACT Journal on Soft Computing (Volume: 15, Issue: 4)
Date of Publication: January 2025
Pages: 3717-3721
