Abstract
Blind Source Separation (BSS) plays a crucial role in signal
processing, enabling the extraction of individual sources from mixed
signals without prior knowledge of their origin. This capability is
essential in applications such as speech enhancement, hearing aids,
multimedia forensics, and human–computer interaction. Traditional
approaches, however, often struggle with noisy environments,
overlapping frequency components, and highly correlated audio-visual
data streams. While Independent Component Analysis (ICA) and
conventional matrix factorization methods have achieved
notable success, their performance often degrades when signals
exhibit sparsity or when temporal dependencies are nonlinear. In
particular, mixed audio-visual data pose challenges due to the presence
of redundant information, cross-domain interference, and the demand
for high reconstruction accuracy. This study introduces an Enhanced
Sparse Adaptive Decomposition (ESAD) framework integrated with
Non-Negative Matrix Factorization (NMF) to address these
limitations. The ESAD component adaptively enforces sparsity
constraints, ensuring that the decomposed sources are well-separated
and less prone to interference. NMF is then applied to extract
meaningful latent structures, leveraging non-negativity to maintain
physical interpretability of both audio and visual features. Together,
the hybrid approach exploits both the sparsity and the structural
coherence of the signals. The results show a 15–20% improvement in
separation accuracy and noticeably better speech intelligibility
under noisy conditions.
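The idea of pairing a sparsity constraint with NMF can be illustrated with a minimal sketch. This is not the ESAD framework itself, whose adaptive rule is not specified in the abstract. It is a standard L1-penalized NMF with multiplicative updates, standing in for the general technique. The function name `sparse_nmf`, the fixed `sparsity` weight, and the column normalization of `W` are all assumptions made for illustration.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (features x frames) as W @ H.

    Hypothetical illustration: a fixed L1 penalty on H stands in for an
    adaptive sparsity constraint; it is not the authors' ESAD method.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    W = rng.random((n_features, rank)) + eps  # basis (e.g. spectral templates)
    H = rng.random((rank, n_frames)) + eps    # activations over time

    for _ in range(n_iter):
        # Multiplicative update for H; the `sparsity` term in the
        # denominator shrinks activations toward zero (L1 penalty).
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Standard multiplicative update for W (Frobenius-norm objective).
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Rescale so each column of W has unit norm; this keeps W @ H
        # unchanged and stops the penalty from being evaded by scaling.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
        W /= norms
        H *= norms.T

    return W, H
```

In an audio setting, `V` would typically be a magnitude spectrogram, and each source estimate would be rebuilt from a subset of the columns of `W` and the matching rows of `H`. Raising `sparsity` trades reconstruction accuracy for sparser activations.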
Authors
Parimala Gandhi Ayyavu¹, Erdi Raju Dayakar², N. Vigneshwari³, K. Jayaram⁴
Paavai Engineering College, India¹; Sri Krishna College of Engineering and Technology, India²,³; SSM Institute of Engineering and Technology, India⁴
Keywords
Blind Source Separation, Sparse Adaptive Decomposition, Non-Negative Matrix Factorization, Audio-Visual Signal Processing, Signal Decoupling