Generative AI has revolutionized natural language processing (NLP)
by enabling the creation of coherent and contextually relevant text.
However, these models are susceptible to biases embedded in training
datasets, leading to ethical concerns about fairness and equitable
representation. This problem becomes critical in applications such as
recruitment, healthcare, and education, where biased decisions can
exacerbate social inequalities. Addressing these challenges requires
robust methodologies to detect and mitigate bias in large language
models. This study explores adversarial training as a method for bias
mitigation in generative AI. Adversarial training introduces carefully
crafted adversarial examples during the training process to expose
biases and recalibrate the model's parameters for fairness.
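The abstract does not specify how the adversarial examples are crafted. One common way to expose demographic bias is counterfactual augmentation: each prompt is paired with a demographically swapped counterpart, and divergent model behavior on the pair signals bias. The Python sketch below is hypothetical and illustrative only (the SWAPS table and prompts are not from the paper):

```python
# Hypothetical sketch: counterfactual augmentation to expose demographic
# bias. Each prompt gets a demographically swapped counterpart; divergent
# model behavior on the pair is a bias signal. Term list is illustrative.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

def demographic_swap(prompt: str) -> str:
    """Return a counterfactual prompt with demographic terms swapped
    (case and inflection handling omitted for brevity)."""
    return " ".join(SWAPS.get(tok.lower(), tok) for tok in prompt.split())

prompts = ["The doctor said he would call back.",
           "I think she is a talented engineer."]
for p in prompts:
    print(p, "->", demographic_swap(p))
```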
A benchmark dataset comprising diverse demographic and cultural inputs is used to train a large language model, employing an adversarially augmented loss function to identify and correct biased representations.
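The abstract leaves the exact form of the adversarially augmented loss unspecified. A widely used formulation of adversarial debiasing trains an auxiliary adversary to recover a protected attribute from the model's hidden states, coupled to the encoder through a gradient-reversal layer. The PyTorch-style sketch below assumes that formulation; the `adversary` module, `hidden` states, `attr_labels`, and weight `lam` are illustrative names, not the paper's:

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the
    backward pass so the encoder learns to defeat the adversary."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def debiasing_loss(task_logits, task_labels, hidden,
                   adversary, attr_labels, lam=1.0):
    """Adversarially augmented loss: task cross-entropy plus an adversary
    head that tries to recover the protected attribute from the hidden
    states through the gradient-reversal layer."""
    task_loss = nn.functional.cross_entropy(task_logits, task_labels)
    adv_logits = adversary(GradientReversal.apply(hidden, lam))
    adv_loss = nn.functional.cross_entropy(adv_logits, attr_labels)
    return task_loss + adv_loss
```

Minimizing this combined loss trains the adversary to predict the protected attribute, while the reversed gradient pushes the encoder toward representations from which the attribute cannot be recovered.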
The effectiveness of the proposed approach is evaluated on fairness metrics such as Demographic Parity Difference (DPD), Equal Opportunity Difference (EOD), and bias amplification reduction.
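For reference, DPD and EOD have standard definitions for binary predictions and a binary protected attribute: DPD is the absolute gap in positive-prediction rates between groups, and EOD is the absolute gap in true-positive rates. A minimal NumPy sketch under those assumptions (the toy data are illustrative):

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """DPD: absolute gap in positive-prediction rates between groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_difference(y_true, y_pred, group):
    """EOD: absolute gap in true-positive rates between groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(0) - tpr(1))

# Toy example: binary predictions, binary protected attribute
y_true = np.array([1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 1, 1, 1])
print(demographic_parity_difference(y_pred, group))        # ~0.667
print(equal_opportunity_difference(y_true, y_pred, group)) # 0.5
```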
The experimental results demonstrate a 37% reduction in bias amplification, an improvement in DPD from 0.21 to 0.05, and a decrease in EOD from 0.18 to 0.03 compared to baseline models. Additionally, the adversarially trained model
maintains competitive performance with a marginal accuracy drop of
only 1.2% on language generation tasks. These findings underscore the
potential of adversarial training in promoting ethical and fair outcomes
in generative AI systems.
Niby Babu (CVV Institute of Science and Technology, Chinmaya Vishwa Vidyapeeth, India), Varghese S Chooralil (Rajagiri School of Engineering & Technology, India), Jucy Vareed (Vidya Academy of Science and Technology, India), K.P. Hrudya (Sahrdaya College of Engineering and Technology, India)
Keywords: Ethics in AI, Fairness, Generative AI, Adversarial Training, Bias Mitigation
Published by ICTACT in the ICTACT Journal on Soft Computing, Volume 15, Issue 3, Pages 3598-3607, January 2025.