ETHICS AND FAIRNESS IN GENERATIVE AI: MITIGATING BIAS IN LARGE LANGUAGE MODELS USING ADVERSARIAL TRAINING
Abstract
Generative AI has revolutionized natural language processing (NLP) by enabling the creation of coherent and contextually relevant text. However, these models are susceptible to biases embedded in their training datasets, raising ethical concerns about fairness and equitable representation. The problem becomes critical in applications such as recruitment, healthcare, and education, where biased decisions can exacerbate social inequalities. Addressing these challenges requires robust methodologies to detect and mitigate bias in large language models. This study explores adversarial training as a method for bias mitigation in generative AI. Adversarial training introduces carefully crafted adversarial examples during the training process to expose biases and recalibrate the model's parameters for fairness. A benchmark dataset comprising diverse demographic and cultural inputs is used to train a large language model with an adversarially augmented loss function that identifies and corrects biased representations. The effectiveness of the proposed approach is evaluated using fairness metrics such as Demographic Parity Difference (DPD), Equal Opportunity Difference (EOD), and bias amplification reduction. The experimental results show a 37% reduction in bias amplification, an improvement in DPD from 0.21 to 0.05, and a decrease in EOD from 0.18 to 0.03 compared to baseline models. Additionally, the adversarially trained model maintains competitive performance on language generation tasks, with a marginal accuracy drop of only 1.2%. These findings underscore the potential of adversarial training for promoting ethical and fair outcomes in generative AI systems.
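The abstract evaluates fairness with Demographic Parity Difference (DPD) and Equal Opportunity Difference (EOD). For reference only, below is a minimal NumPy sketch of how these two metrics are commonly computed for binary predictions with a binary protected attribute; the function and variable names are illustrative assumptions and do not represent the authors' evaluation code.

import numpy as np

def demographic_parity_difference(y_pred, group):
    # DPD = |P(y_pred = 1 | group = 0) - P(y_pred = 1 | group = 1)|
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_0 = y_pred[group == 0].mean()
    rate_1 = y_pred[group == 1].mean()
    return abs(rate_0 - rate_1)

def equal_opportunity_difference(y_true, y_pred, group):
    # EOD = |TPR(group = 0) - TPR(group = 1)|, the gap in true-positive rates
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in (0, 1):
        positives = (group == g) & (y_true == 1)
        tprs.append(y_pred[positives].mean())
    return abs(tprs[0] - tprs[1])

# Toy usage with hypothetical labels, predictions, and a protected attribute:
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_difference(y_pred, group))        # gap in positive-prediction rates
print(equal_opportunity_difference(y_true, y_pred, group)) # gap in true-positive rates

Read this way, the reported improvement in DPD from 0.21 to 0.05 corresponds to narrowing the gap in positive-prediction rates between demographic groups from 21 to 5 percentage points, and the drop in EOD from 0.18 to 0.03 is the analogous narrowing of the true-positive-rate gap.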

Authors
Niby Babu¹, Varghese S Chooralil², Jucy Vareed³, K.P. Hrudya⁴
¹CVV Institute of Science and Technology, Chinmaya Vishwa Vidyapeeth, India; ²Rajagiri School of Engineering & Technology, India; ³Vidya Academy of Science and Technology, India; ⁴Sahrdaya College of Engineering and Technology, India

Keywords
Ethics in AI, Fairness, Generative AI, Adversarial Training, Bias Mitigation
Published By: ICTACT
Published In: ICTACT Journal on Soft Computing (Volume: 15, Issue: 3, Pages: 3598-3607)
Date of Publication: January 2025

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.