ETHICS AND FAIRNESS IN GENERATIVE AI USING MITIGATING BIAS IN LARGE LANGUAGE MODELS USING ADVERSARIAL TRAINING

ICTACT Journal on Soft Computing (Volume: 15, Issue: 3)

Abstract

Generative AI has revolutionized natural language processing (NLP) by enabling the creation of coherent and contextually relevant text. However, generative models are susceptible to biases embedded in their training datasets, which raises ethical concerns about fairness and equitable representation. The problem becomes critical in applications such as recruitment, healthcare, and education, where biased decisions can exacerbate social inequalities. Addressing these challenges requires robust methodologies to detect and mitigate bias in large language models. This study explores adversarial training as a method for bias mitigation in generative AI. Adversarial training introduces carefully crafted adversarial examples during the training process to expose biases and recalibrate the model's parameters for fairness. A large language model is trained on a benchmark dataset comprising diverse demographic and cultural inputs, using an adversarially augmented loss function to identify and correct biased representations. The effectiveness of the proposed approach is evaluated on fairness metrics such as Demographic Parity Difference (DPD), Equal Opportunity Difference (EOD), and bias amplification reduction. Compared to baseline models, the experimental results show a 37% reduction in bias amplification, an improvement in DPD from 0.21 to 0.05, and a decrease in EOD from 0.18 to 0.03. Additionally, the adversarially trained model maintains competitive performance, with a marginal accuracy drop of only 1.2% on language generation tasks. These findings underscore the potential of adversarial training in promoting ethical and fair outcomes in generative AI systems.
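The abstract reports DPD and EOD values without stating how they are computed. As a minimal sketch, the snippet below uses the common binary-classification definitions: DPD as the largest gap in positive-prediction rate between groups, and EOD as the largest gap in true-positive rate between groups. These definitions, the function names, and the toy data are assumptions for illustration; the paper's exact formulation for generated text may differ.

```python
def positive_rate(preds):
    """Fraction of predictions equal to 1."""
    return sum(preds) / len(preds)


def demographic_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rate between any two groups."""
    by_group = {}
    for p, g in zip(y_pred, group):
        by_group.setdefault(g, []).append(p)
    rates = [positive_rate(v) for v in by_group.values()]
    return max(rates) - min(rates)


def equal_opportunity_difference(y_true, y_pred, group):
    """Largest gap in true-positive rate between any two groups."""
    by_group = {}
    for t, p, g in zip(y_true, y_pred, group):
        if t == 1:  # only truly positive examples count toward the TPR
            by_group.setdefault(g, []).append(p)
    tprs = [positive_rate(v) for v in by_group.values()]
    return max(tprs) - min(tprs)


# Hypothetical toy data: two demographic groups, "a" and "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

dpd = demographic_parity_difference(y_pred, group)
eod = equal_opportunity_difference(y_true, y_pred, group)
print(f"DPD = {dpd:.2f}, EOD = {eod:.2f}")
```

Under these definitions a value of 0 means the groups are treated identically on that metric, so the reported moves from 0.21 to 0.05 (DPD) and from 0.18 to 0.03 (EOD) would correspond to narrowing gaps between groups.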

Authors

Niby Babu (1), Varghese S Chooralil (2), Jucy Vareed (3), K.P. Hrudya (4)
(1) CVV Institute of Science and Technology, Chinmaya Vishwa Vidyapeeth, India
(2) Rajagiri School of Engineering & Technology, India
(3) Vidya Academy of Science and Technology, India
(4) Sahrdaya College of Engineering and Technology, India

Keywords

Ethics in AI, Fairness, Generative AI, Adversarial Training, Bias Mitigation

Published By: ICTACT
Published In: ICTACT Journal on Soft Computing (Volume: 15, Issue: 3)
Date of Publication: January 2025
Pages: 3598 - 3607