ICTACT Journals

ETHICS AND FAIRNESS IN GENERATIVE AI USING MITIGATING BIAS IN LARGE LANGUAGE MODELS USING ADVERSARIAL TRAINING

ICTACT Journal on Soft Computing (Volume: 15, Issue: 3)

Abstract

Generative AI has revolutionized natural language processing (NLP) by enabling the creation of coherent and contextually relevant text. However, these models are susceptible to biases embedded in their training datasets, which raises ethical concerns about fairness and equitable representation. The problem becomes critical in applications such as recruitment, healthcare, and education, where biased decisions can exacerbate social inequalities. Addressing these challenges requires robust methodologies to detect and mitigate bias in large language models. This study explores adversarial training as a method for bias mitigation in generative AI. Adversarial training introduces carefully crafted adversarial examples during the training process to expose biases and recalibrate the model's parameters for fairness. A large language model is trained on a benchmark dataset comprising diverse demographic and cultural inputs, with an adversarially augmented loss function used to identify and correct biased representations. The effectiveness of the proposed approach is evaluated on fairness metrics such as Demographic Parity Difference (DPD), Equal Opportunity Difference (EOD), and bias amplification reduction. Compared to baseline models, the experimental results show a 37% reduction in bias amplification, an improvement in DPD from 0.21 to 0.05, and a decrease in EOD from 0.18 to 0.03. Additionally, the adversarially trained model maintains competitive performance, with a marginal accuracy drop of only 1.2% on language generation tasks. These findings underscore the potential of adversarial training in promoting ethical and fair outcomes in generative AI systems.
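The abstract reports DPD and EOD without defining them. As a rough illustration only, the sketch below uses the common binary-classification definitions: DPD as the absolute gap in positive-prediction rate between two groups, and EOD as the absolute gap in true-positive rate. The function names, the two-group encoding (0 and 1), and the toy data are assumptions made for this example; the paper may compute these metrics differently for generated text.

```python
def _rate(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_difference(y_pred, group):
    """Absolute gap in positive-prediction rate between group 0 and group 1."""
    rates = {}
    for g in (0, 1):
        rates[g] = _rate([p for p, a in zip(y_pred, group) if a == g])
    return abs(rates[0] - rates[1])


def equal_opportunity_difference(y_true, y_pred, group):
    """Absolute gap in true-positive rate between group 0 and group 1."""
    tprs = {}
    for g in (0, 1):
        # Keep only predictions for members of group g whose true label is positive.
        tprs[g] = _rate(
            [p for t, p, a in zip(y_true, y_pred, group) if a == g and t == 1]
        )
    return abs(tprs[0] - tprs[1])


# Hypothetical toy data: four examples per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

dpd = demographic_parity_difference(y_pred, group)
eod = equal_opportunity_difference(y_true, y_pred, group)
print(f"DPD = {dpd:.2f}, EOD = {eod:.2f}")
```

Under these definitions, a value of 0 means both groups are treated identically on that criterion, so the reported moves from 0.21 to 0.05 (DPD) and from 0.18 to 0.03 (EOD) correspond to smaller gaps between groups.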

Authors

Niby Babu¹, Varghese S Chooralil², Jucy Vareed³, K.P. Hrudya⁴
CVV Institute of Science and Technology, Chinmaya Vishwa Vidyapeeth, India¹, Rajagiri School of Engineering & Technology, India², Vidya Academy of Science and Technology, India³, Sahrdaya College of Engineering and Technology, India⁴

Keywords

Ethics in AI, Fairness, Generative AI, Adversarial Training, Bias Mitigation

Published By

ICTACT

Published In

ICTACT Journal on Soft Computing (Volume: 15, Issue: 3)

Date of Publication

January 2025

Pages

3598 - 3607

DOI

10.21917/ijsc.2025.0500
