Avinash Manure, Shaleen Bengani and Saravanan S
Introduction to Responsible AI
Implement Ethical AI Using Python
Avinash Manure
Bangalore, Karnataka, India
Shaleen Bengani
Kolkata, West Bengal, India
Saravanan S
Chennai, Tamil Nadu, India
Apress Standard
The publisher, the authors, and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
Shaleen Bengani
is a machine learning engineer with more than four years of experience
in building, deploying, and managing cutting-edge machine learning
solutions across varied industries. He has developed several
frameworks and platforms that have significantly streamlined
processes and improved efficiency of machine learning teams.
Bengani has authored the book Operationalizing Machine Learning
Pipelines as well as multiple research papers in the deep learning space.
He holds a bachelor’s degree in
computer science and engineering from
BITS Pilani, Dubai Campus, where he
was awarded the Director’s Medal for
outstanding all-around performance. In
his leisure time, he likes playing table
tennis and reading.
Saravanan S
is an AI engineer with more than six
years of experience in data science and
data engineering. He has developed
robust data pipelines for developing and
deploying advanced machine learning
models, generating insightful reports,
and ensuring cutting-edge solutions for
diverse industries.
Saravanan earned a master’s degree
in statistics from Loyola College,
Chennai. In his spare time, he likes
traveling, reading books, and playing
games.
About the Technical Reviewer
Akshay Kulkarni
is an AI and machine learning evangelist
and thought leader. He has consulted
with several Fortune 500 and global
enterprises to drive AI- and data
science–led strategic transformations.
He is a Google Developer Expert, an author,
and regular speaker at major AI and data
science conferences (including Strata,
O’Reilly AI Conf, and GIDS). He is a
visiting faculty member for some of the
top graduate institutes in India. In 2019,
he was also featured as one of the top 40
under 40 data scientists in India. In his
spare time, he enjoys reading, writing,
coding, and building next-gen AI products.
© The Author(s), under exclusive license to APress Media, LLC, part of Springer
Nature 2023
A. Manure et al., Introduction to Responsible AI
https://doi.org/10.1007/978-1-4842-9982-1_1
1. Introduction
Avinash Manure1 , Shaleen Bengani2 and Saravanan S3
(1) Bangalore, Karnataka, India
(2) Kolkata, West Bengal, India
(3) Chennai, Tamil Nadu, India
Conclusion
Transparency and explainability are not mere technical prerequisites
but rather essential pillars of responsible AI. They foster trust,
accountability, and informed decision making in an AI-driven world. By
shedding light on AI’s decision-making processes and bridging the gap
between algorithms and human understanding, transparency and
explainability lay the foundation for an ethical, fair, and inclusive AI
landscape. As AI continues to evolve, embracing these principles
ensures that the journey into the future is guided by clarity, integrity,
and empowerment.
In the artificial intelligence (AI) landscape, the impact of bias on decisions is paramount. From individual
choices to complex models, bias distorts outcomes and undermines fairness. Grasping the nuances of bias is
essential for building equitable systems: bias arises from a complex interplay of data and human beliefs, with
profound implications. Technology empowers us to detect and mitigate bias, nurturing transparent and
responsible AI. This ongoing quest aligns with ethics, shaping AI that champions diversity and societal progress.
In this chapter, we delve into the intricate relationship between bias, fairness, and artificial
intelligence. We explore how bias can impact decision making across various domains, from individual
judgments to automated systems. Understanding the types and sources of bias helps us identify its
presence in data and models. We also examine the importance of recognizing bias for creating fair and
equitable systems and how explainable AI aids in this process. Additionally, we touch on techniques to
detect, assess, and mitigate bias, as well as the trade-offs between model complexity and interpretability.
This comprehensive exploration equips us to navigate the complexities of bias and fairness in the AI
landscape, fostering ethical and inclusive AI systems.
Types of Bias
Bias in machine learning refers to the presence of systematic and unfair errors in data or models that can
lead to inaccurate or unjust predictions, decisions, or outcomes. There are several types of bias that can
manifest in different stages of the machine learning pipeline (see Figure 2-1).
1. Data Bias: Data bias encompasses biases present in the data used to train and test machine learning
models. This bias can arise due to various reasons, such as the following:
Sampling Bias: When the collected data is not representative of the entire population, leading to
over- or under-representation of certain groups or attributes. For instance, in a medical diagnosis
dataset, if only one demographic group is represented, the model might perform poorly for
underrepresented groups.
Measurement Bias: Errors or inconsistencies introduced during data-collection or measurement
processes can introduce bias. For example, if a survey is conducted in a language not understood by
a specific community, their perspectives will be omitted, leading to biased conclusions.
Coverage Bias: Occurs when certain groups or perspectives are missing from the dataset. This can
result from biased data-collection methods, incomplete sampling, or systemic exclusion.
2. Model Bias: Model bias emerges from the learning algorithms’ reliance on biased data during
training, which can perpetuate and sometimes amplify biases, as follows:
Representation Bias: This occurs when the features or attributes used for training
disproportionately favor certain groups. Models tend to learn from the biases present in the training
data, potentially leading to biased predictions.
Algorithmic Bias: Some machine learning algorithms inherently perpetuate biases. For example, if
a decision-tree algorithm learns to split data based on biased features, it will reflect those biases in
its predictions.
Feedback Loop Bias: When models’ predictions influence real-world decisions that subsequently
affect the data used for future training, a feedback loop is created. Biased predictions can perpetuate
over time, reinforcing existing biases.
3. Social Bias: Social bias pertains to the biases present in society that get reflected in data and models,
as follows:
Cultural Bias: Cultural norms, beliefs, and values can shape how data is collected and interpreted,
leading to biased outcomes.
Gender Bias: Historical and societal gender roles can result in unequal representation in datasets,
affecting model performance.
Racial Bias: Biased historical practices can lead to underrepresentation or misrepresentation of
racial groups in data, impacting model accuracy.
Economic Bias: Socioeconomic disparities can lead to differences in data availability and quality,
influencing model outcomes.
Understanding these types of bias is essential for developing strategies to detect, mitigate, and prevent
bias. Addressing bias involves a combination of careful data collection, preprocessing, algorithm selection,
and post-processing interventions. Techniques such as reweighting, resampling, and using fairness-aware
algorithms can help mitigate bias at various stages of model development.
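To make detection concrete, the following is a minimal sketch of a representation check on a pandas DataFrame. The 'sex' and 'income' column names (matching the dataset used later in this chapter) and the assumption of a 0/1 target are illustrative; this is a starting point, not a complete bias audit.

import pandas as pd

def representation_report(df: pd.DataFrame, group_col: str, target_col: str) -> pd.DataFrame:
    """Compare how often each group appears and how often it receives the positive outcome."""
    report = df.groupby(group_col).agg(
        share_of_data=(target_col, lambda s: len(s) / len(df)),  # group representation in the data
        positive_rate=(target_col, "mean"),                      # assumes a 0/1 target column
    )
    return report

# Hypothetical usage, assuming df is the income DataFrame introduced later in this chapter:
# print(representation_report(df, group_col="sex", target_col="income"))

Large gaps between groups in either column of this report are a signal to investigate sampling or representation bias further.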
However, ethical considerations play a crucial role in addressing bias. Being aware of the potential
impact of bias on decision-making processes and actively working to mitigate it aligns AI development
with principles of fairness, transparency, and accountability. By understanding the different types of bias,
stakeholders can work toward creating AI systems that promote equitable outcomes across diverse
contexts and populations.
2. Racial Bias in Criminal Risk Assessment: Several criminal risk assessment tools used in the
criminal justice system have been criticized for exhibiting racial bias. These tools predict the
likelihood of reoffending based on historical arrest and conviction data. However, the historical bias in
the data can lead to overestimating the risk for minority groups, leading to discriminatory sentencing
and parole decisions.
3. Google Photos’ Racist Labeling: In 2015, Google Photos’ auto-tagging feature was found to label
images of Black people as “gorillas.” This was a result of the model’s biased training data, which did
not include enough diverse examples of Black individuals. The incident highlighted the potential harm
of biased training data and the need for inclusive datasets.
4. Biased Loan Approval Models: Machine learning models used for loan approval have shown bias in
favor of certain demographic groups. Some models have unfairly denied loans to minority applicants
or offered them higher interest rates, reflecting historical biases in lending data.
5. Facial Recognition and Racial Bias: Facial recognition systems have been criticized for their racial
bias, where they are more likely to misidentify people with darker skin tones, particularly women.
This bias can result in inaccurate surveillance, racial profiling, and infringement of civil rights.
These real-world examples underscore the urgency of addressing bias in AI systems. To prevent such
biased behavior, it’s crucial to carefully curate diverse and representative training data, use fairness-aware
algorithms, implement bias detection and mitigation techniques, and continuously monitor and evaluate
model outputs for fairness. By proactively addressing bias, developers can ensure that AI systems
contribute positively to society and uphold ethical standards.
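A simple quantitative check that complements these examples is the disparate impact ratio: the favorable-outcome rate for the unprivileged group divided by that of the privileged group. The following is a minimal sketch rather than the book's own code; the column names and the 0.8 threshold (the common "four-fifths rule") are assumptions for illustration.

import pandas as pd

def disparate_impact(df: pd.DataFrame, group_col: str, target_col: str,
                     privileged_value, favorable_value=1) -> float:
    """Ratio of favorable-outcome rates: unprivileged group / privileged group."""
    privileged = df[df[group_col] == privileged_value]
    unprivileged = df[df[group_col] != privileged_value]
    p_priv = (privileged[target_col] == favorable_value).mean()
    p_unpriv = (unprivileged[target_col] == favorable_value).mean()
    return p_unpriv / p_priv  # values well below 1.0 suggest the unprivileged group is disadvantaged

# Hypothetical usage with the income data used later in this chapter:
# ratio = disparate_impact(df, group_col="sex", privileged_value=1, target_col="income")
# print("Potential disparate impact" if ratio < 0.8 else "Within the four-fifths rule", ratio)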
2. Reweighting: Assigning different weights to different classes or samples can adjust the model’s
learning process to address imbalances (see the sketch after this list).
3. Fairness-Aware Algorithms:
Adversarial Debiasing: Incorporates an additional adversarial network to reduce bias while
training the main model, forcing it to disregard features correlated with bias.
Equalized Odds: Adjusts model thresholds to ensure equal opportunity for positive outcomes
across different groups.
Reject Option Classification: Allows the model to decline to make a prediction when uncertainty
about its fairness exists.
4. Regularization Techniques:
Fairness Constraints: Adding fairness constraints to the model’s optimization process to ensure
predictions are within acceptable fairness bounds.
Lagrangian Relaxation: Balancing fairness and accuracy trade-offs by introducing Lagrange
multipliers during optimization.
5. Post-processing Interventions:
Calibration: Adjusting model predictions to align with desired fairness criteria while maintaining
overall accuracy.
Reranking: Reordering model predictions to promote fairness without significantly compromising
accuracy.
6. Preprocessing Interventions:
Data Augmentation: Adding synthesized data points to underrepresented groups to improve
model performance and reduce bias.
De-biasing Data Preprocessing: Using techniques like reweighting or resampling during data
preprocessing to mitigate bias before training.
7. Fair Feature Engineering: Creating or selecting features that are less correlated with bias, which
can help the model focus on relevant and fair attributes.
8. Ensemble Methods: Combining multiple models that are trained with different strategies can help
mitigate bias, as biases in individual models are less likely to coincide.
9. Regular Monitoring and Updates: Continuously monitoring model performance for bias in real-
world scenarios and updating the model as new data becomes available to ensure ongoing fairness.
10. Ethical and Inclusive Design: Prioritizing diverse representation and ethical considerations in data
collection, preprocessing, and model development to prevent bias from entering the system.
11. Collaborative Development: Involving stakeholders from diverse backgrounds, including ethicists
and affected communities, to collaboratively address bias and ensure that mitigation strategies align
with ethical values.
12. Transparency and Communication: Being transparent about the steps taken to mitigate bias and
communicating these efforts to users and stakeholders to build trust in the system.
13. Legal and Regulatory Compliance: Ensuring that the AI system adheres to relevant laws and
regulations concerning discrimination and bias, and actively working to comply with them.
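As referenced in the reweighting item above, the following is a minimal sketch of instance reweighting with scikit-learn. The weighting scheme (inverse frequency of each group–label combination), the 'sex' column, and the use of RandomForestClassifier are illustrative assumptions, not the book's prescribed implementation.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def fairness_weights(group: pd.Series, label: pd.Series) -> np.ndarray:
    """Weight each sample inversely to the frequency of its (group, label) combination,
    so that under-represented combinations are not drowned out during training."""
    combo = group.astype(str) + "_" + label.astype(str)          # e.g. "1_0"
    freq = combo.map(combo.value_counts(normalize=True))         # frequency of each combination
    return (1.0 / freq).to_numpy()

# Hypothetical usage, assuming X_train, y_train and a 'sex' column as in this chapter's dataset:
# weights = fairness_weights(X_train["sex"], y_train)
# model = RandomForestClassifier(random_state=42)
# model.fit(X_train, y_train, sample_weight=weights)

Because the weights only change how much each sample contributes to training, this approach leaves the data itself untouched and can be combined with the resampling and post-processing interventions listed above.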
Dataset Details
We have used a dataset in which an individual’s annual income is the outcome of various factors. Intuitively,
income is influenced by the individual’s education level, age, gender, occupation, and so on.
Source: https://archive.ics.uci.edu/dataset/2/adult
The dataset contains the following 15 columns:
Age: Continuous
Workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay,
Never-worked
Fnlwgt: Continuous
Education: Bachelor’s, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th,
12th, Master’s, 1st-4th, 10th, Doctorate, 5th-6th, Preschool
Education-Num: Continuous
Marital Status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-
absent, Married-AF-spouse
Occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-
cleaners, Machine-op-inspect, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv,
Protective-serv, Armed-Forces
Relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried
Race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black
Sex: Female, Male
Capital Gain: Continuous
Capital Loss: Continuous
Hours-per-week: Continuous
Native Country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US, India,
Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico,
Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary,
Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong,
Holand-Netherlands
Income (>50k or <=50k): Target variable
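For readers who want to pull the raw data directly, the following is a minimal sketch that reads the original UCI file and attaches the column names from the data dictionary above. The URL and the assumption that the raw file has no header row reflect the public Adult dataset layout; the walkthrough below instead reads a locally saved Income.csv.

import pandas as pd

# Column names corresponding to the data dictionary above
columns = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country", "income",
]

# Assumed raw-file location in the UCI repository (comma-separated, no header row)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
adult = pd.read_csv(url, names=columns, skipinitialspace=True)
print(adult.shape)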
Getting Started
The following is the process for implementing data bias detection and mitigation in Python.
Step 1: Importing the Required Libraries
[In]:
# Import necessary libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.utils import resample
from sklearn.preprocessing import LabelEncoder, StandardScaler
Step 2: Loading the Data
[In]:
# Read the dataset into a pandas DataFrame
df = pd.read_csv("Income.csv")
[In]:
# Display basic information about the dataset
df.info()
[Out]:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48842 entries, 0 to 48841
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 age 48842 non-null int64
1 workclass 48842 non-null int32
2 fnlwgt 48842 non-null int64
3 education 48842 non-null int32
4 education-num 48842 non-null int64
5 marital-status 48842 non-null int32
6 occupation 48842 non-null int32
7 relationship 48842 non-null int32
8 race 48842 non-null int32
9 sex 48842 non-null int32
10 capital-gain 48842 non-null int64
11 capital-loss 48842 non-null int64
12 hours-per-week 48842 non-null int64
13 native-country 48842 non-null int32
14 income 48842 non-null int32
dtypes: int32(9), int64(6)
memory usage: 3.9 MB
There are no null values present in the data, so we can proceed with the data preprocessing steps.
[In]:
# Define a list of categorical columns to be encoded and perform
# label encoding for each of them
categorical_columns = ['sex', 'race', 'education', 'marital-status',
                       'occupation', 'relationship', 'native-country', 'workclass', 'income']
label_encoders = {}
for column in categorical_columns:
    label_encoders[column] = LabelEncoder()
    df[column] = label_encoders[column].fit_transform(df[column])
Categorical columns can contain many distinct values. To use these values for model building, apply
dummy-variable (one-hot) encoding to columns that have more than two unique values.
[In]:
# Perform one-hot encoding for columns with more than 2 categories
get_dummies = []
label_encoding = []
for i in categorical_columns:
print('Column Name:', i, ', Unique Value Counts:', len(df[i].unique()),
', Values:', df[i].unique())
if len(df[i].unique()) > 2:
get_dummies.append(i)
else:
label_encoding.append(i)
df = pd.get_dummies(df, prefix=get_dummies, columns=get_dummies)
[Out]:
Column Name: sex, Unique Value Counts: 2, Values: [1 0]
Column Name: race, Unique Value Counts: 2, Values: [1 0]
Column Name: education, Unique Value Counts: 16, Values: [ 9 11 1 12 6
15 7 8 5 10 14 4 0 3 13 2]
Column Name: marital-status, Unique Value Counts: 7, Values: [4 2 0 3 5 1
6]
Column Name: occupation, Unique Value Counts: 15, Values: [ 1 4 6 10 8
12 3 14 5 7 13 0 11 2 9]
Column Name: relationship, Unique Value Counts: 6, Values: [1 0 5 3 4 2]
Column Name: native-country, Unique Value Counts: 42, Values: [39 5 23
19 0 26 35 33 16 9 2 11 20 30 22 31 4 1 37 7 25 36 14 32
6 8 10 13 3 24 41 29 28 34 38 12 27 40 17 21 18 15]
Column Name: workclass, Unique Value Counts: 9, Values: [7 6 4 1 2 0 5 8 3]
Column Name: income, Unique Value Counts: 2, Values: [0 1]
[In]:
# Gender distribution graph
df['sex'].value_counts().plot(kind='bar')
[Out]: (Figure 2-3: bar chart of the gender distribution)
As shown in Figure 2-3, 67% of the population is identified as male and 33% as female, which is
considered an imbalanced dataset in the context of machine learning. After comparing both gender and
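Given this imbalance, the following is a minimal sketch of one way to balance the groups before training, using the resample utility already imported above. Upsampling the minority 'sex' group (rather than downsampling or reweighting) is an illustrative choice, not the book's prescribed next step.

# Upsample the minority gender group so both groups are equally represented (illustrative only)
majority_label = df['sex'].value_counts().idxmax()
majority = df[df['sex'] == majority_label]
minority = df[df['sex'] != majority_label]

minority_upsampled = resample(
    minority,
    replace=True,               # sample with replacement
    n_samples=len(majority),    # match the majority group size
    random_state=42,            # reproducibility
)

df_balanced = pd.concat([majority, minority_upsampled])
print(df_balanced['sex'].value_counts())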