Bountiful Futures
yuda.io
ANITAB.ORG
CONTRIBUTIONS FROM: MADDIE SHANG, DAISY MAYORGA, KHADIJAH MCGILL
Balance the AI, Balance the Future
for AI Professionals
Ethics First in AI
This crash course provides a concise overview of key bias concepts and practices, tailored for AI professionals and enthusiasts with little to no background in AI ethics. Its aim is to enhance awareness and understanding of ethical principles, types of bias, and ways of detecting and mitigating bias in AI systems and technologies. This guide will help you get started on creating your own framework for addressing bias.
Transparency
What It Is: Being clear about how AI systems work and make decisions.
Why It's Important: Transparency helps everyone understand and trust AI. It's like knowing the ingredients and processes that go into our food: when we know how an AI makes decisions, we can each judge whether it is safe and fair for us.
Accountability
What It Is: Taking responsibility for what AI does.
Why It's Important: If an AI system makes a mistake, who is responsible for limiting the impact and fixing it? Just as with a food recall: if a sandwich is recalled, is the store, the chef, or the farmer responsible? What is each party's responsibility for reducing the likelihood of it happening again? Who is responsible for compensating the consumer?
Fairness
What It Is: Ensuring AI outcomes affect everyone equitably (not necessarily equally), without discrimination or bias.
Why It's Important: Fairness has no single agreed-upon definition that works in all situations. Here we consider whether AI is making predictions and affecting outcomes based on protected and sensitive attributes (e.g., race, gender identity, sexual orientation, age) where they should not matter. Most of us have a sense of what is unjust and wrong; it's important to have mechanisms for all of us to make this determination and communicate what we believe is unfair, so ML can improve and better serve more of humanity (even if we don't always agree). One simple check is sketched below.
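As a minimal sketch (the data and column names here are hypothetical, and demographic parity is only one of many fairness notions), you can compare a model's positive-prediction rates across groups defined by a protected attribute:

```python
import pandas as pd

# Hypothetical model outputs: 1 = approved, 0 = denied,
# with a protected attribute recorded for auditing only.
df = pd.DataFrame({
    "gender": ["F", "F", "M", "M", "F", "M", "F", "M"],
    "prediction": [0, 1, 1, 1, 0, 1, 1, 1],
})

# Selection rate per group: P(prediction = 1 | group).
rates = df.groupby("gender")["prediction"].mean()
print(rates)

# Demographic parity difference: the gap between the highest
# and lowest selection rates. Larger gaps warrant investigation.
print("parity gap:", rates.max() - rates.min())
```

A gap of zero does not prove fairness, but a large gap on an attribute that should not matter is a signal worth communicating and investigating.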
Privacy
What It Is: Protecting personal information that AI systems use.
Why It's Important: Privacy in AI is like keeping your emails or bank account details secure. It ensures that sensitive information, like your personal messages or financial data, isn't shared or accessed without your permission. Privacy is a fundamental and economic right (if you privately wrote an unpublished novel, it has potential value and should not be used without your permission). Better privacy also helps ensure machine learning does not use factors that may contain bias and cause biased outcomes.
Origins of bias in data and algorithms
Let's begin by understanding where bias in AI can start, and how to identify it in practice. Imagine bias in AI like a pair of glasses that only sees one color. Bias can start right from the data we feed into AI, similar to a pair of glasses being taught to recognize only red. If the AI learns only from information about certain types of people or situations, it won't understand or treat everyone equally. This can happen if the data we use is too narrow (like only seeing red), if the rules for making decisions aren't fair, or if the data itself contains prior judgements that may be unfair. For example, women were historically less likely to be hired in some fields; if a machine learning model is trained on this historical data, it will likely carry on the same bias, as the sketch below demonstrates. Since AI is now used in crucial areas like healthcare, banking, and law, it's very important that we are conscious of potential bias and ready to recognize and correct for it.
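To make the hiring example concrete, here is a small self-contained sketch using synthetic data (all feature names and numbers are invented for illustration): a model trained on historically biased hiring labels reproduces the same bias in its own predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Synthetic candidates: identical skill distributions for both groups.
gender = rng.integers(0, 2, n)           # 0 = women, 1 = men
skill = rng.normal(0, 1, n)

# Historical hiring decisions penalized women regardless of skill.
hired = (skill + 1.0 * gender + rng.normal(0, 0.5, n)) > 0.8

# Train on the biased history, including gender as a feature.
X = np.column_stack([skill, gender])
model = LogisticRegression().fit(X, hired)

# The model learns to carry the historical penalty forward.
pred = model.predict(X)
print("hire rate, women:", pred[gender == 0].mean())
print("hire rate, men:  ", pred[gender == 1].mean())
```

Even though both groups have identical skill in this synthetic data, the model hires men at a much higher rate, because that is what the historical labels taught it.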
Implications of bias in real-world applications
Here is a real example of how bias could be affecting AI today: AI trained for medicine has to contend with the fact that up to 79% of studies have only male participants (https://fortune.com/2022/06/10/world-built-for-men-women-bodies-gender-gap-health-research-medicine-care-jain-bruzek/).
A model trained on data underrepresenting non-male patients has the potential to misdiagnose and harm treatment outcomes, so it is important for AI professionals in medicine to specifically detect and correct for this potential bias.
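A first step is simply to measure representation before training. A minimal sketch, assuming a hypothetical patient dataset with a recorded sex column:

```python
import pandas as pd

# Hypothetical clinical training data; only the "sex" column matters here.
patients = pd.DataFrame({
    "sex": ["M"] * 790 + ["F"] * 210,
    "outcome": [1, 0] * 500,
})

# Share of each group in the training set.
shares = patients["sex"].value_counts(normalize=True)
print(shares)

# Flag any group far below its rough population share (~50% here).
for group, share in shares.items():
    if share < 0.35:
        print(f"warning: {group} is underrepresented ({share:.0%})")
```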
Selection bias
Occurs when certain individuals, categories, or groups are more likely to be selected based on the problem area or means of data collection. Essentially, the data used to train the machine learning model is too small, not representative enough, or too incomplete to sufficiently train the system.
Confirmation bias
Occurs when human data collectors or analysts skew their data collection methods and analysis to support a predetermined assumption, with a tendency to focus on information that confirms their preconceptions.
Out-group homogeneity bias
This is a case of not knowing what one doesn’t know. There is a tendency for people to have a better understanding of ingroup members—the group one belongs to—and to think they are more diverse than outgroup members. The result can be developers creating algorithms that are less capable of distinguishing between individuals who are not part of the majority group in the training data, leading to racial bias, misclassification and incorrect answers.
Exclusion bias
When certain individuals, categories or groups of individuals are excluded from selection either intentionally or unintentionally based on methods of data collection.
Algorithm bias
Occurs when the problem or question posed is not fully correct or specific, or when the feedback to the machine learning algorithm does not help guide the search for a solution; the result can be misleading outputs.
Prejudice bias
Occurs when stereotypes and faulty societal assumptions find their way into the algorithm’s dataset, which inevitably leads to biased results. For example, AI could return results showing that only males are doctors and all nurses are female.
Reporting bias
When certain observations are more or less likely to be reported based on the nature of that data, resulting in data sets that don't represent reality.
Measurement bias
Caused by incomplete data. This is most often an oversight or lack of preparation that results in the dataset not including the whole population that should be considered.
Stereotyping bias
This happens when an AI system—usually inadvertently—reinforces harmful stereotypes. For example, a language translation system could associate some languages with certain genders or ethnic stereotypes.
Identify Potential Sources of Bias
Approach: Examine the data carefully.
Techniques:
Goal: By understanding where bias might arise, you can work to eliminate it.
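As an illustration of this step (a sketch only; the dataset file, column names, and groups are hypothetical), a data audit might inspect group sizes, missing values per group, and historical label rates per group:

```python
import pandas as pd

df = pd.read_csv("loan_applications.csv")  # hypothetical dataset

# 1. How many examples does each group contribute?
print(df["race"].value_counts())

# 2. Is data missing more often for some groups?
print(df.assign(missing=df["income"].isna())
        .groupby("race")["missing"].mean())

# 3. Do historical labels already differ sharply by group?
print(df.groupby("race")["approved"].mean())
```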
Set Guidelines and Rules for Eliminating Bias
Approach: Establish clear organizational guidelines.
Techniques:
Goal: Ensuring a consistent, transparent approach to handling bias.
Identify Accurate Representative Data
Approach: Understand the population to be modeled.
Techniques:
Goal: Create a data set that accurately represents the diversity of the target population.
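One concrete technique (a sketch; the group shares below are invented) is to compare the demographics of the training sample against known reference proportions for the target population, such as census figures:

```python
import pandas as pd

# Group shares observed in the training sample (hypothetical).
sample = pd.Series({"18-34": 0.55, "35-54": 0.35, "55+": 0.10})

# Reference shares for the population to be modeled, e.g. from a census.
population = pd.Series({"18-34": 0.30, "35-54": 0.35, "55+": 0.35})

gap = sample - population
print(gap)

# Flag groups underrepresented by more than 10 percentage points.
print("underrepresented:", list(gap[gap < -0.10].index))
```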
Document and Share Data Selection and Cleansing Methods
Approach: Maintain transparency in data handling.
Techniques:
Goal: Prevent bias during data selection and cleansing stages.
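A lightweight way to do this (a sketch, not a standard format) is to record every selection and cleansing decision in a machine-readable log that is shared alongside the dataset:

```python
import json
from datetime import datetime, timezone

# A simple "datasheet" entry for each data-handling step.
log = []

def record(step, rows_before, rows_after, rationale):
    log.append({
        "step": step,
        "rows_before": rows_before,
        "rows_after": rows_after,
        "rationale": rationale,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

record("drop_missing_income", 10000, 9400,
       "income required by model; checked drop rate per group first")
record("filter_dates", 9400, 8800,
       "kept 2019-2023 records to match the deployment population")

# Share alongside the dataset so reviewers can audit each decision.
with open("data_cleansing_log.json", "w") as f:
    json.dump(log, f, indent=2)
```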
Screen Models for Bias as Well as Performance
Approach: Include bias detection in model evaluations.
Techniques:
Goal: Ensure models perform fairly across different groups and scenarios.
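For instance (a sketch with hypothetical toy arrays), evaluations can report accuracy and true-positive rate per group alongside overall accuracy, so a fairness gap is as visible as a performance drop:

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical held-out labels, predictions, and group membership.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 1])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

print("overall accuracy:", accuracy_score(y_true, y_pred))

# Per-group accuracy and true-positive rate (an equal-opportunity check).
for g in np.unique(group):
    m = group == g
    print(g,
          "acc:", accuracy_score(y_true[m], y_pred[m]),
          "tpr:", recall_score(y_true[m], y_pred[m]))
```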
Monitor and Review Models in Operation
Approach: Continuous monitoring post-deployment.
Techniques:
Look for signs of bias during operation.
Goal: Quickly address any biases that become evident after deployment.
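As a sketch (the group names, rates, and alert threshold are illustrative), a production check might compare live per-group selection rates against the rates measured at launch and raise an alert when the gap widens:

```python
import pandas as pd

# Selection rates per group measured at deployment time (baseline).
baseline = {"A": 0.41, "B": 0.39}

def check_drift(live_df, threshold=0.05):
    """Alert if any group's live selection rate drifts from baseline."""
    live = live_df.groupby("group")["prediction"].mean()
    for g, base_rate in baseline.items():
        drift = abs(live.get(g, 0.0) - base_rate)
        if drift > threshold:
            print(f"ALERT: group {g} selection rate drifted by {drift:.2f}")

# Example: a week of live traffic (hypothetical).
week = pd.DataFrame({
    "group": ["A"] * 100 + ["B"] * 100,
    "prediction": [1] * 45 + [0] * 55 + [1] * 25 + [0] * 75,
})
check_drift(week)
```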
The Gender Shades research project uncovered significant bias in commercial gender classification systems, leading to changes in industry practices.
Part of Google's TensorFlow, the What-If Tool allows users to analyze machine learning models without writing code and to visualize the model's decision-making process. It's useful for investigating model performances and biases.
AI Fairness 360 (AIF360) is an open-source toolkit by IBM that helps detect and mitigate bias in machine learning models. It includes a comprehensive set of metrics for datasets and models to test for bias, and algorithms to mitigate it.
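A minimal usage sketch (the data, column names, and group definitions here are hypothetical, and the toolkit offers many more metrics and mitigation algorithms than shown):

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Hypothetical data: a binary label plus a protected attribute.
df = pd.DataFrame({
    "sex":   [0, 0, 0, 1, 1, 1, 1, 1],
    "score": [0.2, 0.5, 0.4, 0.7, 0.8, 0.6, 0.9, 0.3],
    "label": [0, 0, 1, 1, 1, 0, 1, 1],
})

dataset = BinaryLabelDataset(
    df=df, label_names=["label"],
    protected_attribute_names=["sex"],
    favorable_label=1, unfavorable_label=0,
)
unpriv, priv = [{"sex": 0}], [{"sex": 1}]

# Measure bias in the raw data.
metric = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=unpriv, privileged_groups=priv)
print("disparate impact:", metric.disparate_impact())

# Mitigate by reweighing examples before training.
reweighed = Reweighing(
    unprivileged_groups=unpriv, privileged_groups=priv
).fit_transform(dataset)
```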
Tenants in the Atlantic Plaza Towers apartment complex in New York's Brownsville neighborhood fought to prevent their landlord, Nelson Management Group, from installing facial recognition technology to open the front doors of their buildings, calling it an intrusion on their privacy.