Bountiful Futures
yuda.io
ANITAB.ORG
CONTRIBUTIONS FROM: MADDIE SHANG, DAISY MAYORGA, KHADIJAH MCGILL
Balance the AI, Balance the Future
for AI Professionals
Ethics First in AI
This crash course provides a concise overview of key bias concepts and practices, tailored for AI professionals and enthusiasts with little to no background in AI ethics. Its aim is to enhance awareness and understanding of ethical principles, types of bias, and ways of detecting and mitigating bias in AI systems and technologies. This guide will help you get started on creating your own framework for addressing bias.
Transparency
What It Is: Being clear about how AI systems work and make decisions.
Why It's Important: Transparency helps everyone understand and trust AI. It's like knowing the ingredients and processes that go into our food: when we know how an AI makes decisions, we can each judge whether it is safe and fair for us.
Accountability
What It Is: Taking responsibility for what AI does.
Why It's Important: If an AI system makes a mistake, who is responsible for limiting the impact and fixing it? Just as with a food recall: if a sandwich is recalled, is the store, the chef, or the farmer responsible? What is each party's responsibility for reducing the likelihood of it happening again? Who is responsible for compensating the consumer?
Fairness
What It Is: Ensuring AI outcomes affect everyone equitably (not necessarily equally), without discrimination or bias.
Why It's Important: Fairness has no single agreed-upon definition that works in all situations. Here we consider whether AI is making predictions and affecting outcomes based on protected and sensitive attributes (e.g., race, gender identity, sexual orientation, age) where they should not matter. Most of us have a sense of what is unjust and wrong; it's important to have mechanisms for all of us to make this determination and communicate what we believe is unfair, so ML can improve and better serve more of humanity (even if we don't always agree). One simple check is sketched below.
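As a minimal sketch (the data and column names here are hypothetical, and demographic parity is only one of many fairness notions), you can compare a model's positive-prediction rates across groups defined by a protected attribute:

```python
import pandas as pd

# Hypothetical model outputs: 1 = approved, 0 = denied,
# with a protected attribute recorded for auditing only.
df = pd.DataFrame({
    "gender": ["F", "F", "M", "M", "F", "M", "F", "M"],
    "prediction": [0, 1, 1, 1, 0, 1, 1, 1],
})

# Selection rate per group: P(prediction = 1 | group).
rates = df.groupby("gender")["prediction"].mean()
print(rates)

# Demographic parity difference: the gap between the highest
# and lowest selection rates. Larger gaps warrant investigation.
print("parity gap:", rates.max() - rates.min())
```

A gap of zero does not prove fairness, but a large gap on an attribute that should not matter is a signal worth communicating and investigating.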
Privacy
What It Is: Protecting personal information that AI systems use.
Why It's Important: Privacy in AI is like keeping your emails or bank account details secure. It ensures that sensitive information, like your personal messages or financial data, isn't shared or accessed without your permission. Privacy is a fundamental and economic right (if you privately wrote an unpublished novel, it has potential value and should not be used without your permission). Better privacy also helps ensure machine learning does not use factors that may contain bias and cause biased outcomes.
Origins of bias in data and algorithms
Let's begin by understanding where bias in AI can start, and how to identify it in practice. Imagine bias in AI like a pair of glasses that only sees one color. Bias can start right from the data we feed into AI, similar to a pair of glasses being taught to recognize only red. If the AI learns only from information about certain types of people or situations, it won't understand or treat everyone equally. This can happen if the data we use is too narrow (like only seeing red), if the rules for making decisions aren't fair, or if the data itself contains prior judgements that may be unfair. For example, women were historically less likely to be hired in some fields; if a machine learning model is trained on this historical data, it will likely carry on the same bias, as the sketch below demonstrates. Since AI is now used in crucial areas like healthcare, banking, and law, it's very important that we are conscious of potential bias and ready to recognize and correct for it.
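To make the hiring example concrete, here is a small self-contained sketch using synthetic data (all feature names and numbers are invented for illustration): a model trained on historically biased hiring labels reproduces the same bias in its own predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Synthetic candidates: identical skill distributions for both groups.
gender = rng.integers(0, 2, n)           # 0 = women, 1 = men
skill = rng.normal(0, 1, n)

# Historical hiring decisions penalized women regardless of skill.
hired = (skill + 1.0 * gender + rng.normal(0, 0.5, n)) > 0.8

# Train on the biased history, including gender as a feature.
X = np.column_stack([skill, gender])
model = LogisticRegression().fit(X, hired)

# The model learns to carry the historical penalty forward.
pred = model.predict(X)
print("hire rate, women:", pred[gender == 0].mean())
print("hire rate, men:  ", pred[gender == 1].mean())
```

Even though both groups have identical skill in this synthetic data, the model hires men at a much higher rate, because that is what the historical labels taught it.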
Implications of bias in real-world applications
Here is a real example of how bias could be affecting AI today: AI trained for medicine has to contend with the fact that up to 79% of studies have only male participants (https://fortune.com/2022/06/10/world-built-for-men-women-bodies-gender-gap-health-research-medicine-care-jain-bruzek/).
A model trained on data underrepresenting non-male patients has the potential to misdiagnose and harm treatment outcomes, so it is important for AI professionals in medicine to specifically detect and correct for this potential bias.
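A first step is simply to measure representation before training. A minimal sketch, assuming a hypothetical patient dataset with a recorded sex column:

```python
import pandas as pd

# Hypothetical clinical training data; only the "sex" column matters here.
patients = pd.DataFrame({
    "sex": ["M"] * 790 + ["F"] * 210,
    "outcome": [1, 0] * 500,
})

# Share of each group in the training set.
shares = patients["sex"].value_counts(normalize=True)
print(shares)

# Flag any group far below its rough population share (~50% here).
for group, share in shares.items():
    if share < 0.35:
        print(f"warning: {group} is underrepresented ({share:.0%})")
```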
Selection bias
Occurs when certain individuals, categories, or groups are more likely to be selected based on the problem area or means of data collection. Essentially, the data used to train the machine learning model is too small, not representative enough, or too incomplete to sufficiently train the system.
Confirmation bias
Occurs when human data collectors or analysts skew their data collection methods and analysis to support a predetermined assumption, with a tendency to focus on information that confirms their preconceptions.
Out-group homogeneity bias
This is a case of not knowing what one doesn’t know. There is a tendency for people to have a better understanding of ingroup members—the group one belongs to—and to think they are more diverse than outgroup members. The result can be developers creating algorithms that are less capable of distinguishing between individuals who are not part of the majority group in the training data, leading to racial bias, misclassification and incorrect answers.
Exclusion bias
When certain individuals, categories or groups of individuals are excluded from selection either intentionally or unintentionally based on methods of data collection.
Algorithm bias
Occurs when the problem or question posed is not fully correct or specific, or when the feedback to the machine learning algorithm does not help guide the search for a solution; the result can be misleading outputs.
Prejudice bias
Occurs when stereotypes and faulty societal assumptions find their way into the algorithm’s dataset, which inevitably leads to biased results. For example, AI could return results showing that only males are doctors and all nurses are female.
Reporting bias
When certain observations are more or less likely to be reported based on the nature of that data, resulting in data sets that don't represent reality.
Measurement bias
Caused by incomplete data. This is most often an oversight or lack of preparation that results in the dataset not including the whole population that should be considered.
Stereotyping bias
This happens when an AI system—usually inadvertently—reinforces harmful stereotypes. For example, a language translation system could associate some languages with certain genders or ethnic stereotypes.
Identify Potential Sources of Bias
Approach: Examine the data carefully.
Techniques:
Goal: By understanding where bias might arise, you can work to eliminate it.
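As an illustration of this step (a sketch only; the dataset file, column names, and groups are hypothetical), a data audit might inspect group sizes, missing values per group, and historical label rates per group:

```python
import pandas as pd

df = pd.read_csv("loan_applications.csv")  # hypothetical dataset

# 1. How many examples does each group contribute?
print(df["race"].value_counts())

# 2. Is data missing more often for some groups?
print(df.assign(missing=df["income"].isna())
        .groupby("race")["missing"].mean())

# 3. Do historical labels already differ sharply by group?
print(df.groupby("race")["approved"].mean())
```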
Set Guidelines and Rules for Eliminating Bias
Approach: Establish clear organizational guidelines.
Techniques:
Goal: Ensuring a consistent, transparent approach to handling bias.
Identify Accurate Representative Data
Approach: Understand the population to be modeled.
Techniques:
Goal: Create a data set that accurately represents the diversity of the target population.
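One concrete technique (a sketch; the group shares below are invented) is to compare the demographics of the training sample against known reference proportions for the target population, such as census figures:

```python
import pandas as pd

# Group shares observed in the training sample (hypothetical).
sample = pd.Series({"18-34": 0.55, "35-54": 0.35, "55+": 0.10})

# Reference shares for the population to be modeled, e.g. from a census.
population = pd.Series({"18-34": 0.30, "35-54": 0.35, "55+": 0.35})

gap = sample - population
print(gap)

# Flag groups underrepresented by more than 10 percentage points.
print("underrepresented:", list(gap[gap < -0.10].index))
```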
Document and Share Data Selection and Cleansing Methods
Approach: Maintain transparency in data handling.
Techniques:
Goal: Prevent bias during data selection and cleansing stages.
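A lightweight way to do this (a sketch, not a standard format) is to record every selection and cleansing decision in a machine-readable log that is shared alongside the dataset:

```python
import json
from datetime import datetime, timezone

# A simple "datasheet" entry for each data-handling step.
log = []

def record(step, rows_before, rows_after, rationale):
    log.append({
        "step": step,
        "rows_before": rows_before,
        "rows_after": rows_after,
        "rationale": rationale,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

record("drop_missing_income", 10000, 9400,
       "income required by model; checked drop rate per group first")
record("filter_dates", 9400, 8800,
       "kept 2019-2023 records to match the deployment population")

# Share alongside the dataset so reviewers can audit each decision.
with open("data_cleansing_log.json", "w") as f:
    json.dump(log, f, indent=2)
```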
Screen Models for Bias as Well as Performance
Approach: Include bias detection in model evaluations.
Techniques:
Goal: Ensure models perform fairly across different groups and scenarios.
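For instance (a sketch with hypothetical toy arrays), evaluations can report accuracy and true-positive rate per group alongside overall accuracy, so a fairness gap is as visible as a performance drop:

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical held-out labels, predictions, and group membership.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 1])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

print("overall accuracy:", accuracy_score(y_true, y_pred))

# Per-group accuracy and true-positive rate (an equal-opportunity check).
for g in np.unique(group):
    m = group == g
    print(g,
          "acc:", accuracy_score(y_true[m], y_pred[m]),
          "tpr:", recall_score(y_true[m], y_pred[m]))
```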
Monitor and Review Models in Operation
Approach: Continuous monitoring post-deployment.
Techniques:
Look for signs of bias during operation.
Goal: Quickly address any biases that become evident after deployment.
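As a sketch (the group names, rates, and alert threshold are illustrative), a production check might compare live per-group selection rates against the rates measured at launch and raise an alert when the gap widens:

```python
import pandas as pd

# Selection rates per group measured at deployment time (baseline).
baseline = {"A": 0.41, "B": 0.39}

def check_drift(live_df, threshold=0.05):
    """Alert if any group's live selection rate drifts from baseline."""
    live = live_df.groupby("group")["prediction"].mean()
    for g, base_rate in baseline.items():
        drift = abs(live.get(g, 0.0) - base_rate)
        if drift > threshold:
            print(f"ALERT: group {g} selection rate drifted by {drift:.2f}")

# Example: a week of live traffic (hypothetical).
week = pd.DataFrame({
    "group": ["A"] * 100 + ["B"] * 100,
    "prediction": [1] * 45 + [0] * 55 + [1] * 25 + [0] * 75,
})
check_drift(week)
```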
The Gender Shades research project uncovered significant bias in commercial gender classification systems, leading to changes in industry practices.
Part of Google's TensorFlow, the What-If Tool allows users to analyze machine learning models without writing code and to visualize the model's decision-making process. It's useful for investigating model performances and biases.
AI Fairness 360 (AIF360) is an open-source toolkit by IBM that helps detect and mitigate bias in machine learning models. It includes a comprehensive set of metrics for datasets and models to test for bias, and algorithms to mitigate it.
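A minimal usage sketch (the data, column names, and group definitions here are hypothetical, and the toolkit offers many more metrics and mitigation algorithms than shown):

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Hypothetical data: a binary label plus a protected attribute.
df = pd.DataFrame({
    "sex":   [0, 0, 0, 1, 1, 1, 1, 1],
    "score": [0.2, 0.5, 0.4, 0.7, 0.8, 0.6, 0.9, 0.3],
    "label": [0, 0, 1, 1, 1, 0, 1, 1],
})

dataset = BinaryLabelDataset(
    df=df, label_names=["label"],
    protected_attribute_names=["sex"],
    favorable_label=1, unfavorable_label=0,
)
unpriv, priv = [{"sex": 0}], [{"sex": 1}]

# Measure bias in the raw data.
metric = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=unpriv, privileged_groups=priv)
print("disparate impact:", metric.disparate_impact())

# Mitigate by reweighing examples before training.
reweighed = Reweighing(
    unprivileged_groups=unpriv, privileged_groups=priv
).fit_transform(dataset)
```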
Tenants in the Atlantic Plaza Towers apartment complex in New York's Brownsville neighborhood fought to prevent their landlord, Nelson Management Group, from installing facial recognition technology to open the front doors of their buildings, calling it an intrusion on their privacy.