Bountiful Futures
Secure AI, secure future.
for AI Professionals
Securing the CIA
1. Introduction
This crash course is designed to provide a concise overview of key security concepts and practices, tailored for AI professionals and enthusiasts with little to no background in security. Its aim is to enhance awareness and understanding of potential vulnerabilities, attacks, and protective measures in the context of AI systems and technologies. Through this guide, you'll gain insights into common threats, best practices for safeguarding systems and handling incident response, as well as general strategies for thinking about AI application security, all aimed at fostering a more secure and resilient digital environment.
2. Core Security Principles
Confidentiality
Restrict access to sensitive information to authorized users only. Example failure: private information is transferred using an outdated encryption scheme, allowing anybody who sees the traffic to decrypt the message.
Integrity
Ensure data and models are accurate and unaltered. Example failure: insufficient access control allows anybody to upload and replace a machine-learning model on a model-sharing website.
Availability
Maintain reliable access to AI systems and data for legitimate users. Example failure: an inefficient regular expression allows an attacker to craft inputs that reliably exhaust server CPU or memory and crash the service, denying access to legitimate users (a code sketch follows this section).
Least Privilege
Grant the minimum necessary access to each user or component.
Defense in Depth
Layer multiple security measures (e.g., encryption, access controls).
Identify Risks
Recognize potential security threats to AI systems.
Assess Risks
Evaluate the likelihood and impact of threats.
Mitigate Risks
Implement measures to reduce threat likelihood and impact.
Monitor and Review
Continuously check and adjust security measures.
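To make the availability failure mentioned earlier concrete, here is a minimal Python sketch (patterns and limits are illustrative) contrasting a backtracking-prone regular expression with a safer, bounded check.

```python
import re

# Vulnerable: the nested quantifier in (a+)+ triggers catastrophic
# backtracking. An input like "a" * 40 + "!" can pin a CPU core for
# hours, starving legitimate requests.
VULNERABLE = re.compile(r"^(a+)+$")

# Safer: an equivalent pattern without nested quantifiers, plus an
# explicit length cap so attacker-controlled input cannot force
# unbounded work.
SAFE = re.compile(r"^a+$")
MAX_INPUT_LEN = 1_000

def is_valid(user_input: str) -> bool:
    if len(user_input) > MAX_INPUT_LEN:
        return False
    return SAFE.fullmatch(user_input) is not None

print(is_valid("aaaa"))   # True
print(is_valid("aaa!"))   # False, and it returns immediately
```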
3. Data, Models, Infrastructure
Data
Includes training datasets, validation datasets, and user data. Security concerns revolve around unauthorized access, data tampering, and privacy breaches.
Models
Encompasses algorithms and trained models. Vulnerabilities could include model theft, adversarial attacks, and unintended model biases.
Infrastructure
Consists of hardware and software environments where AI operates, including servers, storage, and networking. Risks involve system intrusions, resource hijacking, and service disruptions.
4. Common Vulnerabilities and Attacks
4.1 The OWASP Top 10
1. Broken Access Control: Failure to properly restrict what users are allowed to access or do.
2. Cryptographic Failures: Failures related to cryptography, often leading to sensitive data exposure or system compromise.
3. Injection: Including SQL, NoSQL, and command injection, where untrusted data is sent as part of a command or query (see the sketch after this list).
4. Insecure Design: Risks associated with design flaws rather than implementation bugs.
5. Security Misconfiguration: Poorly configured security settings or incomplete setups.
6. Vulnerable and Outdated Components: Use of outdated or vulnerable components.
7. Identification and Authentication Failures: Flaws in session management, authentication, and identification.
8. Software and Data Integrity Failures: Insecure software updates, critical data tampering, and lack of validation.
9. Security Logging and Monitoring Failures: Insufficient logging, ineffective integration with incident response.
10. Server-Side Request Forgery (SSRF): By manipulating server-side requests, attackers can read or modify internal resources.
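As an illustration of the injection risk above, here is a minimal sketch using Python's built-in sqlite3 module (the table and data are hypothetical); a parameterized query keeps untrusted input in the data channel rather than the command channel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

user_supplied = "alice' OR '1'='1"  # a classic injection payload

# Vulnerable: string interpolation lets the payload rewrite the query.
#   f"SELECT * FROM users WHERE name = '{user_supplied}'"

# Safer: a parameterized query treats the payload purely as data.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_supplied,)
).fetchall()
print(rows)  # [] -- the payload matches no user instead of matching all of them
```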
4.2 AI-Specific Attacks
Data Poisoning
Introducing harmful data into training sets, leading to biased or flawed outputs.
Model Inversion Attacks
Techniques to extract sensitive training data from models.
Adversarial Attacks
Inputs crafted to deceive AI models into making incorrect decisions (a sketch follows this list).
LLM Prompt Injection
Manipulating prompts to Large Language Models to extract unintended information or trigger specific responses.
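To illustrate the adversarial-attack idea, here is a minimal, hypothetical sketch of an FGSM-style perturbation against a toy linear classifier. Real attacks compute gradients through a trained neural network, but the principle is the same: small, targeted input changes that flip the output.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=16)   # toy "trained" weights
b = 0.0
x = rng.normal(size=16)   # a legitimate input

def predict(features: np.ndarray) -> int:
    return int(w @ features + b > 0)

# FGSM-style step: nudge every feature in the direction that pushes the
# score toward the opposite class, bounded by epsilon per feature.
epsilon = 1.0
direction = -np.sign(w) if predict(x) == 1 else np.sign(w)
x_adv = x + epsilon * direction

print("original prediction:   ", predict(x))
print("adversarial prediction:", predict(x_adv))  # flips for large enough epsilon
```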
5. Security Best Practices
5.1 Secure Coding Practices
Input Validation
Ensure all inputs are validated and/or sanitized to prevent injection and path-traversal attacks (see the sketch at the end of this subsection).
Error Handling
Implement secure error handling that does not expose sensitive information.
Code Reviews and Auditing
Regularly review and audit code for potential security vulnerabilities.
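As a concrete input-validation sketch (the upload directory and filename rules are hypothetical), the code below allowlists acceptable filenames and confirms that the resolved path cannot escape the intended directory:

```python
from pathlib import Path

UPLOAD_DIR = Path("/srv/uploads")  # hypothetical storage location

def open_upload(filename: str):
    # Allowlist what we accept rather than trying to blocklist every
    # dangerous sequence.
    bad = (
        not filename.isascii()
        or "/" in filename
        or "\\" in filename
        or filename.startswith(".")
    )
    if bad:
        raise ValueError("invalid filename")
    target = (UPLOAD_DIR / filename).resolve()
    # Verify the resolved path is still inside the upload directory,
    # defeating traversal payloads such as "../../etc/passwd".
    if UPLOAD_DIR.resolve() not in target.parents:
        raise ValueError("path escapes upload directory")
    return target.open("rb")
```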
5.2 Data Protection and Privacy
Access Controls
Implement stringent access controls to restrict data access to authorized personnel only.
Encryption
Encrypt sensitive data, both in transit and at rest (see the sketches after this subsection).
Anonymization
Whenever possible, use anonymized data to reduce privacy risks.
Privacy-Preserving Machine Learning
Where applicable, consider using techniques to protect user privacy such as differential privacy or homomorphic encryption.
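Two brief sketches for the practices above. First, encryption at rest using the third-party cryptography package (assumed installed; in production the key would live in a secrets manager, never in code):

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # store in a secrets manager, never in source code
cipher = Fernet(key)

token = cipher.encrypt(b"user_id=42,email=alice@example.com")
print(cipher.decrypt(token))  # original bytes come back only with the key
```

Second, a toy illustration of differential privacy: a count query answered with Laplace noise calibrated to the query's sensitivity (the function name, data, and epsilon value are illustrative):

```python
import numpy as np

def dp_count(values, predicate, epsilon=1.0):
    """Differentially private count of records satisfying a predicate.
    A count has sensitivity 1, so Laplace noise with scale 1/epsilon
    masks any single individual's contribution."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

ages = [23, 35, 41, 29, 52, 61, 19]
print(dp_count(ages, lambda a: a >= 40))  # noisy answer near the true count of 3
```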
5.3 Model Hardening Techniques
Regular Updates
Keep AI models updated to protect against newly discovered vulnerabilities.
Robustness Testing
Regularly test models against adversarial attacks and other manipulation techniques.
Monitoring and Anomaly Detection
Continuously monitor model performance for signs of tampering or unusual activity.
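As a minimal sketch of the monitoring idea (the thresholds and data are illustrative), the check below flags a run of predictions whose confidence distribution drifts sharply from the historical baseline, a possible sign of poisoned inputs, a swapped model, or abuse:

```python
import numpy as np

def confidence_drift_alert(baseline, recent, threshold=3.0):
    """Alert when recent prediction confidences drift far from the baseline."""
    z = abs(np.mean(recent) - np.mean(baseline)) / (np.std(baseline) + 1e-9)
    return z > threshold

baseline = np.random.default_rng(0).uniform(0.70, 0.95, size=1_000)
recent = np.random.default_rng(1).uniform(0.20, 0.50, size=50)
print(confidence_drift_alert(baseline, recent))  # True: confidences collapsed
```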
6. Security Testing and Incident Response
Automated Code Review
Use automated tools to scan code for common vulnerabilities and coding errors (see the sketch below).
Simulated Cyber Attacks
Employ penetration testing techniques to uncover potential security weaknesses in your system. Options include external auditing/penetration testing companies, as well as setting up a bug bounty scheme.
Stay Updated
Regularly update software and systems with the latest security patches and updates.
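As one way to automate the code-review step, the snippet below shells out to Bandit, a widely used open-source static analyzer for Python security issues (assumes `pip install bandit`; the `src/` path is hypothetical):

```python
import json
import subprocess

# Scan the project for common Python security issues and print a summary.
result = subprocess.run(
    ["bandit", "-r", "src/", "-f", "json"],
    capture_output=True, text=True,
)
report = json.loads(result.stdout)
for issue in report.get("results", []):
    print(issue["issue_severity"], issue["filename"], issue["issue_text"])
```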
1. Immediate Isolation
Quickly isolate affected systems to prevent further spread during an incident.
2. Assess and Report
Evaluate the impact and report the incident following your organization's protocol.
3. Remediation
Implement fixes, which may involve patching software or updating access controls.
4. Learn and Adapt
Analyze the incident, learn from it, and update your security strategies to prevent future occurrences.
7. Case Studies
7.1 The Ashley Madison Breach (2015)
In 2015, there was a significant data breach targeting the users of the Ashley Madison website, which facilitated extramarital affairs. Hackers stole and leaked personal information of users, including names, email addresses, and credit card details. This breach had profound privacy implications, leading to public shaming and personal crises for exposed users. This event highlights the critical importance of securing sensitive personal and financial information in online platforms, as well as the relevance of general web security to data storage and model deployment.
7.2 Spam Filter Data Poisoning
Data poisoning is one method spammers use to bypass spam-filter protection. Spammers use consumer email clients such as Gmail to mark large volumes of spam as “not spam”, in an attempt to influence the classification model that implements the spam filter. This attack highlights the importance of paying close attention to your data, especially when it comes from untrusted sources. For a dynamic system like a spam filter, anomaly detection is crucial for preventing abuse.
7.3 Proofpoint Model Inversion (CVE-2019-20634)
In 2019, a significant vulnerability was identified in Proofpoint Email Protection and officially assigned the first ML-specific CVE number (CVE-2019-20634). The vulnerability allowed attackers to build a copycat classification model by collecting the scores exposed in Proofpoint email headers. Using insights from this model, attackers could craft emails that received favorable scores, facilitating the delivery of malicious email. This case highlights the danger of model inversion when exposing rich output data, and demonstrates how inverted models can be used to manipulate ML services.
7.4 Adversarial Attacks on Autonomous Driving
In 2019, researchers found that simple modifications to the physical environment—drawing black lines on the road—could cause end-to-end autonomous driving models to exhibit dangerous behavior such as failing to make turns. The study underscores the importance of enhancing the robustness of these systems against unexpected input data, including potential adversarial attacks.
7.5 Prompt Injection in a Dealership Chatbot
In 2023, users noticed that a car dealership’s support chat was powered by ChatGPT. Various mischievous prompts were submitted, and the chatbot was successfully convinced to offer “legally binding” $1 deals on cars, as well as unrelated support on programming tasks (allowing users to essentially steal LLM compute from the dealership). This incident highlights the dangers of using general-purpose models for specific tasks, as well as the importance of protecting services against prompt injection attacks.