What counts as sensitive data

Time: 20–25 min
Type: exercise
Bloom: Apply → Create
XP: 100

Concept architecture: What counts as sensitive data (Lesson 3.1)

You'll be able to

Recognize six categories of sensitive data (personal details, financial records, health information, confidential business secrets, security credentials, and legal documents) and explain why each category triggers legal or ethical red lines in AI application contexts[^3].
Decide whether a given dataset or prompt input contains sensitive information that requires cleaning before entering an AI training pipeline or runtime system, applying the principle of least privilege (giving the system only the minimum data it needs)[^3].
Apply the NIST AI Risk Management Framework's MAP function (a step that identifies and documents potential harms) to describe the likelihood and magnitude of harmful impacts from sensitive information disclosure, using concrete workplace scenarios[^5].
Explain in plain professional language how processing personal data covered by regulations such as GDPR (the European Union's General Data Protection Regulation) exposes AI systems to data protection, privacy, and security risks, and communicate these risks to colleagues, including senior executives[^2][^6].
Create a risk-based data sanitization plan that integrates strict input validation, access controls, and transparency policies, ensuring the plan is outcome-focused and adaptable to different sectors without prescribing one-size-fits-all requirements[^2][^3].
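
The screening and least-privilege objectives above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the pattern names, the `sk-` key format, and the helper functions `find_sensitive`, `redact`, and `least_privilege` are all assumptions made for this example, and three regular expressions cover only a small slice of real sensitive data.

```python
import re

# Hypothetical patterns for illustration only; real screening needs far
# broader coverage (names, addresses, health terms, account numbers, ...).
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Assumed credential format: "sk-" followed by 16+ letters or digits.
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}


def find_sensitive(text):
    """Return the sorted names of every pattern that matches the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def redact(text):
    """Replace each match with a placeholder such as [EMAIL]."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}]", text)
    return text


def least_privilege(record, allowed_fields):
    """Keep only the fields the downstream system actually needs."""
    return {key: value for key, value in record.items() if key in allowed_fields}


prompt = "Contact jane.doe@example.com, SSN 123-45-6789, about the refund."
print(find_sensitive(prompt))  # ['email', 'us_ssn']
print(redact(prompt))          # Contact [EMAIL], SSN [US_SSN], about the refund.

intake = {"symptoms": "cough", "insurance_provider": "Acme Health", "ssn": "123-45-6789"}
print(least_privilege(intake, {"symptoms"}))  # {'symptoms': 'cough'}
```

Pattern matching of this kind catches only well-formatted identifiers, so it works best as one layer alongside access controls and human review rather than as the whole sanitization plan.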

Hook

What turns a helpful chatbot into a compliance incident?

Prompt Lab

Your task: Write a prompt that asks Claude to recommend the right AI setup for a real task you're facing, then weigh its answer against this lesson, "What counts as sensitive data."

A strong prompt: role · context · task · format · example

Diagram (generated brief): a hierarchical taxonomy flowchart showing sensitive data categories and their regulatory frameworks. At the top level, five main branches: PII (Personally Identifiable Information), PHI (Protected Health Information), Financial Data, Int…

Exercise · scenario

A regional hospital network is deploying an AI-powered triage chatbot that collects patient symptoms, medication lists, and insurance provider names during initial intake. The system logs all conversations for quality improvement. The compliance officer asks whether this data qualifies as sensitive under federal regulations. The chatbot operates in the United States and serves approximately 50,000 patients annually across three states.

Deliverable

You will produce a **Sensitive Data Classification Matrix** as a Markdown document that catalogs at least ten real-world data types encountered in AI workflows, classifies each by sensitivity tier (public, internal, confidential, restricted), and maps each to a concrete mitigation control drawn from authoritative frameworks.
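
One possible way to draft the matrix is to hold each entry as structured data and render the Markdown table from it. The sketch below is only a starting point: the `MatrixRow` class, the `to_markdown` helper, and the three sample rows with their controls are illustrative assumptions, not entries taken from any framework. Your own matrix needs at least ten data types, each with a control you can trace to a source.

```python
from dataclasses import dataclass

# The four sensitivity tiers named in the deliverable, least to most sensitive.
TIERS = ("public", "internal", "confidential", "restricted")


@dataclass(frozen=True)
class MatrixRow:
    data_type: str
    tier: str
    control: str

    def __post_init__(self):
        # Reject tiers outside the agreed scale so typos surface early.
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")


def to_markdown(rows):
    """Render rows as a Markdown table, most sensitive tier first."""
    ordered = sorted(rows, key=lambda row: TIERS.index(row.tier), reverse=True)
    lines = [
        "| Data type | Sensitivity tier | Mitigation control |",
        "| --- | --- | --- |",
    ]
    lines += [f"| {r.data_type} | {r.tier} | {r.control} |" for r in ordered]
    return "\n".join(lines)


# Sample rows are placeholders for illustration only.
rows = [
    MatrixRow("Published press release", "public", "No special handling"),
    MatrixRow("Patient medication list", "restricted", "Strict access controls; redact before logging"),
    MatrixRow("Internal meeting notes", "internal", "Limit access to staff accounts"),
]
print(to_markdown(rows))
```

Sorting by tier puts the restricted rows at the top of the table, which makes it easier for a reviewer to check the highest-risk entries first.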

Model answer

Sensitive data under HIPAA: the intake conversations contain PHI (protected health information), which requires specific safeguards.

Practice · Scenarios

Scenario 1 of 8

An e-commerce platform operating across the European Union deploys a recommendation engine that processes customers' purchase histories, browsing behavior, IP addresses, and device fingerprints. The system also infers users' religious affiliations based on purchases of religious texts and holiday-specific items to personalize marketing campaigns. The data science team maintains that this inference improves conversion rates by 18% and enhances user experience.

Step 1 · Classify

Not sensitive: behavioral data and commercial preferences are standard analytics
Sensitive under GDPR Article 9: special category data requiring explicit consent
Sensitive only if users manually enter religious information
Confidential business intelligence but not personal sensitive data

Common misconceptions

“If data is already public somewhere on the internet, it is not sensitive and can be freely used in AI training or outputs”
Personal data remains subject to regulations such as GDPR even when it appears in publicly accessible sources. Voice interactions and other user-generated content can also be personal data, and processing them can expose users to data protection, privacy, and security risks. Legal obligations to protect data do not disappear simply because the data was once visible online.

Quiz · adaptive · 3 items

Mastery check

Match each term to its definition. Pass at 80% to earn the lesson's XP and unlock the next.

Sources

[1] OWASP Top 10 for LLM Applications (2025) · Vendor
[2] NIST AI Risk Management Framework 1.0 > Function: MANAGE > Category: MANAGE 4 Risk > MANAGE 4.3: Incidents and errors are communicated to re (2025) · Regulation
[3] DigComp 2.2 (EU Digital Competence Framework, JRC128415) (2025) · Research
[4] Plain Language Guidelines (plainlanguage.gov) (2025) · Vendor

Capstone artifact · auto-graded

Submit your work for review

Paste your capstone artifact below. You'll get back a 4-level rubric grade, per-criterion feedback, and three concrete edits to strengthen it.