Amazon Macie
Sensitive data discovery — automated scanning of S3 buckets for PII, PHI, financial data, and other sensitive content using machine learning.
Overview
Amazon Macie is a data security service that uses machine learning and pattern matching to discover and help protect sensitive data stored in Amazon S3. It automatically identifies Personally Identifiable Information (PII), Protected Health Information (PHI), financial data, and credentials.
Macie provides a centralised view of S3 bucket security posture including public access status, encryption settings, and sharing configurations across the entire account or organization.
Core Concepts
| Concept | Description |
|---|---|
| Sensitive Data Discovery Job | A scheduled or one-time scan of S3 objects for sensitive data |
| Finding | A report of sensitive data detected or a bucket security issue |
| Managed Data Identifier | Pre-built detection rules for common sensitive data types (maintained by AWS) |
| Custom Data Identifier | User-defined regex, keywords, and proximity rules for organisation-specific data |
| Allow List | Defines data patterns to ignore (e.g., test credit card numbers) |
| S3 Bucket Inventory | Automatic assessment of all S3 buckets — encryption, public access, sharing status |
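
To make the "scheduled or one-time" distinction concrete, the sketch below assembles the request for a discovery job as a plain dictionary. The helper `build_job_request`, the account ID, and the bucket name are hypothetical; the field names are intended to mirror the parameters of boto3's `macie2` `create_classification_job` call and should be checked against the current API reference before use.

```python
def build_job_request(name, account_id, buckets, weekly_on=None,
                      custom_identifier_ids=(), allow_list_ids=()):
    """Assemble keyword arguments for a sensitive data discovery job.

    weekly_on: a day name such as "MONDAY" for a scheduled job,
    or None for a one-time job.
    """
    request = {
        "name": name,
        "jobType": "SCHEDULED" if weekly_on else "ONE_TIME",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ]
        },
    }
    if weekly_on:
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": weekly_on}}
    if custom_identifier_ids:
        request["customDataIdentifierIds"] = list(custom_identifier_ids)
    if allow_list_ids:
        request["allowListIds"] = list(allow_list_ids)
    return request


# Hypothetical account ID and bucket name, for illustration only.
one_time = build_job_request("pii-audit", "111122223333", ["example-data-bucket"])
weekly = build_job_request("weekly-pii-scan", "111122223333",
                           ["example-data-bucket"], weekly_on="MONDAY")

# With AWS credentials and Macie enabled, the dict could be passed on as:
#   boto3.client("macie2").create_classification_job(**one_time)
```

Building the request separately from the API call keeps the job definition easy to review and to unit-test without touching an AWS account.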
What Macie Detects
| Category | Examples |
|---|---|
| PII | Names, addresses, email, phone numbers, SSNs, passport numbers |
| Financial | Credit card numbers, bank account numbers, tax IDs |
| Health (PHI) | Medical record numbers, health insurance IDs, National Drug Codes (NDC) |
| Credentials | AWS access keys, private keys, API tokens in S3 objects |
| Custom patterns | Employee IDs, internal codes (via custom data identifiers) |
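
A custom data identifier pairs a regex with optional keywords and a proximity rule. The sketch below approximates that behaviour locally in plain Python, so the idea can be tried without an AWS account. The `EMP-` ID format, the keywords, and the `find_matches` helper are hypothetical, and the proximity logic is a simplification rather than Macie's actual matching engine. The dictionary keys are assumed to mirror the `create_custom_data_identifier` parameters.

```python
import re

# Hypothetical identifier for employee IDs such as "EMP-123456".
employee_id_identifier = {
    "name": "employee-id",
    "regex": r"EMP-\d{6}",
    "keywords": ["employee id", "staff number"],
    "maximumMatchDistance": 30,
}


def find_matches(text, identifier):
    """Return regex matches that have a keyword ending shortly before them.

    Simplified approximation: a match counts when some keyword
    (case-insensitive) ends at most maximumMatchDistance characters
    before the end of the matched text.
    """
    max_distance = identifier["maximumMatchDistance"]
    keyword_ends = [
        kw_match.end()
        for keyword in identifier["keywords"]
        for kw_match in re.finditer(re.escape(keyword), text, re.IGNORECASE)
    ]
    hits = []
    for match in re.finditer(identifier["regex"], text):
        if any(0 <= match.end() - end <= max_distance for end in keyword_ends):
            hits.append(match.group())
    return hits


near = find_matches("Employee ID: EMP-123456", employee_id_identifier)
no_keyword = find_matches("Build tag EMP-654321 shipped", employee_id_identifier)
too_far = find_matches("Employee ID" + " " * 100 + "EMP-111111", employee_id_identifier)
```

Only the first string produces a hit: the second has no keyword nearby, and in the third the keyword is further away than the allowed distance. Requiring a keyword in this way is what keeps a loose regex from flagging unrelated strings.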
Multi-Account Management
- Integrates with AWS Organizations — designate a delegated administrator
- Administrator account can manage jobs and view findings for all member accounts
- All findings aggregated in a central account for unified analysis
Automation
Macie Finding
→ EventBridge Rule (match finding type, severity, or data type)
→ Lambda / Step Functions
→ Quarantine object (move to restricted bucket)
→ Tag object for review
→ Notify compliance team (SNS)
→ Create Security Hub finding
Common Use Cases
- Compliance auditing — Scan S3 for PII/PHI to meet GDPR, HIPAA, or PCI DSS requirements.
- Data classification — Automatically classify S3 objects by sensitivity level for data governance.
- Bucket posture assessment — Identify publicly accessible, unencrypted, or shared buckets across the organization.
- Credential leak detection — Find AWS access keys or API secrets accidentally stored in S3.
- Automated remediation — EventBridge + Lambda to quarantine objects containing sensitive data.
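
The automated remediation case (an EventBridge rule feeding a Lambda function) might be sketched as follows. The event pattern matches Macie findings with High severity. The `plan_remediation` helper, the bucket names, and the sample event are hypothetical. The event fields shown (`type`, `severity`, `resourcesAffected`) are assumed to follow the Macie finding schema and should be verified against a real event before relying on them. The function only plans actions, where a real handler would perform them with the S3 and SNS APIs.

```python
# EventBridge event pattern: Macie findings with High severity.
EVENT_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"severity": {"description": ["High"]}},
}


def plan_remediation(event, quarantine_bucket):
    """Translate a Macie finding event into a list of planned actions."""
    detail = event["detail"]
    resources = detail["resourcesAffected"]
    bucket = resources["s3Bucket"]["name"]
    key = resources.get("s3Object", {}).get("key")

    actions = []
    # Only object-level sensitive data findings have something to move.
    if key and detail["type"].startswith("SensitiveData:"):
        actions.append({
            "action": "quarantine",
            "source": f"s3://{bucket}/{key}",
            "destination": f"s3://{quarantine_bucket}/{bucket}/{key}",
        })
    # Every matched finding is reported to the compliance team.
    actions.append({
        "action": "notify",
        "message": f"{detail['type']} in bucket {bucket}",
    })
    return actions


# Hypothetical event, trimmed to the fields the sketch reads.
sample_event = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {
        "type": "SensitiveData:S3Object/Personal",
        "severity": {"score": 3, "description": "High"},
        "resourcesAffected": {
            "s3Bucket": {"name": "example-data-bucket"},
            "s3Object": {"key": "exports/customers.csv"},
        },
    },
}

plan = plan_remediation(sample_event, "example-quarantine-bucket")
```

Separating the decision (what to do) from the side effects (copying, deleting, publishing) makes the remediation logic testable without AWS access.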
SAA/SAP Exam Tips
SAA/SAP Tip: Macie is the answer for "discover PII in S3," "find sensitive data in S3 buckets," or "data classification for compliance." It is S3-specific — it does not scan databases, EBS volumes, or other storage.
Exam Trap: Macie discovers sensitive data; it does not encrypt or delete it. Combine Macie with S3 bucket policies, KMS encryption, or Lambda remediation for protection.
Cross-Cloud Equivalents
| Provider | Service / Solution | Notes |
|---|---|---|
| AWS | Amazon Macie | Baseline |
| Azure | Microsoft Purview (Data Classification) | Broader data catalog + classification |
| GCP | Google Cloud Sensitive Data Protection (DLP API) | Scans any data source (not storage-specific) |
| On-Premises | Varonis, Digital Guardian, Symantec DLP | Enterprise data loss prevention platforms |
Pricing Model
| Dimension | Unit | Notes |
|---|---|---|
| S3 bucket inventory | Per bucket per month | Continuous bucket-level security assessment; no charge during the 30-day free trial |
| Sensitive data discovery | Per GB scanned | First 1 GB/month free; tiered after |
| Automated discovery | Per GB evaluated | Continuous sampling at reduced cost |
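
The "first 1 GB free, tiered after" structure of discovery job pricing can be worked through with a small calculator. The tier sizes and per-GB rates below are placeholders chosen for easy arithmetic, not AWS's actual prices. The `estimate_discovery_cost` helper is hypothetical, and real rates should be taken from the Macie pricing page for the relevant region.

```python
def estimate_discovery_cost(gb_scanned, tiers, free_gb=1.0):
    """Estimate a monthly discovery job charge under tiered per-GB pricing.

    tiers: list of (tier_size_gb, rate_per_gb); use None as the size of
    the last tier so that it covers all remaining usage.
    """
    billable = max(gb_scanned - free_gb, 0.0)
    cost = 0.0
    for tier_size_gb, rate_per_gb in tiers:
        if billable <= 0:
            break
        in_tier = billable if tier_size_gb is None else min(billable, tier_size_gb)
        cost += in_tier * rate_per_gb
        billable -= in_tier
    return cost


# Placeholder tiers, NOT real AWS prices:
# first 100 billable GB at 1.00 per GB, everything beyond at 0.50 per GB.
PLACEHOLDER_TIERS = [(100, 1.00), (None, 0.50)]

within_free_tier = estimate_discovery_cost(0.5, PLACEHOLDER_TIERS)
first_tier_only = estimate_discovery_cost(101, PLACEHOLDER_TIERS)
two_tiers = estimate_discovery_cost(201, PLACEHOLDER_TIERS)
```

With these placeholder rates, scanning 201 GB leaves 200 billable GB: 100 at the first rate and 100 at the second.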
Related Services / See Also
- Amazon S3 — the only data store Macie scans
- Amazon GuardDuty — threat detection (complements Macie's data discovery)
- AWS Security Hub — aggregates Macie findings with other security services
- AWS KMS and CloudHSM — encryption for protecting discovered sensitive data