Xoxoftware - XOXO Creative Studio | Web & Mobile App Development | Fred Cheung | Hong Kong

Amazon Macie

Sensitive data discovery — automated scanning of S3 buckets for PII, PHI, financial data, and other sensitive content using machine learning.

Overview

Amazon Macie is a data security service that uses machine learning and pattern matching to discover and help protect sensitive data stored in Amazon S3. It automatically identifies Personally Identifiable Information (PII), Protected Health Information (PHI), financial data, and credentials.

Macie provides a centralised view of S3 bucket security posture including public access status, encryption settings, and sharing configurations across the entire account or organization.


Core Concepts

| Concept | Description |
| --- | --- |
| Sensitive Data Discovery Job | A scheduled or one-time scan of S3 objects for sensitive data |
| Finding | A report of sensitive data detected or a bucket security issue |
| Managed Data Identifier | Pre-built detection rules for common sensitive data types (maintained by AWS) |
| Custom Data Identifier | User-defined regex, keywords, and proximity rules for organisation-specific data |
| Allow List | Defines data patterns to ignore (e.g., test credit card numbers) |
| S3 Bucket Inventory | Automatic assessment of all S3 buckets: encryption, public access, sharing status |
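
To make the job concept concrete, here is a minimal sketch of how the request for a discovery job might be assembled. The field names follow the boto3 `macie2` `create_classification_job` parameters; the helper `build_job_request`, the account ID, and the bucket name are hypothetical and only for illustration.

```python
import uuid


def build_job_request(name, account_id, buckets, schedule=None,
                      custom_identifier_ids=(), allow_list_ids=()):
    """Build keyword arguments for macie2.create_classification_job.

    Hypothetical helper: it only assembles a dict and makes no AWS call.
    """
    request = {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "name": name,
        # A job is either a one-time scan or a recurring (scheduled) scan.
        "jobType": "SCHEDULED" if schedule else "ONE_TIME",
        # Use the set of managed data identifiers that AWS recommends.
        "managedDataIdentifierSelector": "RECOMMENDED",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ]
        },
    }
    if schedule:
        request["scheduleFrequency"] = schedule
    if custom_identifier_ids:
        request["customDataIdentifierIds"] = list(custom_identifier_ids)
    if allow_list_ids:
        request["allowListIds"] = list(allow_list_ids)
    return request


# Example: a weekly scan of one (made-up) bucket.
weekly_job = build_job_request(
    "weekly-pii-scan",
    "111122223333",
    ["example-data-bucket"],
    schedule={"weeklySchedule": {"dayOfWeek": "MONDAY"}},
)
```

With credentials configured, the dict could then be passed on as `boto3.client("macie2").create_classification_job(**weekly_job)`. Keeping the request construction separate from the API call makes the job definition easy to review and test.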

What Macie Detects

| Category | Examples |
| --- | --- |
| PII | Names, addresses, email, phone numbers, SSNs, passport numbers |
| Financial | Credit card numbers, bank account numbers, tax IDs |
| Health (PHI) | Medical record numbers, health insurance IDs, National Drug Codes (NDC) |
| Credentials | AWS access keys, private keys, API tokens in S3 objects |
| Custom patterns | Employee IDs, internal codes (via custom data identifiers) |
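
As an illustration of how a custom data identifier combines a regex, keywords, and a proximity rule, the sketch below approximates that logic locally with Python's `re` module. The key names mirror the `create_custom_data_identifier` parameters (`regex`, `keywords`, `maximumMatchDistance`), but the employee ID format and the `find_matches` helper are assumptions, and Macie's own matching engine may differ in detail.

```python
import re

# Hypothetical employee ID format: "EMP-" followed by six digits.
EMPLOYEE_ID = {
    "name": "employee-id",
    "regex": r"EMP-\d{6}",
    "keywords": ["employee id", "staff number"],
    "maximumMatchDistance": 30,
}


def find_matches(text, identifier):
    """Return regex matches that have a keyword ending shortly before them.

    Local approximation: a match counts only if some keyword ends at most
    maximumMatchDistance characters before the end of the regex match.
    """
    lowered = text.lower()  # keywords are compared case-insensitively
    keyword_ends = [
        m.end()
        for kw in identifier["keywords"]
        for m in re.finditer(re.escape(kw.lower()), lowered)
    ]
    limit = identifier["maximumMatchDistance"]
    hits = []
    for m in re.finditer(identifier["regex"], text):
        if any(0 <= m.end() - end <= limit for end in keyword_ends):
            hits.append(m.group())
    return hits


sample = "Employee ID: EMP-123456. " + "x" * 40 + " EMP-654321"
nearby_ids = find_matches(sample, EMPLOYEE_ID)
```

In this sample only the first ID sits close enough to a keyword to be reported. The proximity rule is what keeps a broad regex from flagging every string that happens to match the pattern.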

Multi-Account Management

  • Integrates with AWS Organizations — designate a delegated administrator
  • Administrator account can manage jobs and view findings for all member accounts
  • All findings aggregated in a central account for unified analysis

Automation

Macie Finding
    → EventBridge Rule (match finding type, severity, or data type)
        → Lambda / Step Functions
            → Quarantine object (move to restricted bucket)
            → Tag object for review
            → Notify compliance team (SNS)
            → Create Security Hub finding
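
The flow above can be sketched as an EventBridge event pattern plus a small Lambda-style handler. This is a minimal sketch under assumptions: the event fields used (`detail.type`, `detail.severity.description`, `detail.resourcesAffected`) follow the shape of Macie finding events, while the quarantine bucket name, the tag key, and the `plan_remediation` helper are hypothetical. The handler only plans actions; a real one would carry them out with S3 and SNS calls.

```python
import json

# EventBridge rule pattern: match only High-severity Macie findings.
EVENT_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"severity": {"description": ["High"]}},
}

QUARANTINE_BUCKET = "example-quarantine-bucket"  # hypothetical name


def plan_remediation(event, quarantine_bucket=QUARANTINE_BUCKET):
    """Turn a Macie finding event into a list of planned actions."""
    detail = event["detail"]
    resources = detail["resourcesAffected"]
    bucket = resources["s3Bucket"]["name"]
    # Bucket-level (policy) findings carry no s3Object entry.
    key = resources.get("s3Object", {}).get("key")

    actions = []
    if key is not None:
        actions.append({"action": "tag", "bucket": bucket, "key": key,
                        "tags": {"macie-review": "pending"}})
        if detail["severity"]["description"] == "High":
            actions.append({"action": "quarantine",
                            "source": f"{bucket}/{key}",
                            "destination": f"{quarantine_bucket}/{bucket}/{key}"})
    actions.append({"action": "notify",
                    "subject": f"Macie finding {detail['type']} in {bucket}"})
    return actions


def lambda_handler(event, context):
    actions = plan_remediation(event)
    # A real handler would execute each action here (copy/delete the object,
    # put object tags, publish to an SNS topic) instead of only logging it.
    print(json.dumps(actions))
    return actions
```

Separating the planning step from the side effects keeps the decision logic testable without AWS access and makes it easy to add a dry-run mode.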

Common Use Cases

  • Compliance auditing — Scan S3 for PII/PHI to meet GDPR, HIPAA, or PCI DSS requirements.
  • Data classification — Automatically classify S3 objects by sensitivity level for data governance.
  • Bucket posture assessment — Identify publicly accessible, unencrypted, or shared buckets across the organization.
  • Credential leak detection — Find AWS access keys or API secrets accidentally stored in S3.
  • Automated remediation — EventBridge + Lambda to quarantine objects containing sensitive data.

SAA/SAP Exam Tips

SAA/SAP Tip: Macie is the answer for "discover PII in S3," "find sensitive data in S3 buckets," or "data classification for compliance." It is S3-specific — it does not scan databases, EBS volumes, or other storage.

Exam Trap: Macie discovers sensitive data; it does not encrypt or delete it. Combine Macie with S3 bucket policies, KMS encryption, or Lambda remediation for protection.


Cross-Cloud Equivalents

| Provider | Service / Solution | Notes |
| --- | --- | --- |
| AWS | Amazon Macie | Baseline |
| Azure | Microsoft Purview (Data Classification) | Broader data catalog + classification |
| GCP | Google Cloud Sensitive Data Protection (DLP API) | Scans any data source (not storage-specific) |
| On-Premises | Varonis, Digital Guardian, Symantec DLP | Enterprise data loss prevention platforms |

Pricing Model

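| Dimension | Unit | Notes |
| --- | --- | --- |
| S3 bucket inventory | Per bucket per month | Continuous bucket-level security assessment; free during the 30-day trial |
| Sensitive data discovery | Per GB scanned | First 1 GB/month free; tiered after |
| Automated discovery | Per GB inspected, plus a per-object monitoring charge | Continuous sampling at reduced cost |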
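
The "first 1 GB free, tiered after" structure can be worked through with a small estimator. This is only a sketch of the arithmetic: the tier sizes and per-GB rates below are placeholders, not AWS prices, and `estimate_scan_cost` is a hypothetical helper. Check the current Macie pricing page for real figures in your region.

```python
def estimate_scan_cost(gb_scanned, tiers, free_gb=1.0):
    """Estimate a monthly discovery cost under tiered per-GB pricing.

    tiers: list of (tier_size_gb, price_per_gb); a size of None means
    "everything above the previous tiers".
    """
    remaining = max(gb_scanned - free_gb, 0.0)  # free allowance comes off first
    cost = 0.0
    for size, price in tiers:
        portion = remaining if size is None else min(remaining, size)
        cost += portion * price
        remaining -= portion
        if remaining <= 0:
            break
    return cost


# Placeholder tiers for illustration only (NOT real AWS rates):
# the first 100 billable GB at 1.00 per GB, everything above at 0.50 per GB.
PLACEHOLDER_TIERS = [(100, 1.00), (None, 0.50)]

small_month = estimate_scan_cost(0.5, PLACEHOLDER_TIERS)   # within the free 1 GB
large_month = estimate_scan_cost(201, PLACEHOLDER_TIERS)   # spans both tiers
```

With these placeholder tiers, a 201 GB month bills 200 GB: 100 GB in the first tier and 100 GB in the second.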

Built by Fred Cheung @CookedRicer · Powered by Fumadocs & Github Copilot
