Amazon Timestream
Amazon Timestream — AWS's purpose-built serverless time-series database. Covers time-series data model, storage tiers, query engine, and cross-cloud equivalents.
Overview
Amazon Timestream is AWS's fully managed, serverless time-series database — optimized for storing and querying data where time is the primary axis, such as IoT sensor readings, application metrics, server telemetry, and financial tick data.
What is Time-Series Data?
Time-series data is a sequence of values recorded at successive points in time. Every record has three components:
[ timestamp ]      [ dimension(s) ]       [ measure(s) ]
10:00:01           server="web-01"        cpu=72.4
10:00:01           server="web-02"        cpu=45.1
10:00:02           server="web-01"        cpu=73.1

| Component | Description | Example |
|---|---|---|
| Timestamp | When the data point was recorded | 2024-01-15 10:00:01.000 |
| Dimension | Metadata that identifies the source — does not change frequently | server="web-01", region="us-east-1" |
| Measure | The actual value being recorded — changes every interval | cpu=72.4, temperature=21.3 |
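In code, the three components map naturally onto a small record type. A minimal Python sketch, illustrative only — the class and field names here are ours, not an AWS API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class TimeSeriesRecord:
    """One time-series data point: when, from where, and what was measured."""
    timestamp: datetime            # when the point was recorded
    dimensions: dict[str, str]     # identifies the source; rarely changes
    measures: dict[str, float]     # the measured values; change every interval

record = TimeSeriesRecord(
    timestamp=datetime(2024, 1, 15, 10, 0, 1, tzinfo=timezone.utc),
    dimensions={"server": "web-01", "region": "us-east-1"},
    measures={"cpu": 72.4},
)
print(record.dimensions["server"], record.measures["cpu"])  # web-01 72.4
```

Note the split: dimensions have low cardinality per source and identify the series, while measures are the per-interval payload.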
Why not just use a regular relational database?
A general-purpose database can store time-series data, but time-series workloads have characteristics that make general-purpose storage inefficient:
| Characteristic | Impact on a regular DB | Timestream's approach |
|---|---|---|
| Write volume is extremely high (millions of points/second) | Table grows unboundedly; INSERT performance degrades | Append-only ingestion; no indexes on writes |
| Queries almost always filter by time range | Full table scans or manual partitioning by date needed | Time is a native, first-class query dimension |
| Recent data is hot; old data is cold | Developer manages partitioning + archival manually | Automatic two-tier storage (memory → S3) |
| Data is naturally ordered by time | B-tree indexes waste space on monotonically increasing timestamps | Columnar storage optimized for time ordering |
Architecture
Two-Tier Automatic Storage
Timestream automatically manages a two-tier storage model — retention policies are configured and data moves between tiers automatically:
Write          Memory Store                                 Magnetic Store
  │            recent, hot data                             older, cold data
  └──────▶     (in-memory)      ──auto-moves after TTL──▶   (S3-backed columnar)
               ms latency                                   ms–seconds latency
               hours to days                                months to years

| Tier | Storage Type | Query Latency | Typical Retention | Cost |
|---|---|---|---|---|
| Memory Store | In-memory | Milliseconds | Hours to days | Higher |
| Magnetic Store | S3-backed columnar | Milliseconds to seconds | Months to years | Much lower |
Each table is configured with a memory store retention (e.g. 24 hours) and a magnetic store retention (e.g. 1 year). Data older than the memory retention threshold is automatically moved to magnetic storage; no manual partitioning or archival jobs are needed.
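Setting those retentions is a single API call at table creation. A hedged sketch using boto3's timestream-write client — the `RetentionProperties` keys are the actual Timestream API field names, while the database and table names are placeholders:

```python
# Build the retention configuration for a Timestream table.
# The two keys below are Timestream API field names; 24 h / 365 d are
# example values matching the text above.
def retention_properties(memory_hours: int, magnetic_days: int) -> dict:
    return {
        "MemoryStoreRetentionPeriodInHours": memory_hours,
        "MagneticStoreRetentionPeriodInDays": magnetic_days,
    }

props = retention_properties(memory_hours=24, magnetic_days=365)

# With boto3 (requires AWS credentials; not executed in this sketch):
# import boto3
# client = boto3.client("timestream-write")
# client.create_table(
#     DatabaseName="infrastructure",
#     TableName="server_metrics",
#     RetentionProperties=props,
# )
```

Once set, the tier transition is entirely Timestream's job — there is no archival code to write.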
SAA/SAP Tip: The two-tier model is the key exam differentiator for Timestream. Recent data stays fast and expensive (memory); historical data becomes cheap and slightly slower (magnetic). This maps directly to the concept of hot/warm/cold storage tiering.
Serverless Scaling
- No cluster to provision — Timestream automatically scales read and write throughput
- Billing is based on data written, data stored (per tier), and data queried
- No capacity planning required
Data Model
Timestream organizes data into databases → tables. Each table stores time-series records.
A record must have:
- A timestamp (nanosecond precision)
- One or more dimensions (string key-value pairs identifying the source)
- One or more measures (the numeric or string values being tracked)
Database: "infrastructure"
  Table: "server_metrics"
    Record:
      time:       2024-01-15 10:00:01.000000000
      dimensions: [server="web-01", region="us-east-1", az="us-east-1a"]
      measures:   [cpu_utilization=72.4, memory_used_gb=6.2, disk_io_ops=1240]

Dimensions are automatically indexed. Measures are stored as columnar data optimized for range queries over time.
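A record like the one above maps onto the shape the WriteRecords API expects. A sketch assuming boto3's timestream-write client; single-measure records only (multi-measure records also exist but are omitted here), with the example's dimension and measure names:

```python
import time

def make_record(dimensions: dict, measure_name: str, value: float) -> dict:
    """Build one Timestream record in the shape write_records expects."""
    return {
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "MeasureName": measure_name,
        "MeasureValue": str(value),              # values are sent as strings
        "MeasureValueType": "DOUBLE",
        "Time": str(int(time.time() * 1000)),    # epoch milliseconds
        "TimeUnit": "MILLISECONDS",
    }

record = make_record(
    {"server": "web-01", "region": "us-east-1", "az": "us-east-1a"},
    "cpu_utilization",
    72.4,
)

# Sending it (requires AWS credentials; not executed in this sketch):
# import boto3
# boto3.client("timestream-write").write_records(
#     DatabaseName="infrastructure",
#     TableName="server_metrics",
#     Records=[record],
# )
```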
Query Engine
Timestream uses a purpose-built SQL dialect with time-series specific functions that would be complex to write in standard SQL:
-- Average CPU per server over the last hour, in 5-minute buckets
SELECT server,
       bin(time, 5m) AS time_bucket,
       avg(cpu_utilization) AS avg_cpu
FROM "infrastructure"."server_metrics"
WHERE time BETWEEN ago(1h) AND now()
GROUP BY server, bin(time, 5m)
ORDER BY server, time_bucket;

Built-in time-series functions:
| Function | What It Does |
|---|---|
| ago(duration) | Returns a timestamp N time units in the past; e.g. ago(1h) |
| bin(time, interval) | Groups timestamps into fixed-size buckets (downsampling) |
| interpolate_linear() | Fills gaps in sparse time-series with linear interpolation |
| derivative() | Computes rate of change between consecutive data points |
| smooth() | Applies moving average to reduce noise |
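What bin() does can be reproduced locally: floor each timestamp to the start of a fixed-width bucket, then aggregate within the bucket. A Python sketch of 5-minute downsampling — our own helper for illustration, not the Timestream engine:

```python
from collections import defaultdict
from datetime import datetime, timezone

def bin_ts(ts: datetime, interval_s: int) -> datetime:
    """Floor a timestamp to its bucket start, like SQL bin(time, interval)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % interval_s, tz=timezone.utc)

points = [  # (time, cpu) samples for one server
    (datetime(2024, 1, 15, 10, 0, 1, tzinfo=timezone.utc), 72.4),
    (datetime(2024, 1, 15, 10, 2, 30, tzinfo=timezone.utc), 73.1),
    (datetime(2024, 1, 15, 10, 6, 0, tzinfo=timezone.utc), 45.1),
]

buckets = defaultdict(list)
for ts, cpu in points:
    buckets[bin_ts(ts, 300)].append(cpu)   # 300 s = 5-minute buckets

avg_cpu = {b: sum(v) / len(v) for b, v in buckets.items()}
# The 10:00 bucket averages the first two samples; 10:05 holds the third.
```

This is the same grouping the sample query performs with `GROUP BY server, bin(time, 5m)`, done server-side over columnar data instead of in application code.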
Common Use Cases
| Use Case | Example |
|---|---|
| IoT telemetry | Temperature, pressure, vibration from factory sensors every 100ms |
| Infrastructure monitoring | CPU, memory, network from thousands of EC2 instances every minute |
| Application metrics | Request latency, error rates, queue depth from microservices |
| Financial tick data | Stock prices, trade volumes at millisecond granularity |
| DevOps / observability | Custom application metrics feeding dashboards (pairs with Amazon Managed Grafana) |
Timestream vs. Other AWS Databases
| Scenario | Use |
|---|---|
| Store millions of sensor readings per second, query by time range | Amazon Timestream |
| Store user profile data, session records | Amazon DynamoDB |
| Store orders, transactions | Amazon RDS / Aurora |
| Store large historical datasets for BI reports | Amazon Redshift |
| Store server logs for ad-hoc analysis | Amazon S3 + Athena |
Exam Trap: Timestream is not a general-purpose database. Do not use it for relational data, document storage, or workloads where time is not the primary query dimension. The exam will present IoT or metrics scenarios specifically to test whether the candidate knows Timestream exists.
SAA/SAP Tip: Any exam scenario mentioning IoT sensor data, DevOps metrics, time-series, or monitoring data at high write volume that needs efficient time-range queries → Amazon Timestream.
Integration with AWS Services
| Service | How It Integrates |
|---|---|
| AWS IoT Core | Route IoT device messages directly to Timestream via IoT Rules |
| Amazon Kinesis Data Streams | Stream high-throughput events into Timestream via Lambda |
| Amazon Managed Grafana | Native Timestream data source; visualize metrics as dashboards |
| Amazon SageMaker | Query Timestream for ML feature engineering on time-series data |
| AWS Lambda | Serverless ingestion layer between event sources and Timestream |
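The Kinesis → Lambda → Timestream path in the table amounts to decoding each stream record and reshaping it into a Timestream write. A hedged handler sketch — the event shape is the standard Kinesis Lambda trigger payload, but the JSON fields (`server`, `cpu`) are an assumed producer format:

```python
import base64
import json
import time

def to_timestream_record(payload: dict) -> dict:
    """Reshape one decoded event into a Timestream record.
    Assumes the producer sends {"server": ..., "cpu": ...} JSON."""
    return {
        "Dimensions": [{"Name": "server", "Value": payload["server"]}],
        "MeasureName": "cpu_utilization",
        "MeasureValue": str(payload["cpu"]),
        "MeasureValueType": "DOUBLE",
        "Time": str(int(time.time() * 1000)),
        "TimeUnit": "MILLISECONDS",
    }

def handler(event, context):
    # Kinesis delivers data base64-encoded under record["kinesis"]["data"].
    records = [
        to_timestream_record(json.loads(base64.b64decode(r["kinesis"]["data"])))
        for r in event["Records"]
    ]
    # boto3.client("timestream-write").write_records(
    #     DatabaseName="infrastructure", TableName="server_metrics",
    #     Records=records)   # requires AWS credentials; commented in this sketch
    return {"written": len(records)}
```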
Cross-Cloud Equivalents
| Provider | Service / Solution | Notes |
|---|---|---|
| AWS | Amazon Timestream | Baseline purpose-built time-series DB |
| Azure | Azure Data Explorer (ADX) | More powerful and flexible; used for logs + time-series; steeper learning curve |
| GCP | Google Cloud Bigtable / BigQuery | Bigtable for high-write time-series; BigQuery for analytical queries; neither is purpose-built |
| On-Premises | InfluxDB / TimescaleDB / Prometheus | InfluxDB = purpose-built time-series OSS; TimescaleDB = PostgreSQL extension; Prometheus = metrics only, no long-term storage |
Pricing Model
- Writes: per million time-series data points written
- Memory store: per GB-hour stored
- Magnetic store: per GB-month stored (significantly cheaper than memory)
- Queries: per GB of data scanned
- No charge for the server/cluster — serverless
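The billing dimensions above can be combined into a back-of-envelope estimate. The rates below are placeholder numbers, not real AWS prices — only the structure of the formula is the point:

```python
def monthly_estimate(
    writes_millions: float,   # million data points written per month
    memory_gb: float,         # average GB resident in the memory store
    magnetic_gb: float,       # GB resident in the magnetic store
    query_gb_scanned: float,  # GB scanned by queries per month
    rates: dict,              # per-unit rates (hypothetical placeholders)
) -> float:
    hours_per_month = 730
    return (
        writes_millions * rates["per_million_writes"]
        + memory_gb * hours_per_month * rates["memory_gb_hour"]
        + magnetic_gb * rates["magnetic_gb_month"]
        + query_gb_scanned * rates["query_gb"]
    )

# Placeholder rates purely to exercise the formula:
rates = {"per_million_writes": 0.5, "memory_gb_hour": 0.036,
         "magnetic_gb_month": 0.03, "query_gb": 0.01}
cost = monthly_estimate(100, 10, 500, 200, rates)
```

The memory-store term dominates quickly because it is billed per GB-hour, which is why keeping the memory retention window short matters for cost.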
Related Services / See Also
- Amazon Kinesis and Managed Flink — stream IoT events into Timestream in real time
- Amazon DynamoDB and DocumentDB — NoSQL for non-time-series workloads
- Database Performance Fundamentals — OLTP/OLAP workload types
- Amazon Managed Grafana — visualization layer for Timestream metrics
Amazon RDS and Aurora
Amazon Relational Database Service (RDS) and Amazon Aurora — managed relational databases on AWS. Covers deployment options, Multi-AZ HA, Read Replicas, Aurora architecture, and cross-cloud equivalents.
Amazon API Gateway
Managed API front door — create, publish, and secure REST, HTTP, and WebSocket APIs at any scale with throttling, caching, and authorization.