Xoxoftware - XOXO Creative Studio | Web & Mobile App Development | Fred Cheung | Hong Kong

AWS X-Ray

Distributed tracing — analyse and debug production applications by tracing requests as they travel through microservices, APIs, and AWS resources.

Overview

AWS X-Ray is a distributed tracing service for analysing and debugging applications. It traces requests as they flow through microservices, identifies performance bottlenecks, and visualises service dependencies in a service map.


Core Concepts

| Concept | Description |
| --- | --- |
| Trace | End-to-end record of a single request as it travels through all services |
| Segment | A named block representing work done by a single service (e.g., one Lambda function) |
| Subsegment | A more granular unit within a segment (e.g., an external HTTP call or DB query) |
| Trace ID | Unique identifier propagated across services to correlate segments into one trace |
| Service Map | Visual graph showing service dependencies, latency, and error rates |
| Annotation | Indexed key-value pair on a segment, used for filtering and searching traces |
| Metadata | Non-indexed key-value data on a segment, useful for debugging but not searchable |
| Sampling Rule | Controls the percentage of requests traced to manage cost and overhead |
| X-Ray Daemon | Background process that buffers segments and sends them to the X-Ray API |
| X-Ray SDK | Library added to application code to capture trace data automatically |
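
To make these concepts concrete, here is a minimal sketch of a segment document built by hand in Python, using only the standard library. It shows how a trace ID, a segment, a nested subsegment, annotations, and metadata fit together. The service name `orders-api`, the annotation key, and the metadata contents are hypothetical, and in practice the X-Ray SDK would normally generate this structure for you.

```python
import json
import os
import time


def new_trace_id(now=None):
    """Trace ID format: 1-<8 hex digits of epoch seconds>-<24 random hex digits>."""
    epoch = int(time.time() if now is None else now)
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def new_segment_id():
    """Segment and subsegment IDs are 16 hex digits."""
    return os.urandom(8).hex()


def build_segment(name, trace_id, start, end, **extra):
    """Assemble a segment (or subsegment) document as a plain dict."""
    doc = {"name": name, "id": new_segment_id(), "start_time": start, "end_time": end}
    if trace_id is not None:  # nested subsegments inherit the parent's trace ID
        doc["trace_id"] = trace_id
    doc.update(extra)
    return doc


trace_id = new_trace_id()
start = time.time()

# Subsegment: a downstream call made while handling the request.
db_call = build_segment("DynamoDB", None, start + 0.01, start + 0.04, namespace="aws")

# Segment: the work done by one service (hypothetical name "orders-api").
segment = build_segment(
    "orders-api",
    trace_id,
    start,
    start + 0.05,
    annotations={"customer_tier": "gold"},   # indexed, searchable
    metadata={"debug": {"cart_items": 3}},   # not indexed, for debugging only
    subsegments=[db_call],
)

print(json.dumps(segment, indent=2))
```

The split between `annotations` and `metadata` mirrors the table: put values you want to filter traces by in annotations, and everything else in metadata.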

How X-Ray Works

Client Request
    → API Gateway (Segment A)
        → Lambda (Segment B)
            → DynamoDB (Subsegment B.1)
            → External API (Subsegment B.2)
        → SQS (Segment C)
            → EC2 Consumer (Segment D)
                → RDS (Subsegment D.1)

All segments share the same Trace ID → assembled into one Trace
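
The shared Trace ID travels between services in the `X-Amzn-Trace-Id` HTTP header. As a rough illustration of that propagation, the sketch below parses an incoming header and builds the header a service would pass downstream, keeping the same `Root` and replacing `Parent` with its own segment ID. The function names are made up for this example, and the X-Ray SDK normally handles this step automatically.

```python
def parse_trace_header(value):
    """Split an X-Amzn-Trace-Id header value into its key=value fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:
            fields[key] = val
    return fields


def child_header(incoming, my_segment_id):
    """Header to send downstream: same Root, Parent set to our own segment ID."""
    fields = parse_trace_header(incoming)
    fields["Parent"] = my_segment_id
    return ";".join(f"{key}={val}" for key, val in fields.items())


incoming = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
fields = parse_trace_header(incoming)
print(fields["Root"])                       # the Trace ID shared by every segment
print(child_header(incoming, "a1b2c3d4e5f60718"))
```

Because `Root` never changes along the call chain, X-Ray can stitch segments from API Gateway, Lambda, and the EC2 consumer into a single trace.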

Instrumentation Flow

Application Code + X-Ray SDK
    → Capture segments/subsegments
        → X-Ray Daemon (UDP port 2000)
            → Batch send to X-Ray API
                → Service Map + Trace Timeline
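
The hand-off from SDK to daemon can be sketched in a few lines of Python. The daemon expects each UDP datagram to start with a small JSON header line, followed by the segment document. The helper names below are illustrative only, and the example segment values are placeholders; a real application would let the SDK do this.

```python
import json
import socket

# Header line the daemon expects before every segment document.
DAEMON_HEADER = b'{"format": "json", "version": 1}\n'


def encode_datagram(segment):
    """Prefix a segment document with the daemon header."""
    return DAEMON_HEADER + json.dumps(segment).encode("utf-8")


def send_segment(segment, host="127.0.0.1", port=2000):
    """Send one segment to the daemon over UDP; returns the number of bytes sent."""
    payload = encode_datagram(segment)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return len(payload)


# Placeholder segment; the IDs and times here are illustrative values only.
example = {
    "name": "orders-api",
    "id": "70de5b6f19ff9a0a",
    "trace_id": "1-581cf771-a006649127e371903a2de979",
    "start_time": 1478293361.271,
    "end_time": 1478293361.449,
}

# send_segment(example)  # uncomment when a daemon is listening on UDP 2000
```

UDP is fire-and-forget, so the application never blocks on tracing: if no daemon is listening, the datagram is simply dropped.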

Integration with AWS Services

| Service | Integration Method |
| --- | --- |
| Lambda | Enable active tracing in function config; no daemon needed |
| API Gateway | Enable tracing on stage; automatic segment creation |
| ECS / EKS | Run X-Ray daemon as a sidecar container |
| EC2 | Install and run X-Ray daemon; instrument app with SDK |
| Elastic Beanstalk | Enable X-Ray in environment configuration |
| App Runner | Built-in X-Ray integration |
| SNS / SQS | Automatic trace header propagation (active tracing) |
| Step Functions | Built-in tracing with state-level visibility |
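
When the daemon runs as a sidecar (ECS / EKS) rather than on localhost, the SDK needs to know where to find it. The X-Ray SDKs read the `AWS_XRAY_DAEMON_ADDRESS` environment variable for this, defaulting to `127.0.0.1:2000`. The sketch below shows one way such a value could be parsed; the function name and the `xray-daemon` hostname are assumptions for illustration, not part of any SDK.

```python
import os

DEFAULT_ADDRESS = "127.0.0.1:2000"


def daemon_endpoints(env=None):
    """Return {"udp": (host, port), "tcp": (host, port)} from AWS_XRAY_DAEMON_ADDRESS.

    Accepted forms:
      "host:port"                        same endpoint for both protocols
      "tcp:host:port udp:host:port"      separate endpoints
    """
    env = os.environ if env is None else env
    raw = env.get("AWS_XRAY_DAEMON_ADDRESS") or DEFAULT_ADDRESS
    endpoints = {}
    for token in raw.split():
        parts = token.split(":")
        if len(parts) == 2:
            host, port = parts
            endpoints["udp"] = endpoints["tcp"] = (host, int(port))
        elif len(parts) == 3 and parts[0] in ("udp", "tcp"):
            endpoints[parts[0]] = (parts[1], int(parts[2]))
        else:
            raise ValueError(f"unrecognised daemon address: {token!r}")
    if set(endpoints) != {"udp", "tcp"}:
        raise ValueError("both a udp and a tcp endpoint are required")
    return endpoints


# Sidecar example: the container name "xray-daemon" is hypothetical.
print(daemon_endpoints({"AWS_XRAY_DAEMON_ADDRESS": "xray-daemon:2000"}))
```

UDP carries the segment data, while TCP is used by the SDK to fetch sampling rules, which is why both endpoints can be configured.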

Sampling Rules

| Parameter | Description | Default |
| --- | --- | --- |
| Reservoir | Fixed number of requests traced per second | 1/s |
| Rate | Percentage of additional requests traced beyond the reservoir | 5% |
| Service name | Limits the rule to specific services | * (all) |
| URL path | Limits the rule to specific API paths | * (all) |

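Custom sampling rules reduce cost and noise by tracing fewer routine requests (for example, health checks) while tracing a higher share of requests to specific endpoints. Because the sampling decision is made when a request starts, rules match on request attributes such as service name, HTTP method, and URL path rather than on the eventual outcome.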
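
To see what the default reservoir and rate mean in practice, the following sketch works through the arithmetic and simulates the decision for a single host. At 100 requests per second with the defaults, roughly 1 + 0.05 × 99 = 5.95 requests per second are traced. `LocalSampler` is a simplified stand-in written for this example; it is not the SDK's sampler, which also coordinates quotas with the X-Ray service.

```python
import random


def expected_traces_per_second(requests_per_second, reservoir=1, rate=0.05):
    """Reservoir requests are always traced; the rate applies to the remainder."""
    guaranteed = min(requests_per_second, reservoir)
    extra = max(requests_per_second - reservoir, 0) * rate
    return guaranteed + extra


class LocalSampler:
    """Simplified per-host sampler: a per-second reservoir, then a fixed rate."""

    def __init__(self, reservoir=1, rate=0.05, rng=None):
        self.reservoir = reservoir
        self.rate = rate
        self._rng = rng or random.Random()
        self._second = None
        self._used = 0

    def should_trace(self, now):
        second = int(now)
        if second != self._second:      # a new second refills the reservoir
            self._second, self._used = second, 0
        if self._used < self.reservoir:
            self._used += 1
            return True
        return self._rng.random() < self.rate


print(round(expected_traces_per_second(100), 2))
print(round(expected_traces_per_second(5000), 2))
```

The reservoir guarantees that low-traffic services still produce some traces, while the rate keeps the volume proportional for busy ones.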


Service Map

The service map provides a real-time visual topology of the application:

  • Nodes: Each service or resource (Lambda, DynamoDB, external HTTP)
  • Edges: Request flow between nodes with latency and error stats
  • Colour coding: Green (successful calls), yellow (client errors, 4xx), red (server faults, 5xx), purple (throttling, 429)
  • Drill-down: Select a node to view traces, latency distribution, and error details
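
The colour coding follows X-Ray's classification of responses: 4xx responses count as errors, 429 as throttling, and 5xx as faults. As a small illustration, the sketch below classifies HTTP status codes that way and computes the share of each colour for a node. The function names are invented for this example, and it assumes HTTP status is the only signal, which is a simplification.

```python
from collections import Counter

COLOURS = {"ok": "green", "error": "yellow", "throttle": "purple", "fault": "red"}


def classify_status(status):
    """Map an HTTP status code to an X-Ray style category."""
    if 500 <= status <= 599:
        return "fault"
    if status == 429:
        return "throttle"
    if 400 <= status <= 499:
        return "error"
    return "ok"


def node_summary(statuses):
    """Fraction of responses per colour, omitting colours that did not occur."""
    counts = Counter(classify_status(s) for s in statuses)
    total = sum(counts.values())
    return {COLOURS[kind]: counts[kind] / total for kind in COLOURS if counts[kind]}


print(node_summary([200, 200, 404, 500]))
```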

X-Ray vs CloudWatch

| Aspect | X-Ray | CloudWatch |
| --- | --- | --- |
| Focus | Distributed tracing (request-level) | Metrics, logs, alarms (resource-level) |
| Question answered | "Where is the bottleneck in this request?" | "How is this resource performing?" |
| Granularity | Per-request, per-service | Aggregate metrics over time |
| Visualisation | Service map + trace timeline | Dashboards + metric graphs |
| Complementary use | Debug specific slow requests | Monitor overall health and set alarms |

Common Use Cases

  • Latency analysis — Identify which downstream service or database query is causing slow response times.
  • Error root cause — Trace a failed request through multiple microservices to find the exact failing component.
  • Dependency mapping — Visualise all service-to-service interactions in a microservice architecture.
  • Performance baseline — Establish normal latency distributions and detect regressions.
  • Cold start impact — Measure Lambda cold start duration as a distinct subsegment in the trace.
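
Latency analysis usually comes down to finding which subsegment consumed most of a request's time. The sketch below does this for a single segment document held as a Python dict, ranking nested subsegments by duration. The segment data, including the names `checkout` and `payments-api`, is made up for illustration; real documents would come from the X-Ray API or console.

```python
def walk(node, path=()):
    """Yield (path, duration_seconds) for a node and every nested subsegment."""
    here = path + (node["name"],)
    yield " > ".join(here), node["end_time"] - node["start_time"]
    for child in node.get("subsegments", []):
        yield from walk(child, here)


def slowest_subsegments(segment, top=3):
    """Rank everything below the root segment by duration, slowest first."""
    nested = list(walk(segment))[1:]  # drop the root segment itself
    return sorted(nested, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical segment: a 2-second request with two downstream calls.
segment = {
    "name": "checkout",
    "start_time": 100.0,
    "end_time": 102.0,
    "subsegments": [
        {"name": "DynamoDB", "start_time": 100.0, "end_time": 100.25},
        {"name": "payments-api", "start_time": 100.25, "end_time": 101.75},
    ],
}

for path, seconds in slowest_subsegments(segment):
    print(f"{seconds:.2f}s  {path}")
```

Here the external `payments-api` call accounts for 1.5 of the 2 seconds, which is the kind of answer the trace timeline gives visually.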

SAA/SAP Exam Tips

SAA Tip: "Debug latency in a microservice application" or "trace requests across services" → AWS X-Ray. CloudWatch is for metrics/logs; X-Ray is for distributed tracing.

SAP Tip: X-Ray sampling rules control cost — the default is 1 req/s + 5% of additional requests. Adjust sampling for high-traffic services to avoid excessive tracing costs.


Cross-Cloud Equivalents

| Provider | Service / Solution | Notes |
| --- | --- | --- |
| AWS | AWS X-Ray | Baseline |
| Azure | Azure Application Insights (distributed tracing) | Part of Azure Monitor; richer APM features |
| GCP | Google Cloud Trace | Distributed tracing with latency analysis |
| On-Premises | Jaeger, Zipkin, Datadog APM, New Relic | Open-source or SaaS tracing platforms |

Pricing Model

| Dimension | Unit | Notes |
| --- | --- | --- |
| Traces recorded | Per million traces | First 100,000 traces/month free |
| Traces retrieved | Per million traces | First 1 million retrievals/month free |
| Traces scanned | Per million traces | For trace summary and analytics queries |
| X-Ray Insights | Per Insight generated | Automated anomaly detection (additional charge) |
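
A rough monthly estimate can be worked out from the free-tier thresholds in the table. The sketch below covers only the recorded and retrieved dimensions and takes the per-million prices as inputs. The two price constants are placeholders chosen for the example, not quoted AWS prices, so check the current pricing page for your region before relying on the result.

```python
def billable_millions(count, free_tier):
    """Traces above the free tier, expressed in millions."""
    return max(count - free_tier, 0) / 1_000_000


def estimate_monthly_cost(recorded, retrieved, price_recorded, price_retrieved,
                          free_recorded=100_000, free_retrieved=1_000_000):
    """Estimate cost for traces recorded and retrieved in one month."""
    return (billable_millions(recorded, free_recorded) * price_recorded
            + billable_millions(retrieved, free_retrieved) * price_retrieved)


# Placeholder prices per million traces (assumptions for this example only).
PLACEHOLDER_PRICE_RECORDED = 5.00
PLACEHOLDER_PRICE_RETRIEVED = 0.50

cost = estimate_monthly_cost(
    recorded=2_100_000,
    retrieved=1_500_000,
    price_recorded=PLACEHOLDER_PRICE_RECORDED,
    price_retrieved=PLACEHOLDER_PRICE_RETRIEVED,
)
print(f"{cost:.2f}")
```

Since recorded traces dominate the bill, lowering the sampling rate on high-traffic services is usually the most direct way to reduce cost.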

Built by Fred Cheung @CookedRicer · Powered by Fumadocs & Github Copilot
