Introduction to AWS Monitoring

Monitoring is the practice of collecting, analysing, and acting on data about your systems so that you can detect problems before they affect users, understand how your infrastructure behaves under load, and make informed decisions about capacity and cost. On AWS, monitoring is not an afterthought: most services emit metrics, logs, or events by default, so operational data is available from the moment a resource is created.

Why Monitoring Matters

Imagine you deploy a web application on a fleet of EC2 instances behind an Application Load Balancer. Traffic is light at first, but a marketing campaign drives a sudden spike. Without monitoring you might not notice that:

CPU utilisation has hit 98 % on every instance
Response latency has jumped from 200 ms to 3 seconds
Error rates have climbed because the database connection pool is exhausted

By the time customers complain, you have already lost revenue and trust. Monitoring closes the feedback loop between what your infrastructure is doing and what you think it is doing.

The Four Golden Signals

Google's Site Reliability Engineering book popularised four golden signals that apply just as well to AWS workloads:

Signal	What It Measures	AWS Example
Latency	Time to serve a request	ALB target response time
Traffic	Demand on the system	Requests per second to API Gateway
Errors	Rate of failed requests	HTTP 5xx error rate on CloudFront
Saturation	How "full" a resource is	RDS CPU utilisation at 90 %

If you instrument these four signals for every workload, you will catch the vast majority of production issues.
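
As a rough illustration, the four signals can be computed from a batch of request records. The sketch below is plain Python and makes several assumptions: the `Request` record, the `golden_signals` helper, and the connection-pool figures are hypothetical names introduced here, and in a real workload the inputs would come from CloudWatch metrics rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request (not an AWS type)."""
    latency_ms: float
    status: int


def golden_signals(requests, window_seconds, connections_in_use, pool_size):
    """Summarise one observation window as the four golden signals."""
    if not requests:
        raise ValueError("need at least one request in the window")
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile, using integer arithmetic for ceil(0.95 * n).
    rank = max(1, -(-95 * len(latencies) // 100))
    failed = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p95_ms": latencies[rank - 1],              # Latency
        "traffic_rps": len(requests) / window_seconds,      # Traffic
        "error_rate": failed / len(requests),               # Errors
        "saturation": connections_in_use / pool_size,       # Saturation
    }


# Usage: 18 fast successes and 2 slow failures observed over a 10-second window.
window = [Request(100, 200)] * 18 + [Request(3000, 500), Request(3000, 503)]
signals = golden_signals(window, window_seconds=10, connections_in_use=45, pool_size=50)
```

Saturation is modelled here as connection-pool usage only because the earlier scenario mentions an exhausted pool; CPU, memory, or queue depth would serve equally well.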

The Monitoring Spectrum

Monitoring on AWS is not a single tool — it is a spectrum of complementary capabilities:

Metrics

Numeric time-series data points. Examples: CPU utilisation, request count, queue depth. Amazon CloudWatch is the central metrics service on AWS. Many AWS services publish metrics to CloudWatch automatically, and you can publish your own custom metrics alongside them.

Logs

Detailed textual records of events. Examples: application log lines, VPC Flow Logs, Lambda invocation logs. CloudWatch Logs is the managed log aggregation and query service.

Traces

End-to-end request paths through distributed systems. AWS X-Ray captures traces so you can see how a request flows from API Gateway through Lambda to DynamoDB and back.
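
To make "following a request" concrete, the sketch below builds the trace identifier and the `X-Amzn-Trace-Id` HTTP header that X-Ray uses to link work across services. The helper names `new_trace_id` and `trace_header` are hypothetical; in practice the X-Ray SDK or an instrumented service generates and forwards these values for you, so this only shows what is being passed along.

```python
import secrets
import time


def new_trace_id(now=None):
    """Return an ID shaped like X-Ray's: 1-<8 hex digits of epoch seconds>-<24 hex digits>."""
    epoch = int(time.time() if now is None else now)
    return f"1-{epoch:08x}-{secrets.token_hex(12)}"


def trace_header(trace_id, parent_segment_id, sampled=True):
    """Build the value of the X-Amzn-Trace-Id header for an outgoing call."""
    return f"Root={trace_id};Parent={parent_segment_id};Sampled={1 if sampled else 0}"


# Usage: a service starts a trace, then forwards the header to a downstream service.
trace_id = new_trace_id()
segment_id = secrets.token_hex(8)  # 16 hex digits identifying this service's segment
headers = {"X-Amzn-Trace-Id": trace_header(trace_id, segment_id)}
```

Because every hop forwards the same `Root` value, the segments recorded by each service can be stitched back into a single end-to-end trace.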

Events and Alarms

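Real-time notifications when something changes or crosses a threshold. CloudWatch Alarms can notify SNS topics, run Auto Scaling actions, or invoke Lambda functions when a metric breaches a threshold. Amazon EventBridge covers the "something changes" half: it routes events from AWS services, SaaS products, and custom applications to the targets you choose.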
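
As a sketch of what a threshold alarm looks like in practice, the function below builds the arguments for CloudWatch's PutMetricAlarm operation for EC2 CPU utilisation. The function name `cpu_alarm_params`, the alarm name, the 80 % threshold, and the example ARN are assumptions chosen for illustration. The block only builds a dictionary, so it runs without AWS credentials.

```python
def cpu_alarm_params(instance_id, topic_arn, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's PutMetricAlarm operation."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",   # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                 # seconds covered by each datapoint
        "EvaluationPeriods": 3,        # consider the 3 most recent datapoints
        "DatapointsToAlarm": 2,        # alarm when 2 of those 3 breach
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],   # for example, an SNS topic to notify
    }


# Usage (illustrative identifiers):
params = cpu_alarm_params(
    "i-0123456789abcdef0",
    "arn:aws:sns:eu-west-1:123456789012:ops-alerts",
)
# With boto3 installed and credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Requiring two breaching datapoints out of three is one way to avoid alarming on a single brief spike; the right values depend on the workload.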

Auditing

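A record of who did what and when. AWS CloudTrail records API activity in your account, giving you an audit trail for security and compliance. Management events are logged by default; data events, such as S3 object-level operations, must be enabled explicitly.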

Key AWS Monitoring Services

Service	Primary Purpose	Key Feature
Amazon CloudWatch	Metrics, logs, alarms, dashboards	Unified operational data from 70+ AWS services
AWS CloudTrail	API activity auditing	Records API activity for governance and compliance
AWS X-Ray	Distributed tracing	Visualises request paths across microservices
Amazon EventBridge	Event-driven automation	Routes events from AWS services, SaaS, and custom apps
AWS Config	Resource configuration tracking	Continuous evaluation of resource compliance
VPC Flow Logs	Network traffic logging	Captures IP traffic metadata for analysis

You do not need to master every service on day one. In this course we will focus on CloudWatch (metrics, logs, alarms, dashboards), CloudTrail, and X-Ray because they form the monitoring backbone of almost every AWS workload.

Observability vs Monitoring

You will often hear the term "observability" alongside monitoring. The distinction is subtle but important:

Monitoring answers known questions: "Is CPU above 80 %?" or "Are there any 5xx errors?"
Observability lets you ask new questions you did not anticipate: "Why is this specific user's request slow even though aggregate latency looks normal?"

Observability requires rich, high-cardinality data — detailed logs, distributed traces, and custom metrics. AWS provides the building blocks; your job is to instrument your applications to emit the right data.
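
The per-user question above can only be answered if the user identifier is kept with each measurement. The sketch below, in plain Python with made-up sample data and a hypothetical `slow_users` helper, shows how an aggregate can look healthy while one user is badly affected.

```python
from collections import defaultdict
from statistics import mean


def slow_users(samples, threshold_ms):
    """Return the users whose mean latency exceeds threshold_ms.

    `samples` is an iterable of (user_id, latency_ms) pairs.
    """
    by_user = defaultdict(list)
    for user_id, latency_ms in samples:
        by_user[user_id].append(latency_ms)
    return sorted(u for u, values in by_user.items() if mean(values) > threshold_ms)


# Made-up data: three healthy users with many requests, one slow user with a few.
samples = [("alice", 180), ("bob", 210), ("carol", 190)] * 100 + [("dave", 2900)] * 3

overall_ms = mean(latency for _, latency in samples)  # stays close to the healthy users
affected = slow_users(samples, threshold_ms=1000)     # singles out the slow user
```

The aggregate answers the monitoring question ("is latency normal?"), while the per-user breakdown answers the observability question ("who is slow, and why?"). The trade-off is that high-cardinality dimensions such as user IDs increase the volume of data you store.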

The Three Pillars of Observability

Metrics — aggregated numeric measurements (CloudWatch Metrics)
Logs — discrete event records (CloudWatch Logs)
Traces — end-to-end request journeys (AWS X-Ray)

When you combine all three pillars you can move from reactive firefighting to proactive, data-driven operations.

The Shared Responsibility Model for Monitoring

AWS publishes platform-level metrics for managed services automatically. For example, RDS exposes CPU utilisation, free storage space, and read/write IOPS without any configuration. However, AWS cannot see inside your application. You are responsible for:

Application-level metrics — request latency, business transactions per second, cache hit ratio
Custom log formats — structured JSON logs with correlation IDs
Trace instrumentation — adding the X-Ray SDK to your code

Think of it as two layers:

Layer	Responsibility	Example
Infrastructure	AWS provides built-in metrics	EC2 CPUUtilization, RDS FreeableMemory
Application	You instrument your code	Order processing time, payment success rate
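
On the application layer, structured JSON logs with a correlation ID might look like the following sketch. It uses only Python's standard `logging` and `json` modules; the `JsonFormatter` class, the field names, and the `orders` logger name are assumptions rather than a format AWS prescribes.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set through the `extra` argument of the logging call; None if absent.
            "correlation_id": getattr(record, "correlation_id", None),
        })


logger = logging.getLogger("orders")  # hypothetical logger name
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Generate one ID per incoming request and attach it to every related log line.
correlation_id = str(uuid.uuid4())
logger.info("payment accepted", extra={"correlation_id": correlation_id})
```

Emitting one JSON object per line keeps the fields machine-readable, so every log line belonging to a single request can later be found by searching for its correlation ID.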

Cost Awareness

Monitoring has a cost. CloudWatch charges for custom metrics, for alarms, for each GB of log data ingested and stored, and for the volume of log data scanned by Logs Insights queries. Before you instrument everything, plan a monitoring strategy that balances visibility against expense:

Use standard (free) metrics where possible — AWS provides many at no extra charge.
Aggregate before you ship — send percentiles and averages rather than every raw data point.
Set log retention policies — do not keep debug logs forever.
Use metric filters instead of querying raw logs for simple counts.
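
As one way to "aggregate before you ship", raw samples can be collapsed into a single statistic set before they are sent to CloudWatch. The sketch below builds such a datum in the shape accepted by the PutMetricData operation; the helper name `to_statistic_set`, the metric name, and the namespace in the comment are hypothetical.

```python
def to_statistic_set(metric_name, values, unit="Milliseconds"):
    """Collapse raw samples into one CloudWatch statistic-set datum."""
    if not values:
        raise ValueError("values must not be empty")
    return {
        "MetricName": metric_name,
        "Unit": unit,
        "StatisticValues": {
            "SampleCount": float(len(values)),
            "Sum": float(sum(values)),
            "Minimum": float(min(values)),
            "Maximum": float(max(values)),
        },
    }


# Usage: four raw timings become one datum instead of four separate data points.
datum = to_statistic_set("OrderProcessingTime", [120, 80, 200, 100])
# With boto3 installed and credentials configured, it could be published with:
#   boto3.client("cloudwatch").put_metric_data(Namespace="MyApp", MetricData=[datum])
```

The average can still be derived from `Sum` and `SampleCount`. A percentile, by contrast, would need to be computed in the application and published as its own metric.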

We will cover cost-effective monitoring patterns throughout this course.

Summary

Monitoring is the foundation of reliable cloud operations. AWS provides a rich suite of services — CloudWatch for metrics, logs, and alarms; CloudTrail for API auditing; X-Ray for distributed tracing — that together give you deep visibility into your workloads. Understanding the four golden signals, the three pillars of observability, and the shared responsibility model for monitoring will prepare you for the hands-on lessons that follow.

In the next lesson we will dive into Amazon CloudWatch Metrics and Dashboards — the starting point for every AWS monitoring journey.
