Introduction to AWS Monitoring

Monitoring is the practice of collecting, analysing, and acting on data about your systems so that you can detect problems before they affect users, understand how your infrastructure behaves under load, and make informed decisions about capacity and cost. On AWS, monitoring is not an afterthought: most services emit metrics or events without extra setup, and the tooling to collect and act on them is part of the platform.


Why Monitoring Matters

Imagine you deploy a web application on a fleet of EC2 instances behind an Application Load Balancer. Traffic is light at first, but a marketing campaign drives a sudden spike. Without monitoring you might not notice that:

  • CPU utilisation has hit 98 % on every instance
  • Response latency has jumped from 200 ms to 3 seconds
  • Error rates have climbed because the database connection pool is exhausted

By the time customers complain, you have already lost revenue and trust. Monitoring closes the feedback loop between what your infrastructure is doing and what you think it is doing.

The Four Golden Signals

Google's Site Reliability Engineering book popularised four golden signals that apply just as well to AWS workloads:

Signal     | What It Measures          | AWS Example
-----------|---------------------------|-----------------------------------
Latency    | Time to serve a request   | ALB target response time
Traffic    | Demand on the system      | Requests per second to API Gateway
Errors     | Rate of failed requests   | HTTP 5xx count on CloudFront
Saturation | How "full" a resource is  | RDS CPU utilisation at 90 %

If you instrument these four signals for every workload, you will catch the vast majority of production issues.
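As a rough illustration of what instrumenting the four signals means, the sketch below computes them for one time window from a list of request records. The `Request` record shape, the `golden_signals` function, and the choice of the 95th percentile for latency are assumptions made for this example; they are not part of any AWS API.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Request:
    """One served request (hypothetical record shape for this sketch)."""
    latency_ms: float
    status: int


def golden_signals(requests, window_seconds, cpu_percent):
    """Summarise the four golden signals for one time window."""
    latencies = [r.latency_ms for r in requests]
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        # Latency: 95th percentile (quantiles() needs at least two samples)
        "latency_p95_ms": quantiles(latencies, n=20, method="inclusive")[-1],
        # Traffic: requests per second over the window
        "traffic_rps": len(requests) / window_seconds,
        # Errors: share of requests that failed with a 5xx status
        "error_rate": errors / len(requests),
        # Saturation: how full a resource is (here, a CPU reading passed in)
        "saturation_cpu_percent": cpu_percent,
    }


# Nine healthy requests and one slow failure in a 10-second window
window = [Request(100, 200)] * 9 + [Request(3000, 503)]
signals = golden_signals(window, window_seconds=10, cpu_percent=98.0)
```

In practice CloudWatch computes most of these for you from service metrics; the point here is only to show which raw inputs each signal is derived from.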


The Monitoring Spectrum

Monitoring on AWS is not a single tool — it is a spectrum of complementary capabilities:

Metrics

Numeric time-series data points. Examples: CPU utilisation, request count, queue depth. Amazon CloudWatch is the central metrics service on AWS. Most AWS services publish metrics to CloudWatch automatically.
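To make the idea of a metric query concrete, here is a minimal sketch that builds the arguments for the CloudWatch GetMetricStatistics operation and reads the peak value out of a response. The helper names `cpu_stats_request` and `peak_cpu` are invented for this example, and the instance ID in the usage comment is a placeholder.

```python
from datetime import datetime, timedelta, timezone


def cpu_stats_request(instance_id, hours=1, period_seconds=300):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,  # seconds per datapoint
        "Statistics": ["Average", "Maximum"],
    }


def peak_cpu(response):
    """Return the highest Maximum among the returned datapoints, or None."""
    points = response.get("Datapoints", [])
    return max((p["Maximum"] for p in points), default=None)


# Usage with boto3 (assumes boto3 is installed and credentials are configured):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   response = cloudwatch.get_metric_statistics(**cpu_stats_request("i-0123456789abcdef0"))
#   print(peak_cpu(response))
```

Keeping the request as a plain dictionary makes it easy to inspect and unit test without calling AWS.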

Logs

Detailed textual records of events. Examples: application log lines, VPC Flow Logs, Lambda invocation logs. CloudWatch Logs is the managed log aggregation and query service.

Traces

End-to-end request paths through distributed systems. AWS X-Ray captures traces so you can see how a request flows from API Gateway through Lambda to DynamoDB and back.

Events and Alarms

Real-time notifications when something changes or crosses a threshold. CloudWatch Alarms trigger SNS topics, Auto Scaling actions, or Lambda functions when a metric breaches a limit.
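The following sketch shows roughly what defining such an alarm involves. `high_cpu_alarm` builds arguments for the CloudWatch PutMetricAlarm operation, and `breaches` is a simplified local illustration of the threshold rule. Both function names, the alarm name format, and the SNS topic ARN in the usage comment are assumptions for this example; real alarm evaluation also handles missing data and "M out of N" settings, which are omitted here.

```python
def high_cpu_alarm(instance_id, topic_arn, threshold=80.0):
    """Build keyword arguments for CloudWatch PutMetricAlarm."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # evaluate 5-minute averages
        "EvaluationPeriods": 2,   # two consecutive periods must breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # e.g. an SNS topic to notify
    }


def breaches(datapoints, threshold, evaluation_periods):
    """Simplified rule: True if the latest N datapoints all exceed the threshold."""
    recent = datapoints[-evaluation_periods:]
    return len(recent) == evaluation_periods and all(v > threshold for v in recent)


# Usage with boto3 (assumes boto3 is installed and credentials are configured):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **high_cpu_alarm("i-0123456789abcdef0",
#                        "arn:aws:sns:eu-west-1:123456789012:ops-alerts"))
```

Requiring two consecutive periods is a common way to avoid paging on a single short spike; the right values depend on the workload.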

Auditing

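A record of who did what and when. AWS CloudTrail records API activity in your account. Management events (such as creating or deleting resources) are captured by default, while high-volume data events (such as S3 object reads) must be enabled on a trail. Together they give you an audit trail for security and compliance.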
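To show what "who did what and when" looks like in practice, the sketch below reduces a CloudTrail-style event record to a one-line summary. The field names (`eventTime`, `eventSource`, `eventName`, `userIdentity`) follow the CloudTrail record format, but the sample values and the `who_did_what` helper are invented for illustration.

```python
import json

# Trimmed, made-up record using CloudTrail field names
SAMPLE_RECORD = json.loads("""
{
  "eventTime": "2024-01-15T10:32:00Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "TerminateInstances",
  "awsRegion": "eu-west-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {
    "type": "IAMUser",
    "arn": "arn:aws:iam::123456789012:user/alice"
  }
}
""")


def who_did_what(record):
    """Summarise one audit record as 'when, who, what, which service'."""
    identity = record.get("userIdentity", {})
    # Fall back to the identity type when no ARN is present
    actor = identity.get("arn") or identity.get("type", "unknown")
    return (f'{record["eventTime"]} {actor} called '
            f'{record["eventName"]} on {record["eventSource"]}')


summary = who_did_what(SAMPLE_RECORD)
```

Real records carry many more fields (request parameters, response elements, error codes), so a production parser would need to be more defensive than this.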


Key AWS Monitoring Services

Service            | Primary Purpose                   | Key Feature
-------------------|-----------------------------------|------------------------------------------------------
Amazon CloudWatch  | Metrics, logs, alarms, dashboards | Unified operational data from 70+ AWS services
AWS CloudTrail     | API activity auditing             | Records account API activity for governance and compliance
AWS X-Ray          | Distributed tracing               | Visualises request paths across microservices
Amazon EventBridge | Event-driven automation           | Routes events from AWS services, SaaS, and custom apps
AWS Config         | Resource configuration tracking   | Continuous evaluation of resource compliance
VPC Flow Logs      | Network traffic logging           | Captures IP traffic metadata for analysis

You do not need to master every service on day one. In this course we will focus on CloudWatch (metrics, logs, alarms, dashboards), CloudTrail, and X-Ray because they form the monitoring backbone of almost every AWS workload.


Observability vs Monitoring

You will often hear the term "observability" alongside monitoring. The distinction is subtle but important:

  • Monitoring answers known questions: "Is CPU above 80 %?" or "Are there any 5xx errors?"
  • Observability lets you ask new questions you did not anticipate: "Why is this specific user's request slow even though aggregate latency looks normal?"

Observability requires rich, high-cardinality data — detailed logs, distributed traces, and custom metrics. AWS provides the building blocks; your job is to instrument your applications to emit the right data.
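A small worked example may help show why high-cardinality data matters. In the sketch below, one user out of a hundred is slow: the overall average barely moves, but grouping by user ID exposes the outlier. The record shape, the `user_id` field, and the `mean_latency_by` helper are assumptions for this illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_latency_by(records, key):
    """Average latency per distinct value of `key` (a high-cardinality field)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["latency_ms"])
    return {value: mean(latencies) for value, latencies in groups.items()}


# 99 users at 200 ms and one user at 3000 ms
records = [{"user_id": f"user-{i}", "latency_ms": 200} for i in range(99)]
records.append({"user_id": "user-99", "latency_ms": 3000})

# Monitoring view: one aggregate number that still looks healthy (228 ms)
overall = mean(r["latency_ms"] for r in records)

# Observability view: slice by user to find who is actually affected
by_user = mean_latency_by(records, "user_id")
slowest = max(by_user, key=by_user.get)
```

The aggregate answers the known question ("is latency acceptable?"), while the per-user breakdown is only possible because each record kept its `user_id`.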

The Three Pillars of Observability

  1. Metrics — aggregated numeric measurements (CloudWatch Metrics)
  2. Logs — discrete event records (CloudWatch Logs)
  3. Traces — end-to-end request journeys (AWS X-Ray)

When you combine all three pillars you can move from reactive firefighting to proactive, data-driven operations.


The Shared Responsibility Model for Monitoring

AWS publishes platform-level metrics for managed services automatically. For example, RDS exposes CPU utilisation, free storage space, and read/write IOPS without any configuration. However, AWS cannot see inside your application. You are responsible for:

  • Application-level metrics — request latency, business transactions per second, cache hit ratio
  • Custom log formats — structured JSON logs with correlation IDs
  • Trace instrumentation — adding the X-Ray SDK to your code
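As one possible way to produce structured JSON logs with correlation IDs, the sketch below uses only Python's standard `logging` module. The `JsonFormatter` class, the `correlation_id` field name, and the `orders` logger name are choices made for this example rather than anything AWS prescribes.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            # Attached by the caller through the `extra` argument
            "correlation_id": getattr(record, "correlation_id", None),
        })


def new_correlation_id():
    """Generate an ID to attach to every log line of one request."""
    return str(uuid.uuid4())


logger = logging.getLogger("orders")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

request_id = new_correlation_id()
logger.info("order placed", extra={"correlation_id": request_id})
logger.info("payment authorised", extra={"correlation_id": request_id})
```

Because both lines share the same `correlation_id`, a log query can later pull together everything that happened for that one request.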

Think of it as two layers:

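Layer          | Responsibility                | Example
---------------|-------------------------------|--------------------------------------------
Infrastructure | AWS provides built-in metrics | EC2 CPUUtilization, RDS FreeableMemory
Application    | You instrument your code      | Order processing time, payment success rate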

Cost Awareness

Monitoring has a cost. CloudWatch charges per custom metric, per alarm, per GB of log data ingested, and per GB of log data scanned by queries. Before you instrument everything, plan a monitoring strategy that balances visibility against expense:

  • Use standard (free) metrics where possible — AWS provides many at no extra charge.
  • Aggregate before you ship — send percentiles and averages rather than every raw data point.
  • Set log retention policies — do not keep debug logs forever.
  • Use metric filters instead of querying raw logs for simple counts.
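Two of the points above can be sketched in a few lines. `statistic_set` collapses a batch of raw samples into a single metric datum using the `StatisticValues` form accepted by the CloudWatch PutMetricData operation, and `retention_request` builds the arguments for the CloudWatch Logs PutRetentionPolicy operation. The function names, the `OrderLatency` metric, the `MyApp` namespace, and the log group name are hypothetical.

```python
def statistic_set(metric_name, samples, unit="Milliseconds"):
    """Collapse raw samples into one pre-aggregated CloudWatch metric datum."""
    if not samples:
        raise ValueError("need at least one sample")
    return {
        "MetricName": metric_name,
        "Unit": unit,
        "StatisticValues": {
            "SampleCount": len(samples),
            "Sum": sum(samples),
            "Minimum": min(samples),
            "Maximum": max(samples),
        },
    }


def retention_request(log_group_name, days):
    """Build keyword arguments for CloudWatch Logs PutRetentionPolicy."""
    return {"logGroupName": log_group_name, "retentionInDays": days}


# One datum instead of three separate data points
datum = statistic_set("OrderLatency", [100, 200, 300])

# Usage with boto3 (assumes boto3 is installed and credentials are configured):
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(Namespace="MyApp", MetricData=[datum])
#   boto3.client("logs").put_retention_policy(**retention_request("/app/orders", 30))
```

One trade-off to be aware of: CloudWatch generally needs raw data points to compute percentiles, so a metric published only as statistic sets will usually support average, minimum, maximum, and sum but not percentile statistics.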

We will cover cost-effective monitoring patterns throughout this course.


Summary

Monitoring is the foundation of reliable cloud operations. AWS provides a rich suite of services — CloudWatch for metrics, logs, and alarms; CloudTrail for API auditing; X-Ray for distributed tracing — that together give you deep visibility into your workloads. Understanding the four golden signals, the three pillars of observability, and the shared responsibility model for monitoring will prepare you for the hands-on lessons that follow.

In the next lesson we will dive into Amazon CloudWatch Metrics and Dashboards — the starting point for every AWS monitoring journey.