Metrics are the heartbeat of your monitoring strategy. They are numerical measurements collected at regular intervals that describe the behaviour of your systems over time. Google Cloud Monitoring provides a powerful metrics infrastructure with automatic collection, flexible querying, and rich visualisation through dashboards.
Every metric in Cloud Monitoring has a kind that describes how values relate over time:
| Kind | Description | Example |
|---|---|---|
| Gauge | A value at a specific point in time | CPU utilisation (45%), memory usage (2.3 GiB) |
| Delta | The change in value over a time interval | Request count in the last minute, bytes sent in the last 5 minutes |
| Cumulative | A running total since a fixed start time | Total requests since process start, total bytes transferred |
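Understanding metric kinds is critical when choosing aggregation functions. Summing a gauge such as CPU utilisation across time produces a number that describes nothing; use mean or max instead. Delta values, by contrast, can be summed over a window, and cumulative values are usually converted to a delta or a rate first.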
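To make the distinction concrete, here is a minimal Python sketch that treats each kind differently. It does not call the Cloud Monitoring API; the sample values and the `cumulative_to_delta` helper are hypothetical, and the helper assumes the counter never resets.

```python
from statistics import mean


def cumulative_to_delta(samples):
    """Turn cumulative samples into per-interval deltas (assumes no counter reset)."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


# Hypothetical cumulative request counter, sampled once a minute.
cumulative_requests = [100, 160, 250, 250, 400]
delta_requests = cumulative_to_delta(cumulative_requests)

# Hypothetical gauge: CPU utilisation as a fraction, sampled once a minute.
cpu_gauge = [0.40, 0.50, 0.45, 0.65]

total_requests = sum(delta_requests)  # meaningful: requests served in the window
mean_cpu = mean(cpu_gauge)            # meaningful: average utilisation
peak_cpu = max(cpu_gauge)             # meaningful: worst-case utilisation
summed_cpu = sum(cpu_gauge)           # not meaningful: utilisation of nothing in particular
```

Summing the deltas recovers the number of requests in the window, while the summed gauge exceeds 100% utilisation and has no useful interpretation.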
Metrics also have a value type that determines the data format:
| Value Type | Description |
|---|---|
| INT64 | A 64-bit integer (e.g., request count) |
| DOUBLE | A 64-bit floating-point number (e.g., CPU utilisation as a fraction) |
| BOOL | True or false (e.g., instance is running) |
| STRING | A text value (rare — used for status labels) |
| DISTRIBUTION | A histogram of values (e.g., latency distribution) |
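The DISTRIBUTION type is the least intuitive of these, so a small sketch may help. The Python below counts raw latency samples into buckets with explicit boundaries, which is roughly how a histogram-valued point summarises many measurements. The `build_distribution` helper, the bucket bounds, and the latency values are all assumptions made for illustration, not the service's own encoding.

```python
from bisect import bisect_right


def build_distribution(values, bounds):
    """Count values into buckets: below bounds[0], [bounds[0], bounds[1]), ..., at or above bounds[-1]."""
    counts = [0] * (len(bounds) + 1)
    for value in values:
        # bisect_right gives the index of the first bound greater than value,
        # which is also the index of the bucket the value falls into.
        counts[bisect_right(bounds, value)] += 1
    return counts


# Hypothetical request latencies in milliseconds.
latencies_ms = [12, 48, 95, 110, 250, 900]
# Hypothetical bucket boundaries in milliseconds.
bounds_ms = [50, 100, 500]

counts = build_distribution(latencies_ms, bounds_ms)
```

Storing bucket counts rather than every sample is what makes percentile estimates over large request volumes affordable, at the cost of precision within each bucket.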
Metrics Explorer is the primary tool for ad-hoc metric analysis in the Cloud Console. It provides a flexible query interface for exploring any metric across your monitored resources.
When you have multiple time series (e.g., CPU utilisation for 50 VMs), you need to aggregate them into a meaningful view:
| Aggregation | Use Case |
|---|---|
| Sum | Total request count across all instances |
| Mean | Average CPU utilisation across a fleet |
| Max | Peak memory usage across all instances |
| 99th percentile | Tail latency for a service |
| Count | Number of time series matching a filter |
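The reducers in the table can be sketched in a few lines of Python. The example assumes each series has already been reduced to one value for the same timestamp; the instance names and values are hypothetical, and the `percentile` helper uses the nearest-rank method, which may not match how Cloud Monitoring computes its percentile reducers.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical: one value per instance for a single timestamp (latency in ms).
latency_ms_by_instance = {"vm-a": 120.0, "vm-b": 95.0, "vm-c": 310.0, "vm-d": 101.0}
values = list(latency_ms_by_instance.values())

reduced = {
    "sum": sum(values),
    "mean": mean(values),
    "max": max(values),
    "p99": percentile(values, 99),
    "count": len(values),
}
```

With only four series the 99th percentile coincides with the maximum; the two diverge as the number of series grows.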
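Alignment divides the time axis into regular intervals and produces one value per interval for each time series. A 5-minute alignment with the mean aligner, for example, produces the average value for each 5-minute window. Alignment is applied to each series individually, before the series are aggregated together.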
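As a rough illustration of a mean aligner, the Python sketch below groups raw points from one series into fixed windows and averages each window. The `align_mean` helper and the sample points are hypothetical, and keying each window by its start time is a simplification chosen here; the real service may label windows differently.

```python
from collections import defaultdict
from statistics import mean


def align_mean(points, period_s):
    """Group (timestamp_s, value) points into fixed windows and average each window."""
    windows = defaultdict(list)
    for timestamp_s, value in points:
        window_start = timestamp_s - timestamp_s % period_s
        windows[window_start].append(value)
    return {start: mean(vals) for start, vals in sorted(windows.items())}


# Hypothetical irregular samples for one series: (seconds since start, CPU fraction).
points = [(0, 0.25), (60, 0.5), (120, 0.75), (300, 0.5), (420, 1.0)]

# 5-minute alignment: one averaged value per 300-second window.
aligned = align_mean(points, 300)
```

After alignment every series has points at the same regular interval, which is what lets a cross-series reducer combine them point by point.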
For advanced queries, Cloud Monitoring supports the Monitoring Query Language (MQL), a text-based query language that provides more flexibility than the visual query builder:

    # Mean CPU utilisation per instance in one zone, in 5-minute windows
    fetch gce_instance
    | metric 'compute.googleapis.com/instance/cpu/utilization'
    | filter zone = 'europe-west2-a'
    # CPU utilisation is a gauge, so align with a mean rather than a rate
    | align mean_aligner(5m)
    | every 5m
    | group_by [instance_name], mean(val())

    # Calculate error rate as a percentage
    fetch cloud_run_revision
    | metric 'run.googleapis.com/request_count'
    | align delta(5m)
    | every 5m
    # First table: non-2xx requests; second table: all requests
    | { filter response_code_class != '2xx' ; ident }
    | group_by [], sum(val())
    # Join the two tables, using 0 where an interval has no errors
    | outer_join 0
    | div
    | mul 100
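The arithmetic behind the error-rate query is simple enough to check by hand. The Python sketch below applies the same definition (every response outside the 2xx class counts as an error) to a hypothetical set of request counts; the `error_rate_percent` helper and the numbers are assumptions for illustration only.

```python
def error_rate_percent(counts_by_class):
    """Percentage of requests whose response code class is not 2xx."""
    total = sum(counts_by_class.values())
    if total == 0:
        return None  # no traffic in the window, so the rate is undefined
    errors = sum(count for cls, count in counts_by_class.items() if cls != "2xx")
    return 100 * errors / total


# Hypothetical request counts for one 5-minute window, keyed by response code class.
window_counts = {"2xx": 940, "4xx": 45, "5xx": 15}

rate = error_rate_percent(window_counts)
```

Note that this definition also counts 3xx and 4xx responses as errors; a stricter service-level indicator might restrict the numerator to the 5xx class.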