Managed Instance Groups (MIGs) are the foundation for building scalable, self-healing applications on Compute Engine. Combined with autoscaling, MIGs automatically adjust the number of VM instances in response to load, ensuring your application can handle traffic spikes while minimising costs during quiet periods.
Autoscaling automatically adds or removes VM instances in a MIG based on configurable signals. When demand increases, the autoscaler creates new instances from the instance template. When demand decreases, it removes instances to reduce costs.
| Signal | Description | Use Case |
|---|---|---|
| CPU utilisation | Average CPU usage across all instances | General workloads |
| HTTP load balancing utilisation | Fraction of the backend's configured serving capacity in use (for example, requests per second per instance) | Web applications behind a load balancer |
| Cloud Monitoring metrics | Custom or built-in Cloud Monitoring (formerly Stackdriver) metrics | Queue depth, custom business metrics |
| Schedule | Time-based scaling on a recurring schedule | Predictable traffic patterns |
# Set autoscaling based on CPU utilisation
gcloud compute instance-groups managed set-autoscaling web-mig \
--zone=europe-west2-a \
--min-num-replicas=2 \
--max-num-replicas=20 \
--target-cpu-utilization=0.60 \
--cool-down-period=90
| Parameter | Description |
|---|---|
| min-num-replicas | Minimum number of instances (never scales below this) |
| max-num-replicas | Maximum number of instances (never scales above this) |
| target-cpu-utilization | Target average CPU across the group (0.60 = 60%); the autoscaler adds instances when the average is above the target and removes them when it is below |
| cool-down-period | Seconds the autoscaler waits after a new instance starts before using its metrics, so that start-up load does not skew scaling decisions |
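To make the relationship between the target and the group size concrete, here is a minimal sketch of a proportional estimate. It is a simplified model and not the autoscaler's actual algorithm, which also accounts for stabilisation and the cool-down period. The function name `estimate_replicas` and its arguments are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): estimate the instance count
# needed to bring average CPU back to the target, assuming load spreads
# evenly across instances. This is a simplified model only.
# Arguments: current instances, observed CPU %, target CPU %, min, max.
estimate_replicas() {
  local current=$1 observed_pct=$2 target_pct=$3 min=$4 max=$5
  # Ceiling division: total CPU demand divided by the per-instance target.
  local wanted=$(( (current * observed_pct + target_pct - 1) / target_pct ))
  # Clamp to the configured bounds (min-num-replicas / max-num-replicas).
  if (( wanted < min )); then wanted=$min; fi
  if (( wanted > max )); then wanted=$max; fi
  echo "$wanted"
}

# 4 instances averaging 90% CPU against a 60% target, bounds 2..20
estimate_replicas 4 90 60 2 20   # prints 6
```

Under this model, four instances at 90% carry the same load as six instances at 60%, so the estimate is 6. The bounds always win: a very quiet group still keeps 2 instances, and a very busy one stops at 20.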
# Set autoscaling based on load balancing utilisation (target 80% of serving capacity)
gcloud compute instance-groups managed set-autoscaling web-mig \
--zone=europe-west2-a \
--min-num-replicas=2 \
--max-num-replicas=50 \
--target-load-balancing-utilization=0.80
You can scale based on any Cloud Monitoring metric, including custom metrics published by your application:
# Set autoscaling based on a custom metric: 10 queue items per instance
gcloud compute instance-groups managed set-autoscaling web-mig \
--zone=europe-west2-a \
--min-num-replicas=2 \
--max-num-replicas=30 \
--update-stackdriver-metric=custom.googleapis.com/queue_depth \
--stackdriver-metric-single-instance-assignment=10
This configuration tells the autoscaler: "Each instance should handle 10 items from the queue. If the total queue depth is 50, run 5 instances."
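The arithmetic in that statement can be sketched as a small shell function. This is an illustration of the per-instance assignment idea under the min and max bounds from the command above, not a reproduction of the autoscaler's internal logic. The name `replicas_for_queue` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): instances needed so that each
# one handles at most `per_instance` queue items, within the min/max bounds.
# Arguments: total queue depth, items per instance, min, max.
replicas_for_queue() {
  local depth=$1 per_instance=$2 min=$3 max=$4
  # Round up so that a partial batch still gets an instance.
  local wanted=$(( (depth + per_instance - 1) / per_instance ))
  if (( wanted < min )); then wanted=$min; fi
  if (( wanted > max )); then wanted=$max; fi
  echo "$wanted"
}

replicas_for_queue 50 10 2 30     # prints 5
replicas_for_queue 5 10 2 30      # prints 2 (held at the minimum)
replicas_for_queue 1000 10 2 30   # prints 30 (capped at the maximum)
```

The second and third calls show why the bounds matter: a nearly empty queue still keeps 2 instances running, and a backlog of 1000 items cannot push the group past 30.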
For predictable traffic patterns, you can configure scaling schedules:
# Add a schedule: at least 10 instances from 08:00 for 10 hours, Monday to Friday
gcloud compute instance-groups managed update-autoscaling web-mig \
--zone=europe-west2-a \
--set-schedule=business-hours \
--schedule-min-required-replicas=10 \
--schedule-cron="0 8 * * 1-5" \
--schedule-duration-sec=36000 \
--schedule-description="Scale up during business hours"
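Because the schedule duration is given in seconds, it is easy to misjudge when the window closes. The sketch below converts a start time and a duration into an end time. The function `schedule_end` is a hypothetical helper for checking your own values, not part of gcloud.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): compute the wall-clock end of a
# scaling schedule from its start time and duration in seconds.
# Arguments: start hour (0-23), start minute (0-59), duration in seconds.
# Pass plain integers (8, not 08) so bash does not parse them as octal.
schedule_end() {
  local start_hour=$1 start_min=$2 duration_sec=$3
  local end_total_min=$(( start_hour * 60 + start_min + duration_sec / 60 ))
  # Wrap around midnight and print as HH:MM.
  printf '%02d:%02d\n' $(( (end_total_min / 60) % 24 )) $(( end_total_min % 60 ))
}

# Cron "0 8 * * 1-5" starts at 08:00; 36000 seconds is 10 hours.
schedule_end 8 0 36000   # prints 18:00
```

So the schedule above holds the minimum at 10 instances from 08:00 to 18:00 on weekdays. The command does not set a time zone; schedules are normally interpreted in UTC unless `--schedule-time-zone` is supplied, so it is worth confirming which zone applies to your group.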
Autohealing monitors the health of each instance in the MIG and automatically recreates unhealthy instances. It uses health checks to determine instance health.
# Create a health check
gcloud compute health-checks create http web-health-check \
--port=80 \
--request-path=/health \
--check-interval=10s \
--timeout=5s \
--healthy-threshold=2 \
--unhealthy-threshold=3
# Apply the health check to the MIG for autohealing
gcloud compute instance-groups managed update web-mig \
--zone=europe-west2-a \
--health-check=web-health-check \
--initial-delay=300
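The health check settings determine roughly how long a failing instance goes unnoticed. As a rough sketch, assuming an instance is marked unhealthy after `unhealthy-threshold` consecutive failed probes spaced `check-interval` apart, the detection window can be estimated as follows. The function name is hypothetical, and the estimate ignores probe timing jitter and the time taken to recreate the instance.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gcloud): approximate seconds of
# consecutive failures before an instance is considered unhealthy.
# Arguments: check interval in seconds, unhealthy threshold.
detection_window_sec() {
  local interval_sec=$1 unhealthy_threshold=$2
  echo $(( interval_sec * unhealthy_threshold ))
}

# Values from the health check above: 10s interval, 3 failed probes.
detection_window_sec 10 3   # prints 30
```

With these settings an instance has to fail for about 30 seconds before autohealing considers it unhealthy. The `--initial-delay=300` value is separate: it gives a newly created instance 300 seconds to start up before autohealing acts on its health check results.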