Managed Instance Groups & Autoscaling

Managed Instance Groups (MIGs) are the foundation for building scalable, self-healing applications on Compute Engine. Combined with autoscaling, MIGs automatically adjust the number of VM instances in response to load, ensuring your application can handle traffic spikes while minimising costs during quiet periods.

Autoscaling Overview

Autoscaling automatically adds or removes VM instances in a MIG based on configurable signals. When demand increases, the autoscaler creates new instances from the instance template. When demand decreases, it removes instances to reduce costs.

Autoscaling Signals

Signal	Description	Use Case
CPU utilisation	Average CPU usage across all instances	General workloads
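HTTP load balancing utilisation	Fraction of the serving capacity configured on the load balancer's backend service (for example, requests per second per instance)	Web applications behind a load balancer
Cloud Monitoring metrics	Custom or built-in metrics from Cloud Monitoring (formerly Stackdriver)	Queue depth, custom business metrics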
Schedule	Time-based scaling on a recurring schedule	Predictable traffic patterns

Configuring Autoscaling

# Set autoscaling based on CPU utilisation
gcloud compute instance-groups managed set-autoscaling web-mig \
  --zone=europe-west2-a \
  --min-num-replicas=2 \
  --max-num-replicas=20 \
  --target-cpu-utilization=0.60 \
  --cool-down-period=90

Parameter	Description
min-num-replicas	Minimum number of instances (never scales below this)
max-num-replicas	Maximum number of instances (never scales above this)
target-cpu-utilization	Target average CPU utilisation across the group (0.60 = 60%); the autoscaler adds instances when the average is above the target and removes them when it is below
cool-down-period	Seconds to wait after creating an instance before including it in utilisation calculations
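
To see how the target interacts with the minimum and maximum, the sizing logic can be approximated with a little arithmetic. The sketch below is a simplified model, not the autoscaler's actual algorithm (which also accounts for stabilisation and the cool-down period). The function name recommended_size and its integer-percentage inputs are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Simplified model (an assumption, not Google's exact algorithm):
#   needed = ceil(current_instances * current_cpu% / target_cpu%)
# clamped to the configured minimum and maximum.
recommended_size() {
  local current=$1 cpu_pct=$2 target_pct=$3 min=$4 max=$5
  # Integer ceiling division: (a + b - 1) / b
  local needed=$(( (current * cpu_pct + target_pct - 1) / target_pct ))
  if (( needed < min )); then needed=$min; fi
  if (( needed > max )); then needed=$max; fi
  echo "$needed"
}

# 4 instances averaging 90% CPU against a 60% target, bounds 2..20
recommended_size 4 90 60 2 20
```

With these inputs the sketch prints 6: the same total load spread over six instances brings the average back to the 60% target, and six sits inside the 2 to 20 bounds.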

Load Balancer-Based Autoscaling

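# Scale on load balancer serving capacity (0.80 = 80% of the capacity
# defined on the backend service); the MIG must already be a backend
# of a load balancer's backend service
gcloud compute instance-groups managed set-autoscaling web-mig \
  --zone=europe-west2-a \
  --min-num-replicas=2 \
  --max-num-replicas=50 \
  --target-load-balancing-utilization=0.80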

Custom Metric-Based Autoscaling

You can scale based on any Cloud Monitoring metric, including custom metrics published by your application:

# Scale on a custom per-group metric, aiming for 10 queued items per instance
gcloud compute instance-groups managed set-autoscaling web-mig \
  --zone=europe-west2-a \
  --min-num-replicas=2 \
  --max-num-replicas=30 \
  --update-stackdriver-metric=custom.googleapis.com/queue_depth \
  --stackdriver-metric-single-instance-assignment=10

This configuration tells the autoscaler: "Each instance should handle 10 items from the queue. If the total queue depth is 50, run 5 instances."
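
The same arithmetic can be written out for a few queue depths. This is a minimal sketch of the sizing rule described above, assuming the result is rounded up and then clamped to the configured bounds; instances_for_queue is a hypothetical helper name, not part of gcloud.

```shell
#!/usr/bin/env bash
# Hypothetical helper: instances implied by a total queue depth and a
# single-instance assignment, clamped to the autoscaler's min and max.
instances_for_queue() {
  local queue_depth=$1 per_instance=$2 min=$3 max=$4
  # Round up so a partly filled "slot" still gets an instance
  local needed=$(( (queue_depth + per_instance - 1) / per_instance ))
  if (( needed < min )); then needed=$min; fi
  if (( needed > max )); then needed=$max; fi
  echo "$needed"
}

# Assignment of 10 per instance, bounds 2..30 as in the command above
for depth in 5 50 55 1000; do
  printf 'queue_depth=%s -> %s instances\n' \
    "$depth" "$(instances_for_queue "$depth" 10 2 30)"
done
```

A depth of 5 is held at the minimum of 2, a depth of 55 rounds up to 6, and a depth of 1000 is capped at the maximum of 30.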

Schedule-Based Autoscaling

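For predictable traffic patterns, you can configure scaling schedules. The example below starts at 08:00 every Monday to Friday, lasts 36,000 seconds (10 hours), and keeps at least 10 instances running for that window. Schedules use UTC unless you set a time zone with --schedule-time-zone: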

gcloud compute instance-groups managed update-autoscaling web-mig \
  --zone=europe-west2-a \
  --set-schedule=business-hours \
  --schedule-min-required-replicas=10 \
  --schedule-cron="0 8 * * 1-5" \
  --schedule-duration-sec=36000 \
  --schedule-description="Scale up during business hours"
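
Because the duration is given in seconds, it helps to check when a schedule window actually ends. The helper below is a small sketch for that check; schedule_end is a hypothetical name, and it only handles a fixed start hour and minute rather than parsing a full cron expression.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a schedule's start hour and minute and its
# --schedule-duration-sec value, print the end time as HH:MM
# (wrapping past midnight).
schedule_end() {
  # 10# forces base 10 so inputs such as "08" are not read as octal
  local start_hour=$(( 10#$1 )) start_min=$(( 10#$2 )) duration_sec=$3
  local start_total=$(( start_hour * 60 + start_min ))
  local end_total=$(( (start_total + duration_sec / 60) % 1440 ))
  printf '%02d:%02d\n' $(( end_total / 60 )) $(( end_total % 60 ))
}

# Cron "0 8 * * 1-5" starts at 08:00; 36000 seconds later is:
schedule_end 8 0 36000
```

For the business-hours schedule this prints 18:00, so the minimum of 10 instances applies from 08:00 to 18:00 on weekdays.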

Autohealing

Autohealing monitors the health of each instance in the MIG and automatically recreates unhealthy instances. It uses health checks to determine instance health.

Configuring Health Checks

# Create a health check
gcloud compute health-checks create http web-health-check \
  --port=80 \
  --request-path=/health \
  --check-interval=10s \
  --timeout=5s \
  --healthy-threshold=2 \
  --unhealthy-threshold=3

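# Apply the health check to the MIG for autohealing; --initial-delay gives
# new instances 300 seconds to start before failed checks trigger recreation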
gcloud compute instance-groups managed update web-mig \
  --zone=europe-west2-a \
  --health-check=web-health-check \
  --initial-delay=300
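
When tuning these values, a rough estimate of detection time is useful. The sketch below assumes one probe per check interval and that a state change needs the configured number of consecutive results; real timing also depends on probe timeouts and scheduling, so treat the numbers as approximations. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Rough estimate (an assumption): an instance is marked unhealthy after
# `threshold` consecutive failed probes, one probe per interval.
seconds_to_unhealthy() {
  local interval_sec=$1 unhealthy_threshold=$2
  echo $(( interval_sec * unhealthy_threshold ))
}

# Same estimate for recovering to healthy after consecutive successes.
seconds_to_healthy() {
  local interval_sec=$1 healthy_threshold=$2
  echo $(( interval_sec * healthy_threshold ))
}

# Values from the health check above: 10s interval, thresholds 3 and 2
echo "unhealthy after ~$(seconds_to_unhealthy 10 3)s of failures"
echo "healthy after ~$(seconds_to_healthy 10 2)s of successes"
```

With the settings above, a failing instance would be flagged after roughly 30 seconds, which is well inside the 300-second initial delay granted to newly created instances.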
