Cloud Run offers two distinct workload types: services for handling HTTP requests and events, and jobs for running containers to completion. Understanding when and how to use each is essential for building effective serverless architectures on Google Cloud.
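For contrast with the service commands used throughout this lesson, here is a minimal sketch of the job workflow. The job name `my-job` and the image path are placeholders assumed for illustration, not values from a real project.

```shell
# Create a job from a container image (names are hypothetical placeholders)
gcloud run jobs create my-job \
  --image europe-west2-docker.pkg.dev/my-project/repo/batch:v1 \
  --region europe-west2

# Run the job; the container starts, does its work, and exits
gcloud run jobs execute my-job \
  --region europe-west2
```

Unlike a service, a job exposes no URL and serves no requests: an execution is considered finished when its container exits.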
A Cloud Run service runs a container that listens on a network port and responds to incoming HTTP requests, with instances started and stopped automatically as traffic changes. Each service gets a stable HTTPS URL, automatic TLS certificate management, and built-in load balancing.
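As a rough illustration of the stable URL, the sketch below looks up a service's address and sends it a request. It assumes a service named `my-service` already exists in `europe-west2` (placeholder names) and that the service requires authentication, so an identity token is attached.

```shell
# Look up the service's HTTPS URL (service name and region are placeholders)
SERVICE_URL=$(gcloud run services describe my-service \
  --region europe-west2 \
  --format 'value(status.url)')

# Call the service, authenticating with the caller's identity token
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" "$SERVICE_URL"
```

If the service allows unauthenticated access, the `Authorization` header can be dropped.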
Every time you deploy a new container image or change a configuration parameter (environment variables, memory, CPU, concurrency), Cloud Run creates a new revision. Revisions are immutable snapshots of your service configuration.
| Concept | Description |
|---|---|
| Revision | An immutable snapshot of the service's container image and configuration |
| Traffic splitting | Route percentages of traffic to different revisions |
| Rollback | Route 100% of traffic back to a previous revision |
```shell
# Deploy a new revision
gcloud run deploy my-service \
  --image europe-west2-docker.pkg.dev/my-project/repo/app:v2 \
  --region europe-west2

# Split traffic: 90% to latest, 10% to previous revision
gcloud run services update-traffic my-service \
  --to-revisions my-service-00001=10,LATEST=90 \
  --region europe-west2

# Rollback to a specific revision
gcloud run services update-traffic my-service \
  --to-revisions my-service-00001=100 \
  --region europe-west2
```
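The revision name `my-service-00001` above is an example; real revision names are generated by Cloud Run and usually carry an extra suffix. A minimal sketch for finding the actual names before splitting traffic or rolling back, assuming the same service and region:

```shell
# List the revisions of the service to find the exact revision names
gcloud run revisions list \
  --service my-service \
  --region europe-west2
```

Copy the name of the revision you want from this output into `--to-revisions`.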
Each Cloud Run instance can handle multiple concurrent requests. The concurrency setting controls how many requests a single container instance processes simultaneously.
| Setting | Description | Default |
|---|---|---|
| concurrency | Max concurrent requests per instance | 80 |
| max-instances | Maximum number of container instances | 100 |
| min-instances | Minimum warm instances (0 = scale to zero) | 0 |
| timeout | Maximum time for a single request | 300 seconds |
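Each setting in the table maps to a flag on the CLI. The sketch below applies all four to an existing service; the specific values are illustrative assumptions, not recommendations.

```shell
# Apply scaling and request settings to an existing service
# (values are illustrative; tune them for your own workload)
gcloud run services update my-service \
  --concurrency 40 \
  --min-instances 1 \
  --max-instances 20 \
  --timeout 60 \
  --region europe-west2
```

Like any other configuration change, this creates a new revision of the service.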
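Setting concurrency too high can overwhelm a single instance (memory pressure, CPU contention, rising latency), while setting it too low forces Cloud Run to start more instances than the workload needs, which raises cost. Load-test your application to find the right balance.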
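A back-of-the-envelope estimate can show how concurrency affects instance count. The sketch below uses Little's law (in-flight requests = request rate × average latency) and divides by the concurrency setting; the function name and the workload figures are hypothetical, and real autoscaling also reacts to CPU utilisation, so treat the result as a starting point only.

```shell
#!/usr/bin/env bash
# Rough instance estimate from Little's law:
#   in-flight requests = requests per second * average latency (seconds)
#   instances          = ceil(in-flight requests / concurrency)
# Latency is given in milliseconds so the arithmetic stays in integers.
estimate_instances() {
  local rps=$1 latency_ms=$2 concurrency=$3
  local numerator=$(( rps * latency_ms ))       # in-flight requests, scaled by 1000
  local denominator=$(( concurrency * 1000 ))   # concurrency, scaled by 1000
  echo $(( (numerator + denominator - 1) / denominator ))  # integer ceiling
}

# Hypothetical workload: 400 requests/s at 500 ms average latency
estimate_instances 400 500 80   # default concurrency of 80 -> prints 3
estimate_instances 400 500 10   # concurrency of 10 -> prints 20
```

With 200 requests in flight on average, dropping concurrency from 80 to 10 raises the estimate from 3 instances to 20, which is why an overly low setting is costly.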
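Cloud Run offers two CPU allocation modes: CPU allocated only while a request is being processed (the default), and CPU always allocated for the lifetime of each instance. The second mode is enabled by turning off CPU throttling: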
```shell
# Always allocate CPU (disable CPU throttling) on an existing service
gcloud run services update my-service \
  --no-cpu-throttling \
  --region europe-west2
```