The Architecture Review Process is how you put the GCP Architecture Framework into practice. It is a structured, repeatable evaluation of your workload against the framework's pillars to identify strengths, weaknesses, and improvement opportunities. Regular architecture reviews transform the framework from a reference document into a living practice that continuously improves your cloud workloads.
Without regular reviews, architectures drift over time. New features are added without considering the broader impact, quick fixes become permanent, and the gap between the current architecture and best practices widens:
| Drift Type | Example |
|---|---|
| Security drift | IAM permissions granted for debugging are never revoked |
| Cost drift | VMs are upsized during a load spike and never right-sized back |
| Reliability drift | A single-zone database was "temporary" and has been running for two years |
| Performance drift | A caching layer was removed to fix a bug and never re-implemented |
| Operational drift | Monitoring dashboards no longer reflect the current architecture |
Architecture reviews provide a systematic checkpoint that catches drift before it causes problems. They create a feedback loop between the architecture framework and actual workloads.
Reviews come in several types, which differ in trigger, duration, and scope:

| Type | When | Duration | Scope |
|---|---|---|---|
| Pre-launch review | Before a new workload goes to production | 2-4 hours | Full pillar review |
| Post-incident review | After a significant incident | 1-2 hours | Focused on the affected pillar(s) |
| Periodic review | Quarterly or after major changes | 2-3 hours | Full pillar review |
| Migration review | Before and after a migration to GCP | 3-4 hours | Full pillar review with migration focus |
| Ad-hoc review | When concerns arise about a specific area | 1 hour | Single pillar deep dive |
Before the review, prepare with the following activities:

| Activity | Description |
|---|---|
| Define the workload boundary | What services, databases, networks, and dependencies are included? |
| Identify stakeholders | Who should attend? (architects, developers, SREs, security, finance) |
| Gather documentation | Architecture diagrams, runbooks, SLOs, cost reports, incident history |
| Select pillars | Review all pillars or focus on specific ones based on current priorities |
Work through each pillar using the framework's best practices as a checklist:
Operational excellence:

| Question | Evidence |
|---|---|
| Is all infrastructure defined as code? | Terraform/Pulumi repository |
| Is CI/CD automated with staged rollouts? | Cloud Build configuration |
| Are dashboards in place for the four golden signals? | Cloud Monitoring dashboards |
| Are alerts configured with runbooks? | Alerting policies with documentation |
| Is there a defined incident management process? | Incident response playbook |
| Are post-mortems conducted after incidents? | Post-mortem documents |
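
One way to make checklist answers comparable between reviews is to record each row as structured data. The sketch below is illustrative only: `Status`, `ChecklistItem`, `summarise`, and `open_findings` are hypothetical names, not part of the framework or any Google Cloud tool, and the three-level status scale is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Assumed three-level scale for a checklist answer."""
    MET = "met"
    PARTIAL = "partial"
    NOT_MET = "not met"


@dataclass
class ChecklistItem:
    pillar: str      # pillar the question belongs to, as a plain string
    question: str    # question text from the checklist
    evidence: str    # where the reviewer looked for the answer
    status: Status


def summarise(items):
    """Return {pillar: fraction of that pillar's items that are fully met}."""
    totals = {}
    met = {}
    for item in items:
        totals[item.pillar] = totals.get(item.pillar, 0) + 1
        if item.status is Status.MET:
            met[item.pillar] = met.get(item.pillar, 0) + 1
    return {pillar: met.get(pillar, 0) / totals[pillar] for pillar in totals}


def open_findings(items):
    """Return every item that is not fully met, in the original order."""
    return [item for item in items if item.status is not Status.MET]
```

For example, if one of two recorded items in a pillar is fully met, `summarise` reports 0.5 for that pillar and `open_findings` returns the remaining item for follow-up.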
Security:

| Question | Evidence |
|---|---|
| Is IAM following least privilege? | IAM policy audit, Recommender output |
| Are service account keys eliminated? | Workload Identity configuration |
| Is data encrypted with CMEK where required? | KMS key inventory |
| Are secrets managed in Secret Manager? | Secret Manager audit |
| Is network segmented with firewall rules? | VPC and firewall rule review |
| Are audit logs enabled and monitored? | Cloud Logging configuration |
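
As a sketch of what an IAM policy audit might automate, the function below scans a project policy in the JSON shape produced by `gcloud projects get-iam-policy PROJECT_ID --format=json` and lists members that hold basic roles (`roles/owner`, `roles/editor`, `roles/viewer`), which are usually broader than least privilege allows. The function name and the sample policy are hypothetical, and treating every basic-role binding as a finding is an assumption; a check like this complements Recommender output rather than replacing it.

```python
import json

# Basic roles grant project-wide access and are rarely least privilege.
BASIC_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}


def find_basic_role_bindings(policy):
    """Return sorted (role, member) pairs for every basic-role binding."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        if role in BASIC_ROLES:
            for member in binding.get("members", []):
                findings.append((role, member))
    return sorted(findings)


# Hypothetical policy document, shaped like the gcloud JSON output.
SAMPLE_POLICY = json.loads("""
{
  "bindings": [
    {"role": "roles/editor",
     "members": ["user:alice@example.com"]},
    {"role": "roles/storage.objectViewer",
     "members": ["serviceAccount:app@example-project.iam.gserviceaccount.com"]}
  ]
}
""")

for role, member in find_basic_role_bindings(SAMPLE_POLICY):
    print(f"{member} holds {role}")
```

Each reported pair is a candidate for replacement with a narrower predefined or custom role.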
Reliability:

| Question | Evidence |
|---|---|
| Are SLOs defined and measured? | Service Monitoring configuration |
| Is the workload deployed across multiple zones? | Resource distribution review |
| Are health checks and self-healing configured? | Health check and MIG settings |
| Is DR tested regularly? | DR test results |
| Are backups automated and tested? | Backup configuration and restore logs |
| Are circuit breakers and retry logic implemented? | Application code review |
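
When reviewing application code for retry logic, a common pattern to look for is retrying transient failures with capped exponential backoff and jitter. The following is a minimal sketch, not a reference to any specific client library: `TransientError` and `call_with_retry` are hypothetical names, and the default attempt count and delays are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5,
                    max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying TransientError with backoff and jitter.

    The delay before retry k is drawn uniformly from
    [0, min(max_delay, base_delay * 2**(k-1))] ("full jitter").
    Any other exception, or the final TransientError, propagates.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the failure
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

The `sleep` parameter is injectable so the behaviour can be tested without real waiting. Bounding both the attempt count and the delay keeps a failing dependency from holding callers indefinitely.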
Performance:

| Question | Evidence |
|---|---|
| Is the compute platform appropriate? | Workload analysis |
| Is autoscaling configured correctly? | Autoscaling configuration |
| Are caching layers in place? | Memorystore, Cloud CDN configuration |
| Are database queries optimised? | Query performance logs |
| Is load testing conducted regularly? | Load test reports |
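
Load test reports are easier to compare across reviews when they are reduced to a percentile checked against a target. The sketch below assumes the report provides raw request latencies in milliseconds and that the team has agreed a 95th-percentile target; `p95` and `meets_latency_target` are hypothetical helper names.

```python
import statistics


def p95(latencies_ms):
    """Return the 95th-percentile latency of the samples.

    statistics.quantiles(n=100) returns 99 cut points;
    index 94 is the 95th percentile. Needs at least two samples.
    """
    return statistics.quantiles(latencies_ms, n=100)[94]


def meets_latency_target(latencies_ms, target_ms):
    """True when the 95th-percentile latency is within the target."""
    return p95(latencies_ms) <= target_ms
```

Using a percentile rather than the mean keeps a small number of very slow requests visible in the result.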