Monitoring and DevOps are deeply interconnected. The best CI/CD pipelines include monitoring at every stage, and the best monitoring strategies are informed by the deployment process. This final lesson brings together everything covered in the course and outlines best practices for building observable, automated Azure workloads.
Effective DevOps teams treat monitoring as a core part of the delivery pipeline, not an afterthought:
Plan → Code → Build → Test → Deploy → Monitor → Feedback → Plan
Each stage generates telemetry that feeds back into the next iteration:
| Stage | Monitoring Activity |
|---|---|
| Plan | Review SLO dashboards, error budgets, and incident retrospectives |
| Code | Static analysis, security scanning, dependency vulnerability checks |
| Build | Build duration metrics, test coverage trends, artifact size tracking |
| Test | Test pass/fail rates, performance regression detection |
| Deploy | Deployment frequency, lead time, change failure rate |
| Monitor | Application performance, infrastructure health, user experience |
| Feedback | Incident analysis, user feedback, cost reports |
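The error budgets reviewed in the Plan stage come down to simple arithmetic on an SLO target. The sketch below assumes a request-based availability SLO; the function name `error_budget_remaining` and the sample figures are illustrative, not part of the course material.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        return 1.0  # no traffic yet, so none of the budget has been spent
    # The budget is the number of failures the SLO tolerates for this much traffic.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


# Hypothetical month: 99.9% SLO, 2,000,000 requests, 500 failed requests.
remaining = error_budget_remaining(0.999, 2_000_000, 500)
print(f"Error budget remaining: {remaining:.0%}")
```

A team might slow feature releases when the remaining fraction approaches zero and prioritise reliability work instead.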
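The DevOps Research and Assessment (DORA) team identified four key metrics that predict software delivery performance. The exact thresholds for the elite tier vary between editions of the State of DevOps report, so treat the targets below as indicative: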
| Metric | Description | Elite Target |
|---|---|---|
| Deployment Frequency | How often code is deployed to production | On demand (multiple deploys per day) |
| Lead Time for Changes | Time from code commit to production deployment | Less than 1 hour |
| Change Failure Rate | Percentage of deployments causing a failure | Less than 5% |
| Time to Restore Service | Time to recover from a failure in production | Less than 1 hour |
Each metric reduces to a simple calculation:

    Deployment Frequency    = Count of production deployments / Time period
    Lead Time for Changes   = Average(deployment timestamp - commit timestamp)
    Change Failure Rate (%) = Failed deployments / Total deployments × 100
    Time to Restore Service = Average(recovery timestamp - incident start timestamp)
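The four calculations above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Deployment` record and its field names are hypothetical, and it assumes an incident starts at the moment the failed deployment went live.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    commit_time: datetime
    deploy_time: datetime
    failed: bool = False
    restored_time: Optional[datetime] = None  # set only for failed deployments


def dora_metrics(deployments: list, period_days: int) -> dict:
    """Compute the four DORA metrics over one reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    # Assumption: the incident starts when the failed deployment goes live.
    restore_hours = [
        (d.restored_time - d.deploy_time) / timedelta(hours=1)
        for d in failures
        if d.restored_time is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(
            (d.deploy_time - d.commit_time) / timedelta(hours=1) for d in deployments
        ),
        "change_failure_rate_pct": 100 * len(failures) / len(deployments),
        "time_to_restore_hours": mean(restore_hours) if restore_hours else None,
    }


# Made-up sample data: four deployments over two days, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=6), failed=True,
               restored_time=t0 + timedelta(hours=7)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
metrics = dora_metrics(sample, period_days=2)
print(metrics)
```

In practice the records would come from your pipeline and incident tooling rather than hard-coded values, and many teams report the median lead time instead of the mean to reduce the effect of outliers.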
Track these metrics by querying Azure DevOps or GitHub APIs and visualising them in Azure Dashboards or Grafana.
Before deploying to production, validate:
| Strategy | Description | Monitoring Need |
|---|---|---|
| Blue-green | Two identical environments; switch traffic instantly | Compare metrics between blue and green |
| Canary | Route a small percentage of traffic to the new version | Compare canary metrics against baseline |
| Rolling | Gradually replace instances with the new version | Monitor each instance as it updates |
| Feature flags | Deploy code but control feature activation separately | Monitor feature flag impact on metrics |
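For the canary strategy, "compare canary metrics against baseline" can be as simple as checking that the canary's error rate is not meaningfully worse than the baseline's. The sketch below is one possible rule under stated assumptions: the function name, the one-percentage-point tolerance, and the minimum of 100 requests are all illustrative choices, and a production rollout would more likely use a statistical test.

```python
def canary_is_healthy(
    baseline_errors: int,
    baseline_requests: int,
    canary_errors: int,
    canary_requests: int,
    max_increase_pct_points: float = 1.0,
    min_canary_requests: int = 100,
) -> bool:
    """Return True if the canary error rate is within tolerance of the baseline."""
    if canary_requests < min_canary_requests:
        return False  # too little traffic to judge; do not promote yet
    baseline_rate = 100 * baseline_errors / max(baseline_requests, 1)
    canary_rate = 100 * canary_errors / canary_requests
    # Compare in percentage points so the tolerance is easy to reason about.
    return canary_rate - baseline_rate <= max_increase_pct_points


# Baseline at 0.5% errors; the first canary is at 0.6%, the second at 4%.
print(canary_is_healthy(50, 10_000, 3, 500))
print(canary_is_healthy(50, 10_000, 20, 500))
```

Returning `False` when traffic is too low keeps the rule conservative: the canary is promoted only once there is enough data to compare.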
After every deployment:
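Configure automated rollback based on monitoring signals:

    # Example: Azure DevOps pipeline with health check gate
    - stage: DeployProduction
      jobs:
      - deployment: Deploy
        environment: production
        strategy:
          runOnce:
            deploy:
              steps:
              - task: AzureWebApp@1
                inputs:
                  azureSubscription: '<service-connection>'  # placeholder: your service connection name
                  appName: 'webapp-production'
            postRouteTraffic:
              steps:
              - task: AzureCLI@2
                inputs:
                  azureSubscription: '<service-connection>'
                  scriptType: bash
                  scriptLocation: inlineScript
                  inlineScript: |
                    # Check Application Insights for error rate (requires the application-insights CLI extension).
                    # Queries sent to the Application Insights resource use the requests/timestamp/success schema.
                    ERROR_RATE=$(az monitor app-insights query \
                      --app '<app-id>' \
                      --analytics-query "requests | where timestamp > ago(10m) | summarize ErrorRate = countif(success == false) * 100.0 / count()" \
                      --query 'tables[0].rows[0][0]' -o tsv)
                    # A non-zero exit fails the stage; the rollback steps themselves belong in an on: failure hook.
                    if (( $(echo "$ERROR_RATE > 5" | bc -l) )); then
                      echo "Error rate too high: $ERROR_RATE%. Failing the deployment."
                      exit 1
                    fi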
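The `ERROR_RATE` check in the pipeline above assumes the query always returns a number. With no traffic in the window the ratio is undefined, and a failed query returns nothing at all. The sketch below shows one way to make the decision explicit. The function name `gate_passes` is hypothetical, and treating a missing signal as a failure is an assumed policy that you may want to relax for low-traffic services.

```python
import math


def gate_passes(raw_error_rate: str, threshold_pct: float = 5.0) -> bool:
    """Decide whether a deployment gate passes, given the raw query output."""
    try:
        rate = float(raw_error_rate.strip())
    except ValueError:
        return False  # empty or non-numeric output: treat a missing signal as a failure
    if math.isnan(rate):
        return False  # no requests in the window, so the ratio is undefined
    return rate <= threshold_pct


for raw in ["2.5", "7.1", "", "NaN"]:
    print(repr(raw), "->", "pass" if gate_passes(raw) else "fail")
```

Failing closed means a broken query blocks the release instead of silently approving it, at the cost of an occasional false alarm.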
Monitoring configuration should be version-controlled and deployed alongside your infrastructure:

    // main.bicep: deploy app + monitoring together
    // appServicePlan, logAnalyticsWorkspace, and actionGroup are assumed to be declared elsewhere in this file
    param appName string
    param location string = resourceGroup().location

    resource appService 'Microsoft.Web/sites@2023-01-01' = {
      name: appName
      location: location
      properties: {
        serverFarmId: appServicePlan.id
      }
    }

    resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
      name: '${appName}-insights'
      location: location
      kind: 'web'
      properties: {
        Application_Type: 'web'
        WorkspaceResourceId: logAnalyticsWorkspace.id
      }
    }

    // Send App Service logs and platform metrics to the Log Analytics workspace
    resource diagnosticSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
      name: '${appName}-diagnostics'
      scope: appService
      properties: {
        workspaceId: logAnalyticsWorkspace.id
        logs: [
          { category: 'AppServiceHTTPLogs', enabled: true }
          { category: 'AppServiceAppLogs', enabled: true }
        ]
        metrics: [
          { category: 'AllMetrics', enabled: true }
        ]
      }
    }

    resource cpuAlert 'Microsoft.Insights/metricAlerts@2018-03-01' = {
      name: '${appName}-high-cpu'
      location: 'global'
      properties: {
        enabled: true
        severity: 2
        // CpuPercentage is a metric of the App Service plan, not of the web app itself
        scopes: [appServicePlan.id]
        evaluationFrequency: 'PT1M'
        windowSize: 'PT5M'
        criteria: {
          'odata.type': 'Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria'
          allOf: [
            {
              criterionType: 'StaticThresholdCriterion'
              name: 'HighCPU'
              metricName: 'CpuPercentage'
              operator: 'GreaterThan'
              threshold: 85
              timeAggregation: 'Average'
            }
          ]
        }
        actions: [{ actionGroupId: actionGroup.id }]
      }
    }