You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
Real-world automation must handle failures gracefully. Ansible provides several mechanisms for error handling, custom failure conditions, and debugging tools to help you troubleshoot issues.
By default, when a task fails on a host:
any_errors_fatal: true is set)Continue execution even when a task fails:
tasks:
- name: Check if legacy app exists
command: which legacy-app
register: legacy_check
ignore_errors: true
- name: Report legacy app status
debug:
msg: "Legacy app {{ 'found' if legacy_check.rc == 0 else 'not found' }}"
Warning: Use
ignore_errorssparingly --- it suppresses all errors, which can mask real problems. Preferfailed_whenfor fine-grained control.
Continue when a host is unreachable (network issues, SSH failures):
tasks:
- name: Attempt to gather info
command: hostname
ignore_unreachable: true
- name: This still runs even if host was unreachable
debug:
msg: "Continuing..."
Define custom failure conditions:
tasks:
- name: Run database migration
command: /opt/app/migrate.sh
register: migration_result
failed_when:
- migration_result.rc != 0
- "'already up to date' not in migration_result.stdout"
- name: Check disk space
command: df -h /data
register: disk_check
failed_when: "'100%' in disk_check.stdout"
This lets you treat a non-zero return code as a success when the output contains an expected message.
Control when Ansible reports a task as "changed":
tasks:
- name: Check application version
command: /opt/app/version.sh
register: version_check
changed_when: false # This task never changes anything
- name: Run database migration
command: /opt/app/migrate.sh
register: migration
changed_when: "'migrated' in migration.stdout"
This is important for accurate reporting and for triggering (or not triggering) handlers.
Ansible's equivalent of try/catch/finally:
tasks:
- name: Handle deployment with error recovery
block:
- name: Deploy new version
command: /opt/app/deploy.sh
- name: Run smoke tests
command: /opt/app/smoke-test.sh
rescue:
- name: Rollback to previous version
command: /opt/app/rollback.sh
- name: Send failure notification
mail:
to: ops@example.com
subject: "Deployment failed on {{ inventory_hostname }}"
body: "Automatic rollback was executed."
always:
- name: Clean up temporary files
file:
path: /tmp/deploy-artifacts
state: absent
- name: Log deployment attempt
lineinfile:
path: /var/log/deployments.log
line: "{{ ansible_date_time.iso8601 }} - Deployment attempted on {{ inventory_hostname }}"
create: true
| Section | When It Runs |
|---|---|
| block | Always runs first (the main tasks) |
| rescue | Only runs if a task in block fails |
| always | Always runs, regardless of success or failure |
Stop the entire play immediately when any host fails:
---
- name: Critical deployment
hosts: webservers
any_errors_fatal: true
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.