Pre-Flight Checks - Failing Fast Before Making Changes
title: "Pre-Flight Checks: Failing Fast Before Making Changes"
description: Use reachability, auth, read-only, and environmental checks to stop unsafe runs early and protect production.
tags:
- Production Principles
- Pre-Flight
- Validation
- Safety
- Network Automation
Why Pre-Flight Exists
Automation fails expensively when it discovers basic problems too late. Pre-flight checks move failure to the earliest possible point.
Goal: spend seconds validating, not hours recovering.
Baseline Pre-Flight Control Set
Before writing anything, validate:
- Reachability to management endpoint
- Authentication and authorisation scope
- Device in expected mode and software train
- Read-only command sanity checks
- Change window and maintenance-state flags
- Platform-specific constraints (CPU, memory, disk, control-plane health)
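As a rough illustration of the last two items, the sketch below evaluates a change window and a few resource thresholds against values that read-only commands have already collected. The metric names (`cpu_percent`, `memory_free_mb`, `disk_free_mb`), the threshold values, and the `CheckResult` shape are assumptions made for this example, not values taken from any particular platform.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Mapping

# Hypothetical thresholds for this sketch; tune them per platform.
MAX_CPU_PERCENT = 80.0
MIN_MEMORY_FREE_MB = 512
MIN_DISK_FREE_MB = 1024


@dataclass(frozen=True)
class CheckResult:
    ok: bool
    reason: str = ""


def check_change_window(now: datetime, start: time, end: time) -> CheckResult:
    """Pass only if `now` falls inside the window; windows may cross midnight."""
    current = now.time()
    if start <= end:
        inside = start <= current < end
    else:
        # Window such as 22:00-02:00 wraps past midnight.
        inside = current >= start or current < end
    if inside:
        return CheckResult(ok=True)
    return CheckResult(ok=False, reason=f"outside change window {start}-{end}")


def check_platform_state(metrics: Mapping[str, float]) -> CheckResult:
    """Compare collected metrics with thresholds; a missing metric fails closed."""
    problems = []
    cpu = metrics.get("cpu_percent")
    if cpu is None or cpu > MAX_CPU_PERCENT:
        problems.append(f"cpu_percent={cpu} (max {MAX_CPU_PERCENT})")
    memory = metrics.get("memory_free_mb")
    if memory is None or memory < MIN_MEMORY_FREE_MB:
        problems.append(f"memory_free_mb={memory} (min {MIN_MEMORY_FREE_MB})")
    disk = metrics.get("disk_free_mb")
    if disk is None or disk < MIN_DISK_FREE_MB:
        problems.append(f"disk_free_mb={disk} (min {MIN_DISK_FREE_MB})")
    if problems:
        return CheckResult(ok=False, reason="; ".join(problems))
    return CheckResult(ok=True)
```

Treating a missing metric as a failure is a deliberate choice in this sketch: if the read-only collection did not return a value, the device state is unknown and a write should not proceed.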
Layered Validation Model
Use progressive gates:
- Connectivity gate: ping or TCP reachability
- Session gate: login and privilege verification
- State gate: read-only health and policy checks
- Intent gate: confirm prerequisites for this specific change
Run the gates in this order and stop at the first failure. Never evaluate the intent gate on a device that failed an earlier gate.
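One way to sketch the gate ordering is a runner that evaluates gates in sequence and stops at the first failure, paired with a TCP probe for the connectivity gate. The names `GateResult`, `tcp_reachable`, and `run_gates` are hypothetical; the only library call used is the standard `socket.create_connection`.

```python
import socket
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class GateResult:
    gate: str
    ok: bool
    reason: str = ""


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> GateResult:
    """Connectivity gate: pass if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return GateResult("connectivity", True)
    except OSError as exc:  # covers refused connections, timeouts, DNS failures
        return GateResult("connectivity", False, f"{host}:{port} unreachable: {exc}")


def run_gates(gates: List[Callable[[], GateResult]]) -> List[GateResult]:
    """Run gates in order; later gates are not evaluated after a failure."""
    results = []
    for gate in gates:
        result = gate()
        results.append(result)
        if not result.ok:
            break
    return results
```

A caller would pass the gates as zero-argument callables, for example `run_gates([lambda: tcp_reachable("192.0.2.10", 22), session_gate, state_gate, intent_gate])`, where the last three are placeholders for checks defined elsewhere.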
Pattern: Fast Abort With Clear Reason
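# Each check is defined elsewhere, takes a device, and returns a result
# exposing .ok and .reason. Order matters: fundamental gates run first.
checks = [
    check_reachability,
    check_authz,
    check_platform_state,
    check_change_window,
    check_change_prerequisites,
]

for check in checks:
    result = check(device)
    if not result.ok:
        # Record the failure before aborting so the reason is not lost.
        record_preflight_failure(device=device, check=check.__name__, reason=result.reason)
        raise RuntimeError(f"Pre-flight failed: {check.__name__}: {result.reason}")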
Critical implementation details:
- Return machine-readable reason codes, not only free-text strings
- Aggregate failures for reporting in report-only mode
- Enforce immediate abort in enforced (write) mode
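For the first point, a result object can carry an enumerated reason code alongside the human-readable detail. The sketch below is one possible shape; the `ReasonCode` members and the `CodedResult` name are illustrative assumptions, and a real deployment would define its own code list.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class ReasonCode(str, Enum):
    # Hypothetical codes for illustration only.
    OK = "OK"
    UNREACHABLE = "UNREACHABLE"
    AUTH_FAILED = "AUTH_FAILED"
    PRIVILEGE_MISMATCH = "PRIVILEGE_MISMATCH"
    OUTSIDE_CHANGE_WINDOW = "OUTSIDE_CHANGE_WINDOW"


@dataclass(frozen=True)
class CodedResult:
    check: str
    code: ReasonCode
    detail: str = ""  # free text for humans; tooling should key on `code`

    @property
    def ok(self) -> bool:
        return self.code is ReasonCode.OK

    def to_json(self) -> str:
        """Serialise to a stable JSON record for logs and dashboards."""
        record = asdict(self)
        record["code"] = self.code.value
        record["ok"] = self.ok
        return json.dumps(record, sort_keys=True)
```

Because downstream tooling matches on `code` rather than parsing `detail`, the wording of the free-text message can change without breaking alerting or reports.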
Pre-Flight Operating Modes
Useful modes:
- Report-only: collect failures without write attempts
- Enforced: block writes on any critical pre-flight failure
- Scoped override: allow specific exception IDs only
Separating the modes makes testing safer, because new checks can be trialled in report-only runs, while preserving strict production behaviour.
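The three modes could be sketched as a single runner. In this example a check returns `None` on success or a `Failure` otherwise, and a scoped override is modelled as a mapping from an approved exception ID to the one reason code it waives. All of these names and the override format are assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Mapping, Optional, Tuple


class Mode(Enum):
    REPORT_ONLY = "report-only"
    ENFORCED = "enforced"


@dataclass(frozen=True)
class Failure:
    check: str
    reason_code: str
    critical: bool = True


@dataclass
class PreflightReport:
    failures: List[Failure] = field(default_factory=list)
    waived: List[Tuple[str, Failure]] = field(default_factory=list)
    writes_allowed: bool = False


class PreflightAbort(Exception):
    def __init__(self, failure: Failure):
        super().__init__(f"Pre-flight failed: {failure.check}: {failure.reason_code}")
        self.failure = failure


def run_preflight(
    device: str,
    checks: List[Callable[[str], Optional[Failure]]],
    mode: Mode,
    overrides: Optional[Mapping[str, str]] = None,
) -> PreflightReport:
    """Report-only collects every failure; enforced aborts on the first critical one."""
    # Invert {exception_id: reason_code} so waivers can be looked up by code.
    waivers = {code: exc_id for exc_id, code in (overrides or {}).items()}
    report = PreflightReport()
    for check in checks:
        failure = check(device)
        if failure is None:
            continue
        if failure.reason_code in waivers:
            # Keep the exception ID with the waived failure for auditing.
            report.waived.append((waivers[failure.reason_code], failure))
            continue
        report.failures.append(failure)
        if mode is Mode.ENFORCED and failure.critical:
            raise PreflightAbort(failure)
    # Report-only runs never permit writes, whatever the outcome.
    report.writes_allowed = mode is Mode.ENFORCED
    return report
```

Recording the exception ID next to each waived failure keeps the override auditable: a later review can see which approval allowed which failure to pass.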
Production Checklist
- Pre-flight runs for every target in every execution
- Critical checks are consistent across sites
- Failures are categorised by severity and reason code
- Writes are blocked when critical checks fail
- Operators receive actionable failure summaries
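To make the last two checklist items concrete, the following sketch groups failures by severity and reason code and renders a short operator summary. The `FailureRecord` fields, the severity labels, and the output format are assumptions for this example.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class FailureRecord(NamedTuple):
    device: str
    check: str
    severity: str      # "critical" or "warning" in this sketch
    reason_code: str


def summarise(failures: Iterable[FailureRecord]) -> str:
    """Group failures by (severity, reason code), listing critical groups first."""
    grouped = defaultdict(list)
    for failure in failures:
        grouped[(failure.severity, failure.reason_code)].append(failure.device)

    lines = []
    ordered = sorted(grouped, key=lambda k: (k[0] != "critical", k[0], k[1]))
    for severity, code in ordered:
        devices = sorted(grouped[(severity, code)])
        lines.append(
            f"{severity.upper()} {code}: {len(devices)} device(s): {', '.join(devices)}"
        )
    return "\n".join(lines)
```

Grouping by reason code means an operator sees one line per distinct problem rather than one line per device, which tends to point at a shared cause such as an expired credential.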
Anti-Patterns
- Pre-flight only in CI but not at runtime
- Warnings with no enforcement path
- One giant opaque check instead of layered checks
- Continuing on auth or privilege mismatch