
Re-engineer Workflows

Stage 2 of the PRIME Framework

"Don't automate broken processes. The Re-engineer stage ensures we're building the right automation, not just making bad workflows faster."

Stage Outcome

Deliverable: Technical architecture documents, workflow diagrams, and design decision records with safety mechanisms planned.

Typical Result: Workflows redesigned to eliminate 30-50% of manual steps, built for parallel execution and scalability before coding begins.

```mermaid
graph TD
A[Current Workflow] -->|Analyse| B[Identify Issues]
B -->|Redesign| C[Prime Workflows]
C -->|Architect| D[Technical Design]
D -->|Validate| E[Safety Mechanisms]

style A fill:#999
style B fill:#7B68EE
style C fill:#8A7AEE
style D fill:#998CEE
style E fill:#A89EEE
```

Prime Terminology Used: Prime Workflows design, Prime Agents architecture planning


🎯 Objective

Design optimised, scalable workflows and architecture before writing code. This stage prevents the costly mistake of automating inefficient processes.


🚫 The Automation Trap

The most common (and expensive) mistake in automation:

    ```text
    Current Manual Workflow (inefficient)
            ↓ automate directly
    Automated Workflow (still inefficient, now faster!)
    ```

Example:
Manually adding VLANs means logging into 5 switches individually, copying configs, pasting them with modifications, and saving.

Bad Automation: Script that mimics these exact steps
Good Re-engineering: Template-based bulk provisioning with validation

Why Re-engineering Comes Before Implementation

If you skip this stage and jump straight to coding, you'll automate your current inefficiencies. You'll get a faster version of a bad process. Then you're locked into that design.

This stage is where you solve the problem permanently, by redesigning the workflow before automating it. It costs more upfront, but it avoids much larger rework costs later.


✅ What Happens During Re-engineer

1. Process Analysis

For each prioritised automation from the Pinpoint stage, we map the current workflow:

Current State Mapping

Example: VLAN Provisioning (Current Process)

    ```text
    1. Receive change ticket
    2. Identify target switches from site documentation
    3. SSH to each switch individually
    4. Copy running-config for backup (manual paste to notepad)
    5. Enter config mode
    6. Add VLAN commands (typing by hand)
    7. Save config
    8. Repeat steps 3-7 for remaining switches
    9. Update change ticket
    ```

Identified Issues:

  • ❌ No validation before applying config
  • ❌ No rollback mechanism if VLAN ID conflicts
  • ❌ Manual typing introduces errors
  • ❌ No verification VLAN was actually created
  • ❌ Sequential execution (slow for many switches)
  • ❌ No audit trail beyond ticket notes

2. Workflow Redesign

We design an optimised process that addresses identified issues:

Future State Design

Example: VLAN Provisioning (Re-engineered)

    ```text
    1. Receive change ticket (parsed for VLAN details)
    2. Validate VLAN ID doesn't conflict
    3. Generate config from template (Jinja2)
    4. Identify target switches from inventory (CSV/Netbox)
    5. Pre-flight checks:
       - Verify device reachability
       - Check VLAN ID availability
       - Validate trunk port capacity
    6. Apply config to all switches (parallel execution)
    7. Post-flight validation:
       - Verify VLAN in show vlan
       - Check STP state
    8. Generate completion report (with before/after snapshots)
    9. Auto-update ticket with results
    ```

Improvements:

  • ✅ Template-based (zero typing errors)
  • ✅ Pre-flight validation (catch conflicts before change)
  • ✅ Parallel execution (10x faster)
  • ✅ Post-flight verification (proves success)
  • ✅ Automatic rollback on failure
  • ✅ Comprehensive audit trail
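
As a rough illustration of the "validate, then generate from a template" steps, the sketch below uses Python's built-in `string.Template` as a stand-in for Jinja2 so that it runs without extra packages. The template text and the function names `validate_vlan` and `render_vlan_config` are assumptions made for this example, not part of the PRIME deliverables.

```python
from string import Template

# Hypothetical template text; a real project would likely keep this in a Jinja2 file.
VLAN_TEMPLATE = Template("vlan $vlan_id\n name $name\n")


def validate_vlan(vlan_id: int, existing_ids: set[int]) -> None:
    """Raise ValueError if the VLAN ID is out of range or already in use."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID {vlan_id} is outside the valid range 1-4094")
    if vlan_id in existing_ids:
        raise ValueError(f"VLAN ID {vlan_id} already exists")


def render_vlan_config(vlan_id: int, name: str, existing_ids: set[int]) -> str:
    """Validate first, then render, so a bad request never produces config."""
    validate_vlan(vlan_id, existing_ids)
    return VLAN_TEMPLATE.substitute(vlan_id=vlan_id, name=name)
```

Calling `render_vlan_config(120, "GUEST_WIFI", {10, 20, 30})` returns the two-line config snippet, while requesting an ID that is already in `existing_ids` raises before anything is rendered.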

3. Safety Mechanism Design

Production networks require bulletproof safety:

Pre-Flight Checks

Before making any changes, automation should verify:

Connectivity:

  • Device reachable via ICMP
  • SSH port accessible
  • Authentication successful
  • Sufficient privilege level

State Validation:

  • Device not in maintenance mode
  • No active config sessions (prevent collision)
  • Sufficient CPU/memory headroom
  • Required feature sets enabled

Change Validation:

  • Configuration doesn't conflict with existing state
  • Required parameters present and valid
  • Change scope matches authorization
  • Dry-run simulation successful
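
One possible way to structure the checks listed above is as small functions that each return a pass/fail result, run by a single collector. This is only a sketch: the names `Check`, `PreflightReport`, `ssh_port_open`, and `run_preflight` are hypothetical, and only the SSH port check is implemented here.

```python
import socket
from dataclasses import dataclass
from typing import Callable

# A check takes a hostname and returns (passed, detail).
Check = Callable[[str], tuple[bool, str]]


@dataclass
class PreflightReport:
    host: str
    failures: list[str]

    @property
    def ok(self) -> bool:
        return not self.failures


def ssh_port_open(host: str, port: int = 22, timeout: float = 3.0) -> tuple[bool, str]:
    """Connectivity check: can we open a TCP connection to the SSH port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "SSH port reachable"
    except OSError as exc:
        return False, f"SSH port unreachable: {exc}"


def run_preflight(host: str, checks: dict[str, Check]) -> PreflightReport:
    """Run every check and collect all failures rather than stopping at the first."""
    failures = []
    for name, check in checks.items():
        passed, detail = check(host)
        if not passed:
            failures.append(f"{name}: {detail}")
    return PreflightReport(host=host, failures=failures)
```

Collecting every failure, instead of stopping at the first, gives the operator one complete report per device before any change is attempted.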

Rollback Capability

Every automation should include:

  • Checkpoint save before changes
  • Atomic operations (all-or-nothing for multi-device)
  • Automatic rollback on failure
  • Manual rollback procedure documented
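
The checkpoint and all-or-nothing ideas can be sketched against an in-memory stand-in for a switch. `FakeSwitch` and `apply_all_or_nothing` are hypothetical names for illustration; a real implementation would take its checkpoint and restore through the device's own mechanisms.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSwitch:
    """In-memory stand-in for a device; a real one would wrap an SSH session."""
    name: str
    vlans: set[int] = field(default_factory=set)
    fail_on_apply: bool = False

    def checkpoint(self) -> set[int]:
        return set(self.vlans)  # copy of the current state

    def restore(self, snapshot: set[int]) -> None:
        self.vlans = set(snapshot)

    def add_vlan(self, vlan_id: int) -> None:
        if self.fail_on_apply:
            raise RuntimeError(f"{self.name}: config rejected")
        self.vlans.add(vlan_id)


def apply_all_or_nothing(switches: list[FakeSwitch], vlan_id: int) -> bool:
    """Apply the VLAN everywhere, or restore every touched switch on failure."""
    checkpoints: list[tuple[FakeSwitch, set[int]]] = []
    try:
        for switch in switches:
            checkpoints.append((switch, switch.checkpoint()))  # save before change
            switch.add_vlan(vlan_id)
    except RuntimeError:
        for switch, snapshot in reversed(checkpoints):
            switch.restore(snapshot)  # automatic rollback
        return False
    return True
```

If the second switch rejects the change, the first is restored to its checkpoint, so the group never ends up half-configured.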

4. Architecture Planning

For each automation, we design the technical architecture:

Data Flow

    ```text
    [User Input] → [Validation Layer] → [Inventory Source]
                    ↓
            [Template Engine]
                    ↓
    [Pre-Flight Checks] → [Device Connection Pool]
                    ↓
            [Parallel Execution]
                    ↓
    [Post-Flight Validation] → [Reporting Engine]
                    ↓
            [Audit Log] + [Ticket Update]
    ```

Component Selection

| Requirement | Technology Choice | Rationale |
|---|---|---|
| Device connection | Netmiko | Broad platform support, reliable |
| Templating | Jinja2 | Industry standard, powerful |
| Inventory | CSV → Netbox (future) | Start simple, path to scale |
| Parallel execution | Threading | Good enough for <100 devices |
| Validation | TextFSM | Structured data from show commands |
| Logging | Python logging module | Structured, rotatable logs |
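
To show what "structured data from show commands" can look like in practice, the sketch below parses VLAN rows with a plain regular expression as a stand-in for a TextFSM template. The sample text only imitates the layout of Cisco IOS `show vlan brief` and was not captured from a real device; `parse_vlans` and `vlan_is_active` are hypothetical helper names.

```python
import re

# Illustrative sample in the style of `show vlan brief`; not real device output.
SAMPLE_OUTPUT = """\
VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
1    default                          active    Gi0/1, Gi0/2
120  GUEST_WIFI                       active    Gi0/3
1002 fddi-default                     act/unsup
"""

# One row: numeric VLAN ID, then name, then status, separated by whitespace.
VLAN_ROW = re.compile(r"^(?P<vlan_id>\d+)\s+(?P<name>\S+)\s+(?P<status>\S+)", re.MULTILINE)


def parse_vlans(output: str) -> dict[int, dict[str, str]]:
    """Turn show-command text into a dict keyed by VLAN ID."""
    return {
        int(m["vlan_id"]): {"name": m["name"], "status": m["status"]}
        for m in VLAN_ROW.finditer(output)
    }


def vlan_is_active(output: str, vlan_id: int) -> bool:
    """Post-flight check: the VLAN exists and reports an active status."""
    vlan = parse_vlans(output).get(vlan_id)
    return vlan is not None and vlan["status"] == "active"
```

Once the output is a dictionary, a post-flight check becomes a simple lookup rather than a string search.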

Scalability Planning

Current Scale: 50 devices
12-Month Scale: 150 devices
24-Month Scale: 300+ devices

Design Decisions:

  • Threading sufficient now, document async migration path
  • CSV inventory works now, plan Netbox integration at 100+ devices
  • Local execution acceptable now, consider container deployment at scale
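
A minimal sketch of the threading decision, using the standard library's `ThreadPoolExecutor`. The function `run_in_parallel` and its `task` callback are hypothetical; in practice the callback would wrap whatever pushes config to a single device.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_in_parallel(
    hosts: list[str],
    task: Callable[[str], str],
    max_workers: int = 10,
) -> tuple[dict[str, str], dict[str, str]]:
    """Run task(host) for every host; return (results, errors) keyed by host."""
    results: dict[str, str] = {}
    errors: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(task, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:  # one failed device must not stop the rest
                errors[host] = str(exc)
    return results, errors
```

Keeping the per-device work behind a single callable is one way to leave the documented async migration path open, since only this runner would need to change.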

5. Integration Design

Automation rarely exists in isolation. We design integrations with:

External Systems

Network Management:

  • DNS: Validate hostnames, update records if automation creates interfaces
  • IPAM: Reserve IPs, prevent conflicts
  • Monitoring: Trigger config refresh after changes
  • Netbox/CMDB: Source of truth for inventory

Business Systems:

  • Ticketing: Auto-update status, attach reports
  • Workflow Systems: Approval gates for high-risk changes
  • Notification: Email, Slack, Teams alerts

Security Systems:

  • Credential Vaults: HashiCorp Vault, CyberArk
  • Logging: Syslog, SIEM integration
  • Audit Systems: Compliance reporting

6. Error Handling Strategy

We design comprehensive error handling:

Failure Modes

| Failure Type | Detection | Response |
|---|---|---|
| Device unreachable | Pre-flight ICMP check | Skip device, log, continue |
| Authentication failure | SSH connection attempt | Alert, halt (credential issue) |
| Config syntax error | Commit check | Rollback, alert |
| Post-validation fail | Show command parsing | Rollback, detailed logging |
| Partial multi-device failure | Per-device validation | Complete successful, report failed |
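
The failure-mode table can be expressed in code as a lookup from failure type to planned response. In the sketch below, the exception classes, the `Response` enum, and `decide` are all hypothetical names; defaulting unknown failures to a halt is an assumption made here, not something the table specifies.

```python
from enum import Enum


class Response(Enum):
    SKIP_AND_CONTINUE = "skip device, log, continue"
    HALT = "alert and halt"
    ROLLBACK = "rollback and alert"


# Hypothetical exception types, one per single-device failure mode in the table.
class DeviceUnreachable(Exception): ...
class AuthenticationFailed(Exception): ...
class ConfigSyntaxError(Exception): ...
class PostValidationFailed(Exception): ...


POLICY: dict[type[Exception], Response] = {
    DeviceUnreachable: Response.SKIP_AND_CONTINUE,
    AuthenticationFailed: Response.HALT,
    ConfigSyntaxError: Response.ROLLBACK,
    PostValidationFailed: Response.ROLLBACK,
}


def decide(exc: Exception) -> Response:
    """Look up the planned response; unknown failures halt (an assumption)."""
    return POLICY.get(type(exc), Response.HALT)
```

Keeping the policy in one table makes the design decision reviewable by stakeholders without reading the execution code.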

Logging Strategy

  • INFO: Normal operations, successful executions
  • WARNING: Recoverable issues, devices skipped
  • ERROR: Failures requiring attention
  • CRITICAL: System-wide failures, safety mechanism triggers
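
A small sketch of how these levels might be wired up with the standard `logging` module and a rotating file handler. The logger name, file size limit, backup count, and format string are illustrative assumptions, not agreed standards.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(log_path: str, name: str = "vlan_automation") -> logging.Logger:
    """Return a logger that writes INFO and above to a size-rotated file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = RotatingFileHandler(log_path, maxBytes=1_000_000, backupCount=5)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Example usage (messages are illustrative):
#   log = build_logger("automation.log")
#   log.info("VLAN 120 applied to sw1")
#   log.warning("sw2 unreachable, skipped")
#   log.error("sw3 post-flight validation failed")
#   log.critical("rollback triggered on all devices")
```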

📊 Deliverable: Technical Design Documents

At the end of the Re-engineer stage, you receive:

1. Process Flow Diagrams

Visual representation of optimised workflows with:

  • Current state vs. future state comparison
  • Decision points and conditional logic
  • Error handling paths
  • User interaction points

2. Technical Architecture Documents

For each automation:

  • Component architecture diagram
  • Data flow mapping
  • Technology stack justification
  • Integration touchpoints
  • Scalability roadmap

3. Safety & Validation Plans

  • Pre-flight check specifications
  • Post-flight validation criteria
  • Rollback procedures
  • Testing strategy (lab/staging approach)

4. Implementation Blueprints

Detailed specifications for the Implement stage:

  • Required Python libraries
  • Configuration file structures
  • Logging format standards
  • Error message conventions
  • CLI argument specifications
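
As an example of what a CLI argument specification could pin down, here is a short `argparse` sketch for the VLAN provisioning case. Every flag name and default (`--vlan-id`, `--name`, `--inventory`, `--dry-run`) is a hypothetical choice for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Define the command-line interface for a hypothetical VLAN provisioning tool."""
    parser = argparse.ArgumentParser(description="Provision a VLAN on a set of switches")
    parser.add_argument("--vlan-id", type=int, required=True, help="VLAN ID to create")
    parser.add_argument("--name", required=True, help="VLAN name")
    parser.add_argument("--inventory", default="inventory.csv", help="Path to inventory CSV")
    parser.add_argument("--dry-run", action="store_true", help="Render and validate only")
    return parser
```

Agreeing on flags like these during design means the Implement stage starts from a fixed interface instead of inventing one while coding.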

💡 Why Re-engineer Matters

Returns Compound Over Time

A well-designed workflow becomes the template for future automations:

  • First automation: 4 weeks to design + implement
  • Second automation: 2 weeks (reuse patterns)
  • Fifth automation: 1 week (mostly template customization)

Prevents Expensive Rewrites

Skipping Re-engineer leads to:

  • ❌ Hard-coded values throughout code
  • ❌ No validation (issues discovered in production)
  • ❌ Can't handle edge cases (brittle)
  • ❌ Doesn't scale (rewrite needed at 50 → 200 devices)

With Re-engineer:

  • ✅ Template-driven (easy to modify)
  • ✅ Comprehensive validation (catches issues early)
  • ✅ Handles edge cases gracefully
  • ✅ Scales to 10x without major changes

🚀 What Happens Next

After Re-engineer, proceed to Stage 3: Implement where designs become production-ready code.

The implementation team (whether internal or Nautomation Prime) now has:

  • Clear requirements
  • Proven design patterns
  • Safety guardrails defined
  • Success criteria established

This dramatically accelerates development and ensures quality.


📋 Re-engineer Checklist

Before moving to Implement stage:

  • Current workflow documented with pain points identified
  • Future state workflow designed with safety mechanisms
  • Architecture reviewed and technology choices justified
  • Integration requirements identified and documented
  • Error handling strategy defined
  • Validation criteria established (what "success" looks like)
  • Scalability plan documented (today + 12/24 months)
  • Design review completed with stakeholders
  • Lab/test environment requirements confirmed

💼 Engagement Options

Re-engineer as Part of Full PRIME Engagement

Included as Stage 2 when you engage for the complete framework. Typically 1-2 weeks duration per automation project.

Standalone Re-engineer Service

Sometimes clients have identified their automations but need design help:

Fixed Fee: £3,000 - £6,000 per automation (depending on complexity)

Includes:

  • Current state workflow analysis
  • Future state process design
  • Technical architecture documents
  • Safety & validation planning
  • Implementation blueprints

Perfect for: Internal teams that have Python skills but need architecture guidance




Mission: To empower network engineers through the PRIME Framework, delivering automation with measurable ROI, production-grade quality, and sustainable team capability built on the PRIME Philosophy of transparency, measurability, ownership, safety, and empowerment.

