Containment is not one switch. Six modes with TTA targets.

This file is one self-contained piece of the AI IR Overlay™ framework. Cross-references to other pieces point to other packages in the same set, which you can obtain at jacobideji.com.

Kill-Switch Modes: The Containment Ladder

The “kill switch” is not one switch. It is six modes that escalate from passive observation to full disable, and back.

Binary on/off is rarely appropriate in production. The Overlay defines six modes so containment can be calibrated to confidence, impact, and business need.

The Modes at a Glance

Mode	Name	Use when	TTA target	Approver
M0	Observe	Normal operations	n/a	Owner
M1	Read-Only	Suspicious behavior; low/moderate impact	≤ 10 min	Tier-1 SOC
M2	Approvals Required	Agent must keep operating; actions need two-person rule	≤ 10 min	Tier-1 SOC
M3	Tool Tiering	Targeted containment. Disable high-risk tools only	≤ 10 min	Tier-1 SOC
M4	Full Disable	Active harm, confirmed misuse, or evidence of compromise	≤ 10 min	Tier-1 SOC
M5	Controlled Re-Enable	Containment validated; staged recovery	n/a	CISO / IC

TTA = Time-To-Activate, measured from the incident commander's order to the moment the mode is in effect.
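The TTA definition can be checked mechanically during a drill. The following is a minimal sketch, assuming the two timestamps (order given, mode in effect) are recorded by your own tooling; the `Mode` enum and the `tta_met` helper are hypothetical names, not part of the Overlay itself. The targets are copied from the table above.

```python
from datetime import datetime, timedelta
from enum import Enum


class Mode(Enum):
    M0_OBSERVE = 0
    M1_READ_ONLY = 1
    M2_APPROVALS_REQUIRED = 2
    M3_TOOL_TIERING = 3
    M4_FULL_DISABLE = 4
    M5_CONTROLLED_REENABLE = 5


# Targets copied from the table above. M0 and M5 are "n/a", so they are absent.
TTA_TARGETS = {
    Mode.M1_READ_ONLY: timedelta(minutes=10),
    Mode.M2_APPROVALS_REQUIRED: timedelta(minutes=10),
    Mode.M3_TOOL_TIERING: timedelta(minutes=10),
    Mode.M4_FULL_DISABLE: timedelta(minutes=10),
}


def tta_met(mode, ordered_at, in_effect_at):
    """Return True/False against the target, or None if the mode has no target."""
    target = TTA_TARGETS.get(mode)
    if target is None:
        return None
    return (in_effect_at - ordered_at) <= target
```

Recording the result per drill gives the "measured, not promised" evidence that the pre-production checklist asks for.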

Mode 0: Observe (Baseline)

Purpose: Normal operations with logging.

Requirements:

Tool calls logged with parameters and outcomes
Prompt/response logged for the configured retention window
Identity correlation in SaaS audit logs

Exit criteria: Incident declared. Step up to M1 (or further) based on confidence.
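One way to satisfy the first M0 requirement is to wrap every tool invocation so that parameters, outcome, and calling identity land in one record. This is only a sketch: `logged_call`, the record fields, and the in-memory `sink` list are illustrative assumptions, and a real deployment would write to its own log pipeline.

```python
import json
import time


def logged_call(tool_name, tool_fn, params, identity, sink):
    """Run a tool and append one JSON record (parameters plus outcome) to sink."""
    record = {
        "ts": time.time(),
        "tool": tool_name,
        "identity": identity,  # lets the record be correlated with SaaS audit logs
        "params": params,
    }
    try:
        result = tool_fn(**params)
        record["outcome"] = {"status": "ok", "result": repr(result)}
        return result
    except Exception as exc:
        record["outcome"] = {"status": "error", "error": repr(exc)}
        raise
    finally:
        # Runs on success and on failure, so failed calls are logged too.
        sink.append(json.dumps(record, default=str))
```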

Mode 1: Read-Only (Preferred First Containment)

Purpose: Stop writes without stopping the business.

What changes: All write tools are stripped from the agent’s tool set. Read and query tools remain.

Use when:

Suspicious behavior with unclear scope
Business impact appears low to moderate
You need time to investigate without triggering customer-visible failures

Operational checks:

All write tools confirmed disabled in production config
Test query confirms reads still function
Logging continues at M0 fidelity

Exit criteria: Investigation confirms the behavior is benign (return to M0), confirms harm (step up to M3 or M4), or shows that the agent must keep operating under control (step to M2).
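Stripping write tools can be as simple as filtering the agent's tool set on a per-tool flag. The sketch below assumes each tool carries a `writes` flag, which is a hypothetical attribute; how your agent framework marks write capability will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    writes: bool  # hypothetical flag: True if the tool can change state


def to_read_only(tools):
    """Return the tool set with every write-capable tool removed."""
    return [t for t in tools if not t.writes]


def verify_read_only(tools):
    """Operational check: no write tool remains and at least one read tool does."""
    return len(tools) > 0 and all(not t.writes for t in tools)
```

The second function mirrors the first two operational checks: writes are confirmed disabled, and reads are confirmed still present.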

Mode 2: Approvals Required (Two-Person Rule)

Purpose: Continue operation, but no action without a human approver.

What changes: Every tool call is queued for human approval before execution.

Use when:

Agent must continue for business continuity
Risk of automated action is unacceptable
Cost of full disable is higher than approval latency

Operational checks:

Approval queue is staffed
Average approval latency is acceptable for business need
Approver has authority to deny

Exit criteria: Risk reduced (return to M1 or M0), or containment must escalate (M3/M4).
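The queue-then-approve behavior of M2 can be sketched as below. `ApprovalQueue` and `PendingCall` are hypothetical names, and the rule that the approver must differ from the requester is one assumed reading of "two-person rule"; nothing runs until a human decides, and a denial executes nothing.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class PendingCall:
    tool_name: str
    params: Dict[str, Any]
    requested_by: str
    decision: Optional[str] = None    # None until a human decides
    decided_by: Optional[str] = None


class ApprovalQueue:
    def __init__(self, tools: Dict[str, Callable[..., Any]]):
        self._tools = tools
        self.pending: List[PendingCall] = []

    def submit(self, tool_name, params, requested_by):
        """Queue a tool call. Nothing executes at this point."""
        call = PendingCall(tool_name, params, requested_by)
        self.pending.append(call)
        return call

    def decide(self, call, approver, approve):
        """Record the human decision; execute the tool only on approval."""
        if approver == call.requested_by:
            raise PermissionError("approver must differ from requester")
        if call.decision is not None:
            raise ValueError("call already decided")
        call.decided_by = approver
        call.decision = "approved" if approve else "denied"
        self.pending.remove(call)
        if approve:
            return self._tools[call.tool_name](**call.params)
        return None
```

The length of `pending` over time is a direct measure of whether the queue is staffed and whether approval latency is acceptable.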

Mode 3: Tool Tiering

Purpose: Contain selectively. Disable high-risk tools, keep low-risk.

What changes: Specific tools (external email send, code deploy, financial actions) are disabled. Lower-risk tools (internal search, status lookup) remain.

Use when:

The harm vector is known and isolated to specific tools
Business needs require continued operation of unaffected tools
Full disable would cause unacceptable disruption

Operational checks:

Tier definitions are pre-documented (see the Agent Privilege Matrix)
Disabled tools confirmed unreachable in production
Remaining tools confirmed operational

Exit criteria: Vector contained (step down to M1/M2), or scope expands (M4).
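Tool tiering reduces to a lookup against pre-documented tiers. In the sketch below the tier assignments are illustrative only (the tool names echo the examples above, but the real tiers belong in the Agent Privilege Matrix), and disabling any tool that has no documented tier is a fail-closed design choice of this example, not something the Overlay prescribes.

```python
# Hypothetical tier assignments for illustration only.
TOOL_TIERS = {
    "external_email_send": "high",
    "code_deploy": "high",
    "financial_action": "high",
    "internal_search": "low",
    "status_lookup": "low",
}


def apply_tool_tiering(enabled_tools, tiers, disabled_tiers=("high",)):
    """Split enabled tools into (kept, disabled).

    Tools with no documented tier are disabled (fail closed).
    """
    kept, disabled = set(), set()
    for tool in enabled_tools:
        tier = tiers.get(tool)
        if tier is None or tier in disabled_tiers:
            disabled.add(tool)
        else:
            kept.add(tool)
    return kept, disabled
```

Both returned sets feed the operational checks: the disabled set is what must be confirmed unreachable, and the kept set is what must be confirmed operational.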

RAG-specific containment: when the suspected attack path is the retrieval layer (RAG, knowledge base, vector index), use the M3-RAG variant. Cut off retrieval from the suspect corpus and leave the other corpora untouched. The agent keeps working with reduced knowledge but with no exposure to poisoned content. See Playbook 03: RAG / Knowledge-Base Forensics for the freeze-the-world sequence and the seven-component pipeline forensics.
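A minimal sketch of the M3-RAG idea, assuming retrieval hits are dictionaries tagged with a `corpus` key and that corpora are addressed by name; both are assumptions of this example, not a description of any particular vector store.

```python
def contain_retrieval(corpora, suspect):
    """Return the corpora still allowed for retrieval once suspect ones are cut off."""
    return [c for c in corpora if c not in suspect]


def drop_suspect_hits(hits, suspect):
    """Second filter for results that were already retrieved or cached."""
    return [h for h in hits if h["corpus"] not in suspect]
```

Applying both filters covers new queries and any hits fetched before the containment decision took effect.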

Mode 4: Full Disable

Purpose: Hard stop.

What changes: Agent is taken offline. Active sessions are terminated. Tokens are scoped for revocation (not yet rotated; see Evidence Plan).

Use when:

Active harm is occurring
Compromise is confirmed
Containment under M1 to M3 has failed

Operational checks (critical sequence):

Snapshot identity and capabilities BEFORE token rotation (see Step 2 of the Minimum Evidence Set)
Capture the Minimum AI Evidence Set BEFORE redeployment
Only then: rotate credentials, clean corpora, redeploy

Rotating tokens before capturing scopes is the single most common evidence-destruction failure in AI IR.
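The critical sequence can be enforced in tooling so that destructive steps are refused until evidence is captured. The sketch below is one possible guard; the step names and the `M4Sequence` class are hypothetical, and the actions passed in stand for whatever your runbook actually executes.

```python
EVIDENCE_STEPS = (
    "snapshot_identity_and_capabilities",
    "capture_minimum_evidence_set",
)
DESTRUCTIVE_STEPS = ("rotate_credentials", "clean_corpora", "redeploy")


class M4Sequence:
    """Refuses destructive steps until both evidence steps have completed."""

    def __init__(self):
        self.completed = []

    def run(self, step, action):
        if step in DESTRUCTIVE_STEPS:
            missing = [s for s in EVIDENCE_STEPS if s not in self.completed]
            if missing:
                raise RuntimeError(f"{step} blocked; evidence not captured: {missing}")
        elif step not in EVIDENCE_STEPS:
            raise ValueError(f"unknown step: {step}")
        result = action()
        # A step is recorded only after its action returns without raising.
        self.completed.append(step)
        return result
```

Usage: calling `run("rotate_credentials", ...)` on a fresh `M4Sequence` raises, which is the point. The guard turns the most common evidence-destruction failure into an error instead of a silent loss.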

Exit criteria: Eradication complete. Move to M5 controlled re-enable.

Mode 5: Controlled Re-Enable (Recovery)

Purpose: Restore operation in stages, with validation at each step.

Sequence:

Re-enable in Read-Only (M1). Confirm the agent functions and logs flow.
Validate retrieval and tool policies. Corpora versions confirmed clean.
Replay the incident scenario in a safe harness. Confirm fix holds.
Re-enable tools incrementally. Start with low-risk, monitor for drift.
Return to M0 Observe. Only after all of the above.

Approver: CISO or designated Incident Commander. Never the original agent owner alone.
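The staged sequence lends itself to a gate that halts at the first failed validation. This is a sketch under the assumption that each stage can be expressed as a callable returning True or False; the stage names in the usage note are shorthand for the five steps above, and `controlled_reenable` is a hypothetical helper.

```python
def controlled_reenable(stages):
    """Run (name, check) pairs in order and stop at the first failing check.

    Returns (passed_stage_names, failed_stage_name_or_None).
    """
    passed = []
    for name, check in stages:
        if not check():
            return passed, name
        passed.append(name)
    return passed, None
```

A caller might pass stages named `reenable_read_only`, `validate_retrieval_and_policies`, `replay_incident`, `reenable_tools_incrementally`, and `return_to_m0`. If the second element of the result is not None, recovery stops there and later stages never run.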

Operational checks:

Post-incident hardening complete (see Playbook 18: Post-Incident Hardening)
Monitoring thresholds updated to detect recurrence
Tabletop scheduled within 30 days to validate the fix
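Read together, the exit criteria of the six modes describe a small state machine. The table below is one reading of them, not an official transition matrix: "M1 (or further)" from M0 is interpreted as M1 through M4, and M5 is modeled as ending in M0 because its sequence finishes with a return to Observe. Re-escalation during recovery is not stated in the text and is therefore not modeled.

```python
# One reading of the exit criteria in each mode section (an assumption, see lead-in).
ALLOWED_TRANSITIONS = {
    "M0": {"M1", "M2", "M3", "M4"},
    "M1": {"M0", "M2", "M3", "M4"},
    "M2": {"M0", "M1", "M3", "M4"},
    "M3": {"M1", "M2", "M4"},
    "M4": {"M5"},
    "M5": {"M0"},
}


def can_transition(current, target):
    """Return True if the exit criteria, as read above, allow current -> target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

Under this reading, M4 can only be left through M5, which keeps Full Disable from being lifted without the staged recovery.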

Pre-Production Requirements

Before ANY agent reaches production:

All six modes have been implemented in code/config
All six modes have been tested in a tabletop drill
TTA targets are met in measurement, not merely promised
Runbook documents who pulls which lever, with backup approvers

If any of the above is “not yet,” the agent is not production-ready.
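The four requirements above can be expressed as a release gate that lists what is still missing. This is a sketch only: `ReadinessEvidence` and `production_blockers` are hypothetical names, and how each field gets populated (config inspection, drill records, runbook review) is left to your own process.

```python
from dataclasses import dataclass, field

ALL_MODES = frozenset({"M0", "M1", "M2", "M3", "M4", "M5"})


@dataclass
class ReadinessEvidence:
    modes_implemented: set = field(default_factory=set)
    modes_drilled: set = field(default_factory=set)
    tta_targets_measured: bool = False
    runbook_with_backup_approvers: bool = False


def production_blockers(ev):
    """Return a list of unmet requirements; an empty list means production-ready."""
    blockers = []
    missing_impl = sorted(ALL_MODES - ev.modes_implemented)
    if missing_impl:
        blockers.append(f"modes not implemented: {missing_impl}")
    missing_drill = sorted(ALL_MODES - ev.modes_drilled)
    if missing_drill:
        blockers.append(f"modes not drilled: {missing_drill}")
    if not ev.tta_targets_measured:
        blockers.append("TTA targets not met in measurement")
    if not ev.runbook_with_backup_approvers:
        blockers.append("runbook lacks named levers and backup approvers")
    return blockers
```

Any non-empty result corresponds to a "not yet" in the checklist.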

Related

The Six Triage Questions: triage/six-questions.md
Minimum Evidence Set: evidence/minimum-evidence-set.md
Tool Design Is Containment (Playbook 04): playbooks/04-tool-design-is-containment.md
Testing for Agent Failure Modes (Playbook 14): playbooks/14-testing-for-agent-failure-modes.md
RAG / Knowledge-Base Forensics (Playbook 03): playbooks/03-rag-knowledge-base-forensics.md

Source: AI IR Overlay newsletter and framework synthesis, by Jacob Ideji. https://www.linkedin.com/in/jacobideji/
