The pre-incident playbook. Design your tool layer before you need it. In production, the tool layer IS your containment boundary.

This file is one self-contained piece of the AI IR Overlay™ framework. Cross-references to other pieces point to other packages in the same set, which you can obtain at jacobideji.com.

Playbook 04: Tool Design Is Containment

Safe prompts help the agent behave. Safe tools prevent irreversible impact. In an agent-powered system, the tool layer isn’t plumbing. It’s the containment boundary.

Premise

Most AI agent incidents aren’t the result of compromise in the traditional sense. They’re the result of an agent executing the wrong sequence of perfectly valid actions, at machine speed, simply because those actions were available. Emailing the wrong customer list. Updating the wrong vendor record. Closing the wrong tickets. Pushing changes to the wrong repo. Opening the wrong firewall rule.

Each of those actions can be performed by an authorized service identity, logged as a legitimate API call, and pass every endpoint-security check you have. Prompt guardrails don’t stop them. Model retraining doesn’t stop them. Only one thing stops them: the absence of the tool that performs the action, or the gating around it.

This is the foundational pre-incident playbook in the AI IR Overlay series. Where Playbook 01 covers what to do when an incident is happening, Playbook 04 covers the work that determines whether the response in Playbook 01 will be surgical or catastrophic. The single most leveraged decision in AI IR is not made under pressure during an incident. It’s made on a quiet Tuesday when you decide which tools your agent can call, with what scope, with what controls.

Mental Model clause engaged: if it can act, govern it as a privileged identity. The tool list IS the agent’s privilege grant. Review it in your PAM cadence.

Use this playbook when: you’re designing a new agent for production, reviewing an existing agent before its next deployment, building an Agent Privilege Matrix for the first time, or operationalizing M3 Tool Tiering in your Kill-Switch Modes runbook.

First-Hour Actions

If you start this work today, the highest-leverage first hour isn’t a full tool audit. It’s a single-question scoping exercise on your highest-risk production agent.

The 60-minute drill:

Minute	Action
0–10	Pick one production agent. Pull its current tool list by enumeration, not memory.
10–25	For each tool, ask: if this tool ran with the wrong parameters for 10 minutes, what is the worst defensible outcome? Record the answer in plain English.
25–35	Sort the tools into Tier 0 / Tier 1 / Tier 2 (definitions below). If you can’t decide between T1 and T2, default to T2.
35–45	Identify the single most dangerous tool the agent can call. (For most teams, it’s send external email, write to ERP, deploy code, or change cloud config.)
45–60	Pick one control to add to that tool this week: an allowlist, a cap, an approval gate, or a draft/preview layer. One tool. One upgrade. Measurable risk reduction.

That’s the first-hour version of this playbook. Do it for one agent, ship the upgrade, and move to the next. Tool design is a practice, not a project.

Containment Options

Tool design directly enables Kill-Switch Mode M3 (Tool Tiering). Without pre-tiered tools, M3 isn’t a real mode. It’s a theoretical option that can’t be executed in 10 minutes under pressure.

The Tool-Tiering Model

The single most important deliverable of this playbook is a tiered view of every tool your agent can call. The model:

Tier	Risk class	Examples	Default control
T0	Read-only, low risk	Search KB · summarize ticket · read policy · retrieve status · query records	Allowed by default
T1	Bounded writes, moderate risk	Draft (do not send) email · update internal ticket fields · create tasks · soft-delete	Allowed + caps + allowlists
T2	Systems of record, high risk	Send external email · update CRM/ERP · push code · change cloud config · perform financial / security / identity actions	Approvals required (or disabled until explicitly needed)

Capture this in the Agent Privilege Matrix template. The risk_tier column is the operational handle for M3 containment. When an incident requires Tool Tiering, you filter the CSV by tier and disable T2 in seconds.

The Five Controls (apply per-tool)

Every tool in your matrix should be evaluated against five control dimensions:

Control what can be done. Split read vs. write. No do-everything endpoints. Tier the result.
Control where it can be done. Allowlists for domains, tenants, repos, record types. Restrict write scope to narrow objects and fields.
Control how much can be done. Caps per agent run (records, emails, updates). Rate limits and burst controls.
Control irreversibility. Prefer draft over send. Diff previews before applying. Undo paths (revert logs, soft deletes, staging-first).
Control accountability. Log tool calls with parameters AND results. Capture approver identity and rationale for Tier 2 actions. Maintain an incident decision log.

A tool that passes all five is agent-safe. A tool that fails any of them is a future incident waiting for the right (wrong) sequence of valid actions.

Evidence Priorities

Good tool design isn’t just preventive. It’s evidentiary. The five controls above directly shape what the Minimum Evidence Set can prove during an incident.

Tool-design control	Evidence type strengthened	Why
Log tool calls with parameters + results	Type B (Tool-Call Ledger)	The ledger is only as good as the logging. Tool-call observability is a tool-design choice.
Capture approver identity + rationale (T2)	Type B + Type F	Approver records correlate with SaaS audit logs. Both become defensible records.
Diff previews before apply	Type B	The diff is the proof of what was about to happen. Critical for proving harm avoided.
Allowlists	Type B + Type F	Denied attempts show up in the ledger and downstream logs as evidence of intent
Caps per run	Type B	Cap-triggered halts are evidence the control held

A tool that emits structured logs with parameters, results, approver identity, and outcome is a tool that survives forensic review. A tool that emits only success / failure is a tool that destroys evidence by omission.

Operational requirement: every Tier 2 tool must produce structured logs sufficient to reconstruct who approved, when, with what parameters, with what result. All within the 60-minute evidence export window.

Recovery Sequence

Tool design also shapes recovery speed. The MVO-4 Controlled Re-Enable sequence depends on incremental tool re-enablement. Without tiering, you can only re-enable everything at once. That’s the most common recovery failure.

After containment, recovery proceeds in tier order:

Re-enable T0 (read-only) first. The agent functions. Business workflows that depend on retrieval and lookup resume.
Verify logging and policy enforcement. Confirm that the tools you re-enabled are emitting the structured logs your evidence set requires.
Re-enable T1 (bounded writes) with caps tightened. Lower the per-run cap to half its pre-incident value for the first 24–72 hours. Monitor.
Re-enable T2 (systems of record) one tool at a time, with approvals. Don’t batch-enable T2. Each one is a separate decision with a separate approver and a separate monitoring window.
Return to baseline caps. Only after monitoring thresholds confirm normal behavior over a documented observation window.

If your tools aren’t pre-tiered, this sequence collapses into a single binary decision, and the cost of getting it wrong is re-triggering the original incident.

Post-Incident Hardening

After an incident, tool design is where the lessons get written into the codebase, not into the runbook. A runbook entry that says “be more careful with the email tool” will be ignored. A code change that splits the email tool into draft and send can’t be ignored. The agent literally can’t send anymore without going through the new gate.

The hardening checklist:

Action	Outcome
Add the implicated tool to the Privilege Matrix if missing	Closes the inventory gap
Promote the tool’s tier if its blast radius was underestimated	Tightens future containment
Split the tool if it was a “god tool” (multi-verb or multi-object)	Narrows future blast radius
Add an allowlist if the wrong target was reached	Prevents the same scope error
Add a cap if a runaway happened	Bounds future runaway
Add a diff preview if the action was irreversible	Creates an undo path
Add structured logging if the evidence was insufficient	Strengthens future Type B captures
Update the AI-BOM `tools` section with the new control	Future responders see the new state
Schedule a tabletop within 30 days using the same scenario	Verifies the hardening holds

This is what “transforming lessons learned into guardrails” looks like in tool-layer terms.

Common Pitfalls

These are the highest-frequency failure modes in tool design. Each one quietly converts an agent from helpful to blast-radius-on-tap.

Pitfall	Why it happens	Consequence
God Tools (“update CRM”, “manage email”, “admin cloud”)	Convenience during prototyping; copy-pasted from vendor docs	Unlimited action surface. One wrong instruction becomes many wrong actions at machine speed.
No read/write split	The vendor SDK exposes both in one client; engineers don’t separate	Can’t enable T0 without enabling T2; M3 (Tool Tiering) becomes binary
Tier 2 defaulted to no-approval	Approval workflow not built; deferred to “later”	Approval gate exists only on paper; an incident reveals it was never enforced
Allowlist as policy comment, not code	Allowlist documented in the system prompt instead of enforced in the tool wrapper	Prompt injection bypasses it; allowlist provides false assurance
No diff preview on irreversible writes	“We’ll add it next sprint”	Recovery from a wrong write becomes manual replay across logs
Cap counts requests, not blast radius	Cap = “200 calls per run”; one call updates 10,000 records	Cap held; harm still occurred
Logging captures success only	“Success” branch was instrumented; failures and denials are silent	Type B evidence missing the most important rows (denied attempts = attacker intent)
Tools not in the AI-BOM inventory	Engineering shipped the tool without updating the manifest	Incident commander doesn’t know the tool exists; can’t scope blast radius
Tier 2 tools without an `approver` identity contract	Approval is “checked” but not recorded with who, when, why	F (downstream audit) can’t correlate approval with action
Re-using one tool definition across multiple agents with different risk profiles	DRY engineering reflex	A T1 tool for Agent A is a T2 tool for Agent B; matrix conflicts go unresolved

Distributed as separate packages or files within the framework:

Agent Privilege Matrix template: templates/agent-privilege-matrix.csv (the artifact this playbook operationalizes)
Privilege Matrix README: templates/README-privilege-matrix.md (column-by-column explanation of the matrix)
AI-BOM template: templates/ai-bom.yaml (the tools section is the source of truth for the matrix)
The Minimum Viable Overlay: framework/01-minimum-viable-overlay.md (MVO-1 Inventory + MVO-2 Safe Modes both depend on tool-design discipline)
The Mental Model: framework/02-mental-model.md (clauses 1 and 4: if it can act, govern it as a privileged identity and if it can change, manage it as software)
Kill-Switch Modes: kill-switches/overview.md (M3 Tool Tiering is the mode this playbook prepares you to execute)
Minimum Evidence Set: evidence/minimum-evidence-set.md (Type B Tool-Call Ledger is shaped by tool-design choices)
Playbook 01: The Agent Is a Privileged Identity (playbooks/01-agent-as-privileged-identity.md) (the response playbook this one prepares you for)
NIST AI RMF crosswalk: crosswalks/nist-ai-rmf.md (this playbook supports MAP 4.1, MANAGE 1.3, MANAGE 2.4)
NIST CSF 2.0 crosswalk: crosswalks/nist-csf-2.md (this playbook supports ID.AM-05, PR.AA-05, RS.MI-01)
OWASP Agentic Top 10 crosswalk: crosswalks/owasp-agentic-top-10.md (this playbook responds primarily to ASI02 Tool Misuse & Exploitation, ASI03 Identity & Privilege Abuse, ASI05 Unexpected Code Execution)

The Question to Carry Forward

If you do nothing else after reading this playbook, answer this one question for your highest-risk production agent:

If your agent made the wrong decision for 10 minutes, which tool would do the most damage, and what control stands between the agent and that tool right now?

If you can’t name the tool, you have an inventory gap (MVO-1). If you can name the tool but not the control, you have a tool-design gap (this playbook). If you can name both but the control is unenforced, you have an enforcement gap (the next code change).

Pick the gap with the smallest cost-to-close. Close it this week. Move to the next agent.

That’s how the tool layer becomes the containment boundary it should already be.

Source: AI IR Overlay newsletter, Issue #4, “Tool Design Is Containment,” by Jacob Ideji. https://www.linkedin.com/in/jacobideji/