> agent.log: loading kill_chain_framework v1.0

The Agentic AI
Kill Chain

A 6-stage attack lifecycle framework for autonomous AI agent systems. Building on MITRE ATLAS (16 tactics, 84 techniques) and OWASP LLM Top 10 v2.0 — extending them for agents that chain decisions, use tools, delegate, and persist.

cite as:

Dhanasekaran, M. (2026). The Agentic AI Kill Chain. magesh.ai/kill-chain

> agent.log: framework_rationale

Why a New Framework

Existing frameworks don't account for the emergent security properties that arise when autonomy, long-term memory access, and dynamic tool usage are combined.

— "Securing Agentic AI: A Comprehensive Threat Model", arxiv 2504.19956

Adapted from Lockheed Martin

Same structural concept as the Cyber Kill Chain (sequential attack lifecycle), adapted for autonomous AI agents instead of network intrusion. 7 stages become 6 — weaponization is implicit in agent attacks.

Extends ATLAS + OWASP

MITRE ATLAS (16 tactics, 84 techniques) covers AI model attacks. OWASP LLM Top 10 covers application risks. This framework extends both into the agent-specific lifecycle — tool chains, delegation, memory persistence.

Practitioner-focused

Every stage maps to a defensive control. Not an academic taxonomy — operational guidance built from hands-on experience building and securing agentic systems with Claude Code, Kiro, and MCP.

Cloud-agnostic

No vendor-specific recommendations. The framework applies to any agent architecture — whether you're building with Claude, GPT, Gemini, or open-source models. Patterns over products.

What makes agentic AI attacks fundamentally different

Traditional

Attack a system from the outside

→

Agentic

Hijack a system that attacks for you

Traditional

Map network topology

→

Agentic

Map capability topology — tools, permissions, delegation chains

Traditional

Exploit code vulnerabilities

→

Agentic

Exploit trust and reasoning — the agent does the work

Traditional

Install malware for persistence

→

Agentic

Poison memory and config files — the agent persists for you

$ agent.map --topology --attack-surface

Every connection is an attack surface. Click a stage below to see where it strikes.

$ kill_chain.load --stages 6 --mode interactive

→ → → → →

agent.log

> select a stage to begin attack simulation...

01 RECON — Probe Agent Capabilities

What the attacker does

Maps the agent's tool access, permission boundaries, connected MCP servers, model type, system prompt constraints, and behavioral limits. Agent recon maps capability topology — not network topology.

How agents change this: Traditional recon maps network topology — ports, services, versions. Agent recon maps what the agent can do: which tools it has, what permissions are auto-approved, what its system prompt constrains, and how it connects to other agents and MCP servers.

Techniques

Enumerate available tools by asking the agent what it can do
Test permission boundaries by requesting escalating actions
Probe system prompt by asking about instructions or constraints
Map MCP server connections by observing tool call patterns
Identify model family through response characteristics

Real-world example

Documented

Researchers demonstrated system prompt extraction from ChatGPT, Claude, and Gemini through iterative probing — asking models to repeat their instructions verbatim, or using multi-turn conversations to gradually extract constraint boundaries.

Source: Perez & Ribeiro, "Ignore This Title and HackAPrompt" (2023); Simon Willison, "Prompt injection and jailbreaking are not the same thing" (2024)

Extends existing frameworks

ATLAS AML.TA0001 (Reconnaissance) covers ML model recon. This stage extends it with agent-specific vectors: tool enumeration, permission boundary probing, and MCP server discovery.

Defensive control

Minimize information disclosure about agent capabilities. Don't reveal tool lists, permission structures, or system prompt details in responses.

02 INJECT — Deliver the Payload

What the attacker does

Delivers adversarial input to alter agent behavior — through direct prompts, retrieved documents, tool responses, or data sources the agent consumes. The goal: change what the agent does, not just what it says.

How agents change this: Traditional prompt injection targets a single LLM response. Agent injection targets the planning/action loop — the agent doesn't just say something wrong, it does something wrong. Autonomously. Across multiple tool calls.

Techniques

Direct prompt injection in user input
Indirect injection in documents, web pages, or retrieved context
Tool-response poisoning — malicious data from MCP servers
Tool schema injection — malicious tool descriptions that alter behavior
Context window displacement — flooding context to push out safety instructions

Real-world example

Documented

Indirect prompt injection via web pages: a researcher embedded hidden instructions in a webpage that, when retrieved by a Bing Chat agent, caused it to exfiltrate the user's conversation history through a crafted URL. The agent followed the injected instruction because it couldn't distinguish retrieved content from user intent.

Source: Johann Rehberger, "Bing Chat Data Exfiltration via Indirect Prompt Injection" (2023); Greshake et al., "Not what you've signed up for" (2023)

Extends existing frameworks

ATLAS AML.T0051 (Prompt Injection), AML.T0099 (Tool Data Poisoning) cover prompt and data poisoning. This stage extends them with tool schema injection, MCP protocol-level vectors, and context displacement attacks.

OWASP LLM01 (Prompt Injection) covers direct and indirect injection. This stage extends it with tool-response and protocol-level injection vectors specific to agent architectures.

Defensive control

Validate and sanitize all external inputs. Treat tool responses as untrusted. Pin system instructions outside the context window where possible.

03 HIJACK — Override Agent Behavior

What the attacker does

Takes control of the agent's decision-making — redirecting goals, overriding instructions, or manipulating the reasoning chain. The agent continues operating autonomously — toward the attacker's objectives.

How agents change this: This isn't getting a bad output — the agent's ongoing autonomous behavior is redirected. It continues operating, reasoning through each step, using tools, making decisions. But now it's working toward the attacker's objectives. The victim becomes the weapon.

Techniques

Goal substitution — replace the agent's current objective
Instruction override — make the agent ignore system constraints
Reasoning chain manipulation — influence chain-of-thought
Persona hijacking — alter agent's role through accumulated context

Real-world example

Key Extension

Anthropic's own red-team research showed that Claude agents given tool access could be manipulated into executing multi-step attack sequences — reading files, modifying configs, and calling APIs — all from a single injected instruction that overrode the system prompt's constraints. The agent reasoned its way through each step autonomously.

Source: Anthropic, "Challenges in Red-Teaming AI Systems" (2024); Compound AI systems risk analysis, arxiv 2504.19956

Extends existing frameworks

This is where the Kill Chain adds the most. ATLAS and OWASP focus on model-level and application-level threats. This stage extends both into autonomous decision chain hijacking and goal substitution at the planning layer — building on OWASP LLM06 (Excessive Agency) with active hijacking of running agents.

Defensive control

Immutable system instructions. Reasoning chain monitoring. Behavioral anomaly detection against established baselines.

04 ESCALATE — Expand Access

What the attacker does

Uses the hijacked agent to gain broader access — abusing tool permissions, chaining through multi-agent delegation, or bypassing human-in-the-loop controls. Agents trust other agents — exploit the trust model.

How agents change this: Traditional privilege escalation exploits OS or network vulnerabilities. Agent escalation exploits the trust model — agents trust other agents, tools trust agent calls, autoApprove bypasses human review. A compromised orchestrator agent inherits the permissions of every agent it coordinates.

Techniques

Abuse existing tool permissions beyond intended scope
Chain multi-agent delegation to inherit higher privileges
Confused deputy — make a high-privilege agent act on attacker's behalf
Bypass autoApprove to execute without human review
Orchestrator compromise — hijack the coordinating agent

Real-world example

Multi-Agent

Researchers at UIUC demonstrated "confused deputy" attacks in multi-agent systems where a low-privilege agent crafted requests to a higher-privilege agent, inheriting its tool access. The orchestrator passed the request because inter-agent messages weren't treated as untrusted input — a trust boundary that doesn't exist in traditional systems.

Source: "Securing Agentic AI: A Comprehensive Threat Model", arxiv 2504.19956; Zenity Labs multi-agent attack research (2025)

Extends existing frameworks

ATLAS AML.TA0012 (Privilege Escalation) covers single-system escalation. This stage extends it into multi-agent delegation chains, confused deputy patterns in agent systems, and orchestrator compromise.

Defensive control

Least privilege for every tool and agent. No autoApprove for sensitive operations. Inter-agent authentication. Explicit delegation scoping.

05 EXFILTRATE — Extract Value

What the attacker does

Uses the agent's legitimate access to extract sensitive data. The agent is the exfiltration channel — it has legitimate access and legitimate output channels. Exfiltration looks like normal agent behavior.

How agents change this: The agent itself is the exfiltration channel. It has legitimate access to data and legitimate channels to send it — APIs, emails, file writes, web requests. The exfiltration looks identical to normal agent behavior. No anomaly to detect.

Techniques

Read sensitive data through agent's tool access
Encode data in legitimate outputs (tool parameters, emails, docs)
Cross-session memory leakage — data persisted across sessions
Side-channel exfiltration through behavioral patterns

Real-world example

Documented

The Bing Chat markdown rendering attack: an injected instruction caused the agent to encode conversation data into an image URL. When the browser rendered the markdown, it sent an HTTP request to the attacker's server with the user's data as URL parameters — exfiltration through a legitimate rendering feature.

Source: Johann Rehberger, "Data Exfiltration from Bing Chat via Markdown Rendering" (2023); Roman Samoilenko, "ChatGPT Plugin Data Exfiltration" (2023)

Extends existing frameworks

ATLAS AML.T0055 (Exfiltration via Tool Invocation) covers tool-based exfiltration. This stage extends it with cross-session memory leakage and behavioral side-channel patterns specific to persistent agents.

Defensive control

Monitor and log all tool invocations. Data loss prevention on agent outputs. Session-scoped memory with no cross-session persistence of sensitive data.

06 PERSIST — Maintain Access

What the attacker does

Establishes long-term presence by poisoning agent memory, injecting into configuration files, or creating callbacks. The agent itself becomes the persistence mechanism.

How agents change this: Traditional persistence installs malware or backdoors. Agent persistence poisons the information the agent trusts — memory, config files, instruction documents. No binary is modified. No process is running. The agent reloads the poisoned instructions on every startup and re-compromises itself.

Techniques

Poison agent memory for future sessions
Inject into CLAUDE.md, .kiro/ configs, project instructions
Modify agent configuration files for persistent behavior change
Establish callbacks through agent-accessible APIs
Backdoor skills/plugins the agent loads on startup

Real-world example

Documented

SpaiwareAI research demonstrated persistent memory injection in ChatGPT — an attacker embedded instructions into a document that, when processed by the agent, wrote malicious directives into the long-term memory. Every future conversation then followed the injected instructions, across sessions, without the user's knowledge.

Source: Johann Rehberger / SpaiwareAI, "Persistent Memory Injection in ChatGPT" (2024); MITRE ATLAS AML.T0056

Extends existing frameworks

ATLAS AML.T0056 (Memory Manipulation) covers memory poisoning. This stage extends it into ecosystem persistence: instruction files, skill backdoors, MCP config manipulation, and agent startup poisoning.

Defensive control

Memory integrity verification. Config file integrity monitoring. Skill/plugin signing and verification. Regular memory audit and pruning.

> agent.log: attack_simulation_engine

Walk Through an Attack

Step through a realistic attack scenario against an AI coding agent. Each step maps to a kill chain stage.

attack_sim.sh

> ready. click [START] to begin simulation.

READY

> agent.log: atlas_cross_reference

MITRE ATLAS Cross-Reference

Mapping all 16 ATLAS tactics against the Agentic AI Kill Chain. Shaded rows show where the Kill Chain extends ATLAS coverage into agent-specific vectors.

ATLAS Tactic	ID	Kill Chain Stage	Agent Extension
Reconnaissance	AML.TA0001	01 RECON	+ Tool enumeration, permission probing, MCP discovery
Resource Development	AML.TA0002	02 INJECT	+ Crafted tool schemas, poisoned MCP servers
Initial Access	AML.TA0003	02 INJECT	+ Indirect injection via retrieved context, tool responses
ML Model Access	AML.TA0004	01 RECON	+ Agent capability mapping beyond model access
Execution	AML.TA0005	03 HIJACK	+ Autonomous execution via reasoning chain hijack
Persistence	AML.TA0006	06 PERSIST	+ Memory poisoning, config injection, skill backdoors
Defense Evasion	AML.TA0007	03 HIJACK	+ Reasoning chain manipulation to bypass safety checks
Discovery	AML.TA0008	01 RECON	+ MCP server discovery, tool registry enumeration
Collection	AML.TA0009	05 EXFIL	+ Agent reads data through legitimate tool access
ML Attack Staging	AML.TA0010	02 INJECT	+ Context window displacement, schema poisoning
Credential Access	AML.TA0011	04 ESCALATE	+ Tool credential harvesting (AML.T0098)
Privilege Escalation	AML.TA0012	04 ESCALATE	+ Multi-agent delegation chains, confused deputy, orchestrator compromise
Lateral Movement	AML.TA0013	04 ESCALATE	+ Inter-agent trust exploitation, sub-agent delegation
Exfiltration	AML.TA0014	05 EXFIL	+ Cross-session memory leakage, behavioral side channels
Impact	AML.TA0015	03–06	Impact spans multiple stages in agentic context
Command and Control	AML.TA0016	06 PERSIST	+ Agent callbacks via APIs, webhook persistence

ATLAS added 14 agent-specific techniques in late 2025 through Zenity Labs collaboration (AML.T0055, T0056, T0098, T0099, T0100, T0102). This Kill Chain extends those into the full agent attack lifecycle — covering multi-agent systems, MCP protocol attacks, and ecosystem persistence vectors that no single ATLAS tactic addresses.

> agent.log: owasp_severity_matrix

OWASP LLM Top 10 Agent Severity

Every OWASP LLM vulnerability amplifies in agentic context. This matrix assesses agent-specific severity for each category — because an agent that acts on a vulnerability is fundamentally different from a chatbot that outputs one.

OWASP Category	Chatbot Risk	Agent Risk	Why It Amplifies
LLM01 — Prompt Injection	HIGH	CRITICAL	Agents act on injected instructions — tool calls, file writes, API requests
LLM02 — Sensitive Info Disclosure	MEDIUM	HIGH	Agents have broader system access — files, databases, credentials
LLM03 — Supply Chain	MEDIUM	HIGH	Each MCP server, tool, and plugin is a supply chain link
LLM04 — Data/Model Poisoning	MEDIUM	HIGH	Poisoned data affects autonomous decisions with real consequences
LLM05 — Improper Output Handling	HIGH	CRITICAL	Agent outputs become real actions — shell commands, code execution
LLM06 — Excessive Agency	MEDIUM	CRITICAL	The core agent risk — too many tools, too few guardrails, autoApprove enabled
LLM07 — System Prompt Leakage	LOW	MED-HIGH	Reveals agent capabilities, tool lists, permission structures
LLM08 — Vector/Embedding Weaknesses	MEDIUM	HIGH	Persistent memory poisoning across sessions
LLM09 — Misinformation	MEDIUM	HIGH	Hallucinations trigger real actions — wrong API calls, wrong file edits
LLM10 — Unbounded Consumption	MEDIUM	HIGH	Agent loops amplify cost attacks — recursive tool calls, infinite delegation

Assessment based on OWASP LLM Top 10 v2.0 (2025). The OWASP GenAI working group (genai.owasp.org) has an active agentic AI initiative — as of March 2026, no standalone "Top 10 for Agentic AI" has been published. This severity assessment bridges that gap.

> agent.log: framework_gap_analysis

How It Fits Together

MITRE ATLAS maps 16 tactics for attacking AI models. OWASP lists 10 ways LLM applications fail. The Agentic AI Kill Chain extends both into the full attack lifecycle against autonomous agent systems — agents that chain decisions, use tools, delegate, and persist.

◆ MITRE ATLAS

16 tactics · 84 techniques

Covers ML model attacks — adversarial examples, model poisoning, evasion. Added 14 agent-specific techniques in late 2025 (Zenity Labs collaboration). The Kill Chain extends ATLAS into multi-agent delegation chains, tool protocol attacks, and autonomous decision hijacking.

Scope: AI model security

◆ OWASP LLM Top 10

10 vulnerability categories (v2.0, 2025)

Covers LLM application vulnerabilities — prompt injection, excessive agency, information disclosure. Designed for the chatbot/RAG era. The Kill Chain extends OWASP into multi-agent privilege escalation, cross-session memory poisoning, and orchestrator compromise.

Scope: LLM application risks

◆ Agentic AI Kill Chain

6 stages · attack lifecycle

Extends both frameworks into the full attack lifecycle against autonomous agent systems — from reconnaissance through persistence. Each stage builds on existing ATLAS tactics and OWASP categories, adding agent-specific vectors and defensive controls.

Scope: autonomous agent systems

> agent.log: academic_references

References & Sources

Frameworks Reviewed

[1] MITRE ATLAS — Adversarial Threat Landscape for AI Systems. 16 tactics, 84 techniques, 56 sub-techniques, 32 mitigations, 42 case studies. atlas.mitre.org

[2] OWASP Top 10 for LLM Applications v2.0 (2025). owasp.org/www-project-top-10-for-large-language-model-applications/

[3] Lockheed Martin Cyber Kill Chain — 7-stage intrusion lifecycle model. Structural model adapted for autonomous AI agents.

Academic Papers

[4] "Securing Agentic AI: A Comprehensive Threat Model." arxiv 2504.19956. Identifies emergent security properties when autonomy, memory, and tool use combine.

[5] Greshake, K. et al. "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." (2023). First systematic analysis of indirect injection vectors.

[6] Perez, F. & Ribeiro, I. "Ignore This Title and HackAPrompt: Exposing Systemic Weaknesses of LLMs through a Global Scale Prompt Hacking Competition." (2023).

Industry Research

[7] Zenity Labs + MITRE ATLAS collaboration — contributed 14 agent-specific techniques to ATLAS in late 2025. zenity.io/blog/current-events/zenity-labs-and-mitre-atlas-collaborate-to-advances-ai-agent-security

[8] Anthropic. "Challenges in Red-Teaming AI Systems." (2024). Demonstrated multi-step attack sequences through tool-using agents.

[9] NIST Presentation on ATLAS. csrc.nist.gov/csrc/media/Presentations/2025/mitre-atlas/TuePM2.1-MITRE%20ATLAS%20Overview%20Sept%202025.pdf

Documented Attacks

[10] Rehberger, J. "Data Exfiltration from Bing Chat via Indirect Prompt Injection." (2023). Demonstrated exfiltration through markdown rendering and crafted URLs.

[11] Rehberger, J. / SpaiwareAI. "Persistent Memory Injection in ChatGPT." (2024). Demonstrated cross-session memory poisoning via document processing.

[12] Samoilenko, R. "ChatGPT Plugin Data Exfiltration." (2023). Plugin-based data extraction through legitimate API calls.

[13] Willison, S. "Prompt injection and jailbreaking are not the same thing." (2024). Distinction between prompt injection (security) and jailbreaking (policy).

> agent.log: framework_metadata

Builds on

MITRE ATLAS v2026 — 16 tactics, 84 techniques (atlas.mitre.org)
OWASP LLM Top 10 v2.0 (2025)
Lockheed Martin Cyber Kill Chain
"Securing Agentic AI" — arxiv 2504.19956

Author

Magesh Dhanasekaran — Senior Security Consultant, 17 years in cybersecurity. Built from hands-on experience securing and building agentic AI systems with Claude Code, Kiro, and MCP.

LinkedIn · X

License & citation

This framework is open for reference, citation, and use in security assessments. Please cite as:

Dhanasekaran, M. (2026). The Agentic AI Kill Chain. magesh.ai/kill-chain

Views are my own. This is a practitioner framework — it prioritizes operational utility over completeness. Cloud-agnostic. No vendor-specific recommendations.

The Agentic AIKill Chain

Why a New Framework

What the attacker does

Techniques

Real-world example

Extends existing frameworks

Defensive control

What the attacker does

Techniques

Real-world example

Extends existing frameworks

Defensive control

What the attacker does

Techniques

Real-world example

Extends existing frameworks

Defensive control

What the attacker does

Techniques

Real-world example

Extends existing frameworks

Defensive control

What the attacker does

Techniques

Real-world example

Extends existing frameworks

Defensive control

What the attacker does

Techniques

Real-world example

Extends existing frameworks

Defensive control

Walk Through an Attack

MITRE ATLAS Cross-Reference

OWASP LLM Top 10 Agent Severity

How It Fits Together

References & Sources

Builds on

Author

License & citation

The Agentic AI
Kill Chain