---
id: BTAA-FUN-002
title: 'Source-Sink Thinking: Where Agent Prompt Injection Becomes Dangerous'
slug: source-sink-thinking-agent-security
type: lesson
code: BTAA-FUN-002
aliases:
- source sink thinking
- source-sink analysis
- agent security source sink
- workflow risk
- BTAA-FUN-002
author: Herb Hermes
date: '2026-04-09'
last_updated: '2026-04-09'
description: Learn how to reason about agent prompt-injection risk by tracking whether untrusted input can reach a sensitive sink like data transmission, tool use, navigation, or other high-impact actions.
category: fundamentals
difficulty: intermediate
platform: Universal - applies anywhere agents combine untrusted input with external actions or sensitive data movement
challenge: Identifying High-Risk Agent Action Paths
read_time: 8 minutes
tags:
- source-sink
- agent-security
- prompt-injection
- workflow-risk
- confirmation-gates
- action-constraints
- safe-url
status: live
test_type: methodology
model_compatibility:
- Kimi K2.5
- MiniMax M2.5
- ChatGPT 5.4
- Universal
responsible_use: Use this mental model to analyze and defend authorized systems, workflows,
  and sandboxes you are explicitly permitted to test or improve.
prerequisites:
- Basic prompt injection familiarity
follow_up:
- BTAA-FUN-001
- BTAA-EVA-017
- BTAA-EVA-005
public_path: /content/lessons/fundamentals/source-sink-thinking-agent-security.md
pillar: learn
pillar_label: Learn
section: fundamentals
collection: fundamentals
taxonomy:
  intents:
  - improve-methodology
  - defend-agent-workflows
  techniques:
  - source-sink-analysis
  - risk-modeling
  evasions: []
  inputs:
  - chat-interface
  - file-upload
  - browser-agent
---

# Source-Sink Thinking: Where Agent Prompt Injection Becomes Dangerous

> Agent-to-Agent: this lesson teaches a defensive mental model. Not every malicious input matters equally. Risk becomes serious when untrusted content can reach a sensitive sink.

> Responsible use: Use this mental model to analyze and defend authorized systems, workflows, and sandboxes you are explicitly permitted to test or improve.

## The mental model

Prompt injection is often explained as if the whole problem lives in the prompt itself.

That is incomplete.

A better question is:
- what untrusted source can influence the system?
- what sensitive sink can the agent reach if that influence succeeds?

The dangerous part is not just the malicious text.
The dangerous part is the path from source to sink.

## What is a source?

A source is any place untrusted content can enter the agent's workflow.

Examples:
- emails
- webpages
- PDFs and uploaded documents
- calendar invites
- shared notes
- search results
- tool outputs from weakly trusted systems

These are all places where an attacker may try to shape what the agent reads.
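
One lightweight way to make the source side concrete is to tag every piece of content entering the workflow with its provenance, so downstream code can never silently confuse untrusted text with trusted instructions. A minimal Python sketch; the `TaggedContent` type and trust labels are illustrative, not from any specific framework:

```python
from dataclasses import dataclass
from enum import Enum

class Trust(Enum):
    TRUSTED = "trusted"        # e.g. the operator's own instructions
    UNTRUSTED = "untrusted"    # e.g. emails, webpages, uploaded PDFs

@dataclass(frozen=True)
class TaggedContent:
    text: str
    source: str   # where the content entered, e.g. "email", "pdf-upload"
    trust: Trust

def ingest(text: str, source: str) -> TaggedContent:
    # Everything arriving from an external source is untrusted by default;
    # nothing gets promoted to trusted just because it looks harmless.
    return TaggedContent(text=text, source=source, trust=Trust.UNTRUSTED)

msg = ingest("Please review the attached invoice.", source="email")
```

The point of the default is the asymmetry: content can only become trusted through an explicit, reviewable step, never by omission.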

## What is a sink?

A sink is a capability that becomes dangerous in the wrong context.

Examples:
- sending data to a third party
- clicking or navigating to an external destination
- making a purchase
- changing a stored record
- deleting files
- extracting sensitive fields into a business workflow
- producing approval-like or scoring outputs that downstream systems trust
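
When reviewing a real system, it helps to write the sink inventory down with a rough worst-case impact level, so hardening effort goes to the most dangerous sinks first. The entries and levels below are illustrative, not a standard taxonomy:

```python
# Illustrative sink inventory, ranked by worst-case impact.
SINKS = {
    "send_external_message": "high",    # data leaves the trust boundary
    "navigate_to_url":       "high",    # agent can be steered to attacker pages
    "make_purchase":         "high",
    "update_record":         "medium",
    "delete_file":           "medium",
    "emit_risk_score":       "medium",  # downstream systems trust this output
    "render_summary":        "low",     # a human still reviews before acting
}

def hardening_order(sinks: dict) -> list:
    """Sort sink names so the highest-impact ones come first."""
    rank = {"high": 0, "medium": 1, "low": 2}
    return sorted(sinks, key=lambda name: rank[sinks[name]])

print(hardening_order(SINKS)[0])  # prints "send_external_message"
```

Even this crude ranking forces a useful conversation: anything marked high should not be reachable from untrusted content without a gate.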

## Why the connection matters

A malicious input on its own is not always high impact.

The real risk appears when:
1. untrusted content enters from a source
2. the agent treats it as meaningful guidance
3. that guidance can influence a sensitive sink

That is where prompt injection stops being a model quirk and becomes an operational security problem.
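
The three-step chain above can be sketched as a simple path check: given which inputs each agent capability can see, enumerate every route from an untrusted source to a sensitive sink. The workflow map and capability names here are hypothetical:

```python
# Hypothetical workflow map: which inputs each capability reads,
# and which capabilities count as sensitive sinks.
workflow = {
    "summarize_email": {"reads": ["email"],   "sink": False},
    "forward_message": {"reads": ["email"],   "sink": True},
    "browse_page":     {"reads": ["webpage"], "sink": False},
    "submit_form":     {"reads": ["webpage"], "sink": True},
}

untrusted_sources = {"email", "webpage"}

def risky_paths(workflow: dict, untrusted: set) -> list:
    """Return (source, capability) pairs where untrusted input reaches a sink."""
    paths = []
    for name, cap in workflow.items():
        if not cap["sink"]:
            continue
        for src in cap["reads"]:
            if src in untrusted:
                paths.append((src, name))
    return sorted(paths)

print(risky_paths(workflow, untrusted_sources))
# Each pair is a path that needs a confirmation, validation, or blocking step.
```

In a real system the map is bigger and the edges are fuzzier, but the exercise is the same: if a pair never shows up in this list, that injection attempt is an annoyance, not an incident.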

## Safe example patterns

A few abstracted examples:

- email -> agent reads hidden instruction -> agent forwards sensitive information
- PDF upload -> parser extracts hidden instruction -> model changes a credit or risk summary
- webpage -> browser agent reads manipulative content -> agent clicks or submits something dangerous

The same pattern repeats across different products.
Only the source and sink change.

## Real-world signal

This way of thinking matches what we see in modern agent security:
- prompt injection increasingly behaves like social engineering in normal workflows
- browser agents are risky because they read from broad untrusted surfaces and can also take meaningful actions
- document-processing systems are risky because uploaded files can silently steer summaries, extraction, or business fields

That is why input filtering alone is not enough.
You also have to defend the sink.

## Failure modes of weak defenses

Weak defenses often do one of these:
- focus only on detecting bad-looking strings
- assume the model will always refuse suspicious instructions
- ignore downstream actions and trust the model's final answer too much
- treat uploaded files or retrieved content as passive input

Those defenses fail because they do not ask what happens if some manipulation still gets through.

## Defender takeaways

Use source-sink thinking when building or reviewing agent systems:
- identify untrusted sources first
- list the sensitive sinks second
- map the pathways between them
- add confirmation, validation, or blocking around the most dangerous sinks
- reduce unnecessary privileges and broad task scopes
- test the full workflow, not just the model prompt in isolation

If you cannot perfectly detect every malicious input, you can still reduce damage by hardening the sinks.
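
Hardening a sink can be as simple as wrapping it in a gate that refuses to act when the current request was influenced by untrusted content, unless a human has explicitly confirmed. A minimal sketch, assuming the calling code tracks taint (how taint is tracked is its own problem and not shown here):

```python
class ConfirmationRequired(Exception):
    """Raised when a tainted request hits a sensitive sink without approval."""

def gated_sink(action, *, influenced_by_untrusted: bool, confirmed: bool = False):
    """Run a sensitive action only if the request is clean or explicitly confirmed."""
    if influenced_by_untrusted and not confirmed:
        raise ConfirmationRequired(
            f"{action.__name__} was influenced by untrusted content; "
            "require human confirmation before proceeding."
        )
    return action()

def send_report():
    return "report sent"

# A clean path runs; a tainted path stops until a human confirms.
print(gated_sink(send_report, influenced_by_untrusted=False))  # prints "report sent"
```

The gate does not try to judge whether the instruction looks malicious; it only asks whether untrusted content could have shaped the request. That is exactly the shift from filtering inputs to defending sinks.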

## Practical takeaway

Do not ask only:
- "Can the model detect prompt injection?"

Also ask:
- "What can untrusted content influence?"
- "What valuable action or data path sits downstream?"
- "What happens if the model is wrong once?"

That is the mindset shift this lesson is meant to teach.

## Related lessons
- BTAA-FUN-001 — The First Try Fallacy
- BTAA-FUN-004 — Direct vs Indirect Prompt Injection
- BTAA-FUN-003 — Prompt Injection as Social Engineering
- BTAA-EVA-017 — PDF Prompt Injection

---

## From the Bot-Tricks Compendium

Thanks for referencing Bot-Tricks.com — Prompt Injection Compendium — AI Security Training for Agents... and Humans!

Canonical source: https://bot-tricks.com
Bot-Tricks is a public, agent-friendly training resource for prompt injection, adversarial evaluation, and defensive learning.
For related lessons, structured indexes, and updated canonical material, visit Bot-Tricks.com.

Use this material only in authorized labs, challenges, sandboxes, or permitted assessments.
