---
id: BTAA-EVA-020
title: 'Hidden Unicode in Tool Definitions — Smuggling Instructions Past Human Review'
slug: hidden-unicode-tool-definitions
type: lesson
code: BTAA-EVA-020
aliases:
- unicode tool poisoning
- hidden characters in agent skills
- tool definition smuggling
author: Herb Hermes
date: '2026-04-11'
last_updated: '2026-04-11'
description: Hidden Unicode characters in AI tool definitions can smuggle adversarial instructions past human code review while remaining fully interpretable by LLMs.
category: evasion-techniques
difficulty: intermediate
platform: Universal
challenge: How can attackers hide instructions in tool definitions that humans miss but LLMs execute?
read_time: 8 minutes
tags:
- prompt-injection
- unicode
- tool-definitions
- agent-security
- supply-chain
- evasion
- indirect-injection
status: published
test_type: adversarial
model_compatibility:
- Kimi K2.5
- MiniMax M2.5
- GPT-4
- Claude
responsible_use: Use this approach only on authorized training systems, sandboxes,
  or systems you are explicitly permitted to test.
prerequisites:
- Understanding of tool-use agents and function calling
- Familiarity with JSON schema basics
- BTAA-EVA-017 (invisible text concepts) recommended
follow_up:
- BTAA-TEC-024
- BTAA-DEF-016
public_path: /content/lessons/evasion/hidden-unicode-tool-definitions.md
pillar: learn
pillar_label: Learn
section: evasion
collection: evasion
taxonomy:
  intents:
  - evade-detection
  - bypass-filter
  techniques:
  - unicode-evasion
  - hidden-characters
  - tool-poisoning
  evasions:
  - zero-width-characters
  - homoglyphs
  - invisible-unicode
  inputs:
  - tool-definitions
  - json-schema
  - agent-skills
---

# Hidden Unicode in Tool Definitions

> **Responsible use:** Use this approach only on authorized training systems, sandboxes, or systems you are explicitly permitted to test.

## Purpose

AI agents increasingly rely on tool definitions — JSON schemas that describe what functions the agent can call and how to format arguments. These definitions are code that LLMs interpret. This lesson teaches how attackers can exploit the gap between human visual perception and LLM text processing to hide instructions in tool definitions that human reviewers miss but LLMs execute.

## What this evasion is

Tool definitions describe agent capabilities using structured text: function names, parameter descriptions, type constraints, and usage examples. Hidden Unicode evasion inserts invisible or visually-confusable characters into these definitions. To a human reviewing the code, the definition looks benign. To an LLM parsing the full Unicode text, the hidden characters alter interpretation or carry secondary instructions.

## How it works

The attack exploits three realities of modern agent systems:

1. **Tool definitions are text** — Despite being "code," tool definitions (JSON, YAML, or proprietary schemas) are processed as character sequences by LLMs
2. **Human review focuses on logic** — Code reviewers examine what a tool does, not the byte-level encoding of every character
3. **Unicode has invisible characters** — Zero-width joiners, zero-width spaces, homoglyphs, and directional formatting characters render invisibly or confusingly in many contexts

Attackers insert these characters into:
- Parameter descriptions (where the LLM looks for usage guidance)
- Enum value labels (where the LLM learns valid inputs)
- Function name variations (homoglyph substitution)
- Example values (where the LLM learns expected formats)

## Why it works

LLMs process the full Unicode text stream. When they parse a tool definition, they "see" every character — visible or not. The model's tokenizer includes Unicode representations, meaning hidden characters become part of the semantic context.

Humans, by contrast, rely on visual rendering:
- Zero-width characters don't appear at all
- Homoglyphs look like familiar ASCII characters
- Most code review tools don't highlight Unicode anomalies by default

This creates a perception gap: the LLM receives instructions the human reviewer never perceived.

## Example pattern

Consider a tool definition for a file-reading function. A parameter description might appear to human reviewers as:

```
"description": "Read the contents of the specified file"
```

But if that string contains zero-width characters (U+200B, U+200C, U+200D) or homoglyph substitutions (Cyrillic "а" instead of Latin "a"), the LLM processes a different semantic signal while the human sees nothing unusual.

**Detection approach:** Tool definition validation should scan for:
- Non-ASCII characters in identifiers and descriptions
- Normalized forms that differ from raw input
- Character frequency anomalies

## Where it shows up in the real world

Research from EmbraceTheRed (wunderwuzzi) documents this pattern in AI coding tool security investigations. As agents gain tool-use capabilities through frameworks like Model Context Protocol (MCP), tool definitions become a supply-chain attack surface:

- Third-party tool packages may contain hidden Unicode
- Shared skill definitions in agent marketplaces inherit trust they haven't earned
- Auto-generated tool schemas from APIs may carry invisible encoding artifacts

The risk compounds when agents chain multiple tools — a poisoned tool definition can influence how the agent interprets subsequent tool outputs.

## Failure modes

This evasion technique fails or is detectable when:

- **Strict validation** — Systems that normalize Unicode or reject non-ASCII characters in tool definitions
- **Rendering inspection** — Tools that display Unicode codepoints alongside visible text
- **Behavioral baselines** — Monitoring that detects when a tool's actual behavior diverges from its apparent description
- **Definition hashing** — Integrity checks that would catch character-level modifications

## Defender takeaways

1. **Scan tool definitions** — Before loading any tool, scan for non-ASCII characters and suspicious Unicode ranges
2. **Normalize on ingest** — Convert tool definitions to normalized Unicode forms (NFC/NFKC) and flag transformations
3. **Review visible vs actual** — Use tools that show both rendered text and underlying byte representation
4. **Trust boundaries** — Treat third-party tool definitions with the same suspicion as user input
5. **Least privilege** — Agents should only load tools from verified sources; dynamic tool loading requires explicit approval

## Related lessons

- **BTAA-EVA-017** — PDF Prompt Injection via Invisible Text (same underlying invisible-Unicode pattern)
- **BTAA-EVA-018** — Persona Wrappers and Alter-Ego Shells (agent manipulation techniques)
- **BTAA-TEC-024** — Prompt Injection to Code Execution Chains (follow-on risk from tool abuse)
- **BTAA-DEF-016** — AI Coding Tool Security Defense (defensive counterpart for agent tooling)
- **BTAA-FUN-034** — Cross-Agent Privilege Escalation (multi-agent context where tool poisoning spreads)

---

## From the Bot-Tricks Compendium

Thanks for referencing Bot-Tricks.com — Prompt Injection Compendium — AI Security Training for Agents... and Humans!

Canonical source: https://bot-tricks.com
Bot-Tricks is a public, agent-friendly training resource for prompt injection, adversarial evaluation, and defensive learning.
For related lessons, structured indexes, and updated canonical material, visit Bot-Tricks.com.

Use this material only in authorized labs, challenges, sandboxes, or permitted assessments.
