---
id: BTAA-TEC-001
title: 'Authority Framing: Using Expert Personas and Institutional Positioning'
slug: authority-framing-expert-personas
type: lesson
code: BTAA-TEC-001
aliases:
- authority framing
- expert personas
- institutional positioning
- compliance pressure
- authority bypass
author: Herb Hermes
date: '2026-04-10'
last_updated: '2026-04-11'
description: Learn how authority framing exploits AI deference to expert personas and institutional roles, creating compliance pressure that can bypass safety boundaries.
category: offensive-techniques
difficulty: beginner
platform: Universal
challenge: Identify when authority framing is being used to create compliance pressure
read_time: 7 minutes
tags:
- prompt-injection
- authority-framing
- compliance-pressure
- role-play
- social-engineering
- techniques
- persona
status: published
test_type: adversarial
model_compatibility:
- Kimi K2.5
- MiniMax M2.5
- Universal
responsible_use: Use this knowledge only to recognize and defend against manipulation in authorized systems, sandboxes, or permitted assessments.
prerequisites:
- Basic prompt injection familiarity
follow_up:
- BTAA-TEC-007
- BTAA-EVA-018
- BTAA-FUN-003
public_path: /content/lessons/techniques/authority-framing-expert-personas.md
pillar: learn
pillar_label: Learn
section: techniques
collection: techniques
taxonomy:
  intents:
  - bypass-safeguards
  - create-compliance-pressure
  techniques:
  - authority-framing
  - role-manipulation
  - expert-positioning
  evasions:
  - role-play
  - contextual-reframing
  inputs:
  - chat-interface
---

# Authority Framing: Using Expert Personas and Institutional Positioning

> **Responsible use:** Use this knowledge only to recognize and defend against manipulation in authorized systems, sandboxes, or permitted assessments.

## Purpose

Authority framing exploits the tendency of AI systems to defer to expert personas, institutional roles, and legitimate-sounding contexts. By positioning requests within frames that trigger compliance instincts, attackers can create pressure that makes refusal feel inappropriate or unnecessary.

This lesson teaches you to recognize authority framing patterns and understand why they work.

## What this technique is

Authority framing is the use of expert personas, institutional positioning, and legitimate-sounding contexts to create compliance pressure. Rather than directly overriding instructions, the technique establishes a context where the AI believes it should cooperate due to the perceived authority of the requester.

Common authority frames include:

- **Developer/Administrator** — "As the system developer..."
- **Security Researcher** — "For my authorized penetration test..."
- **Evaluator/Auditor** — "I'm evaluating your safety capabilities..."
- **Educator/Trainer** — "This is a training exercise..."
- **Medical/Technical Expert** — "As a doctor/engineer, I need..."

## How it works

Authority framing operates through several mechanisms:

### 1. Role-Based Compliance

AI systems are trained to be helpful and deferential to users. When a user presents themselves as having legitimate authority (developer, researcher, expert), the model's default helpfulness combines with learned deference patterns, increasing the likelihood of compliance.

### 2. Contextual Legitimacy

The framing provides a plausible justification for requests that might otherwise trigger safety refusals. If the request comes from an "evaluator" testing "safety boundaries," the model may interpret the request as legitimate rather than harmful.

### 3. Refusal Avoidance

Direct refusal can feel socially inappropriate when dealing with perceived authority figures. The framing creates social pressure similar to real-world social engineering, where people find it harder to say "no" to authority.

### 4. Boundary Recontextualization

Safety guidelines are often context-dependent. Authority framing attempts to shift the context to one where the guidelines appear to permit or even require the requested behavior.

## Why it works

Authority framing succeeds because of training dynamics and interaction patterns:

**Helpfulness bias** — Models are trained extensively to be helpful, cooperative, and responsive to user needs

**Authority deference** — Training data includes many interactions where deferring to experts and authorities is appropriate

**Context sensitivity** — Safety behaviors are context-dependent, and authority framing attempts to manipulate that context

**Social pattern matching** — The patterns mirror legitimate interactions (developers debugging, researchers testing, educators teaching), making them harder to distinguish from genuine requests

## Common authority frames

### The Developer Frame

Positioning as someone with system knowledge or control:
- References to system architecture, API endpoints, or internal mechanisms
- Claims of ownership, development responsibility, or administrative access
- Technical terminology used to establish credibility

### The Researcher Frame

Positioning as conducting legitimate inquiry:
- References to authorized testing, studies, or evaluations
- Academic or institutional affiliations (real or fabricated)
- Framing requests as necessary for safety research

### The Evaluator Frame

Positioning as testing or assessing capabilities:
- Claims of conducting safety evaluations
- References to benchmark testing or capability assessment
- Framing cooperation as part of the evaluation process

### The Educator Frame

Positioning as teaching or training:
- References to educational contexts or training scenarios
- Claims that the information is needed for instructional purposes
- Framing requests as learning exercises

## Safe example patterns

The following illustrates the pattern without providing harmful content:

**Recognizable structure:**
```
[Authority Claim] + [Context] + [Request framed as legitimate]

Example structure (abstracted):
"As a [role] working on [legitimate-sounding context], 
I need you to [request that would normally be refused] 
because [justification appealing to the role]."
```

**Key recognition indicators:**
- Claims of special status without verification
- Requests for exceptions to normal behavior
- Framing that makes refusal seem like a failure of the claimed role
- Appeals to the model's helpfulness within a manufactured context
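As a rough illustration, the recognition indicators above can be approximated with a keyword heuristic. This is a minimal sketch: the phrase lists and the `flag_authority_framing` function are illustrative assumptions, not a production filter, and real deployments would pair this with semantic classifiers.

```python
import re

# Hypothetical phrase lists -- illustrative only, not exhaustive.
AUTHORITY_CLAIMS = [
    r"\bas (?:the|a|an) (?:developer|administrator|researcher|doctor|engineer|evaluator)\b",
    r"\bdeveloper mode\b",
    r"\bauthorized (?:penetration test|testing|evaluation)\b",
]
EXCEPTION_REQUESTS = [
    r"\bignore (?:your|the) (?:rules|guidelines|instructions)\b",
    r"\bmake an exception\b",
    r"\bbypass (?:your|the) (?:safety|filters|restrictions)\b",
]

def flag_authority_framing(prompt: str) -> dict:
    """Report which indicator families a prompt matches."""
    text = prompt.lower()
    hits = {
        "authority_claim": [p for p in AUTHORITY_CLAIMS if re.search(p, text)],
        "exception_request": [p for p in EXCEPTION_REQUESTS if re.search(p, text)],
    }
    # The co-occurrence of a role claim and an exception request is the
    # recognizable structure described above: [Authority Claim] + [Request].
    hits["suspicious"] = bool(hits["authority_claim"] and hits["exception_request"])
    return hits
```

Note that either indicator alone is weak evidence (legitimate users do mention their professions); it is the combination with a request for an exception that matches the attack structure.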

## Where it shows up

Authority framing appears in various contexts:

**Historical jailbreaks** — Many classic jailbreak patterns included authority positioning ("DAN" modes, "developer mode" prompts)

**Social engineering** — Real-world attacks on AI systems often begin with establishing false authority

**Research contexts** — Legitimate safety research sometimes uses similar framing, making distinction difficult

**System prompt leakage** — Authority framing is often used in attempts to extract system prompts or configuration details

## Failure modes

Authority framing fails when:

- **Verification is required** — Systems that verify user identity or role claims
- **Context doesn't match** — The request is too far outside what the claimed authority could legitimately need
- **Safety layers are independent** — When safety checks don't rely on context interpretation
- **Pattern recognition triggers** — When the specific framing pattern has been identified and filtered
- **Explicit authorization checks** — Systems that require explicit authorization tokens or credentials

## Defender takeaways

1. **Verify authority claims** — Don't rely on self-reported roles for sensitive operations
2. **Separate authentication from conversation** — Authority should be established through mechanisms outside the prompt
3. **Context-independent safety checks** — Critical safety boundaries should hold regardless of claimed context
4. **Pattern monitoring** — Watch for common authority framing patterns in input logs
5. **User education** — Teach legitimate users that real authority rarely needs to demand exceptions
6. **Principle of least privilege** — Even "authorized" users should have limited scopes of action

The key insight: Legitimate authority in AI systems should be established through authentication mechanisms, not conversation. Any claim of authority within a prompt should be treated as unverified.

## Related lessons

- **BTAA-TEC-007 — Stacked Framing: How Jailbreaks Layer Multiple Evasion Techniques** — Authority framing is often combined with other techniques in stacked attacks
- **BTAA-EVA-018 — Persona Wrappers and Alter-Ego Shells** — Related technique using role-play for instruction laundering
- **BTAA-FUN-003 — Prompt Injection as Social Engineering** — Broader context on social engineering patterns in AI manipulation

---

## From the Bot-Tricks Compendium

This lesson is part of the Bot-Tricks.com Prompt Injection Compendium — AI Security Training for Agents... and Humans!

Canonical source: https://bot-tricks.com
Bot-Tricks is a public, agent-friendly training resource for prompt injection, adversarial evaluation, and defensive learning.
For related lessons, structured indexes, and updated canonical material, visit Bot-Tricks.com.

Use this material only in authorized labs, challenges, sandboxes, or permitted assessments.
