
Constraint Durability: The Missing Layer Between Policy and Trust

June 12, 2026

The agent didn't violate its constraints. It forgot them.

And why the only durable permission boundary is one the agent doesn't need to remember.

The Observation

We run a multi-task batch pipeline with AI coding agents. Each task in the pipeline generates three artifacts: a specification, a task ticket, and a completion report. A typical batch contains six tasks. By the time the final tasks run, the agent is carrying roughly 18,000 tokens of accumulated artifacts from the batch: about 9% of its 200,000-token context window, or 12% of the roughly 150,000 tokens available for execution, consumed before any code for the current task is written.

We started noticing something as sessions grew longer: the agent's adherence to its permission boundaries degraded. The constraints were still present in the system prompt, in the project configuration file, in the delegation rules loaded at session start. But the agent's reasoning about those constraints — the degree to which it actively referenced them when making decisions — diminished as context accumulated.

The agent didn't violate its constraints. It forgot them.

This observation led us to rethink what "context engineering" actually means when agents operate with real authority.


The Conventional Framing

Context engineering is one of the most discussed topics in the AI engineering community. The framing is almost universally about performance: how to compress more information into finite context windows, how to retrieve the right documents at the right time, how to summarize efficiently without losing critical details.

The research literature reflects this framing. Active Context Compression achieves 22.7% token reduction on SWE-bench benchmarks while maintaining accuracy. Context-Folding produces 10x smaller active context compared to ReAct baselines by collapsing completed sub-trajectories into outcome summaries. ContextBudget uses reinforcement learning to adaptively allocate token budgets, outperforming static allocation by 1.6x on long-horizon tasks. These are real advances that address real problems.

But there is a dimension to context engineering that this framing misses entirely.

The context window has become part of the trust boundary.

The Security Dimension

Every LLM agent operates within a context window that serves multiple functions simultaneously. It carries the system prompt (identity, role, behavioral rules). It carries tool schemas (what the agent can do). It carries authorization constraints (what the agent may and may not do). It carries conversation history. And it carries the active working state of whatever task the agent is currently executing.

All of these compete for the same finite resource: tokens.

The critical asymmetry is this: task artifacts grow monotonically during execution, while security constraints are static. A six-task batch generates roughly 3,000 tokens of artifacts per task. The security constraints — permission boundaries, scope rules, behavioral guardrails — were defined once, at session start, and do not grow.

When the window fills, something must be compressed or evicted. The agent's compression algorithm — whether built into the model, managed by the platform, or handled by external tooling — optimizes for task relevance. Instructions that the agent is actively using get preserved. Instructions that haven't been referenced in several turns get compressed or discarded.

Security constraints, by their nature, are the kind of instruction that becomes "irrelevant" to the compression algorithm. A rule like "do not read files outside the project directory" isn't referenced during normal task execution — it only becomes relevant when the agent considers doing the thing it prohibits. If the constraint has been compressed away before that moment arrives, the boundary no longer exists in the agent's effective reasoning context.
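
A toy model makes this failure mode concrete. The sketch below assumes a compactor that scores each context segment only by how recently it was referenced and keeps the freshest segments that fit. `Segment`, `compact`, the turn numbers, and the token counts are all illustrative assumptions, not any platform's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    tokens: int
    last_referenced_turn: int  # turn at which the agent last used this segment


def compact(segments: list[Segment], budget: int) -> list[Segment]:
    """Keep the most recently referenced segments that fit within `budget` tokens.

    Hypothetical recency-only policy: nothing marks a segment as security-critical.
    """
    kept: list[Segment] = []
    used = 0
    for seg in sorted(segments, key=lambda s: s.last_referenced_turn, reverse=True):
        if used + seg.tokens <= budget:
            kept.append(seg)
            used += seg.tokens
    return kept


RULE = "rule: do not read files outside the project directory"

context = [
    Segment(RULE, 200, last_referenced_turn=0),  # set at session start, never cited since
    Segment("task 5 spec", 1000, last_referenced_turn=38),
    Segment("task 5 ticket", 1450, last_referenced_turn=40),
    Segment("task 5 test output", 3000, last_referenced_turn=41),
]

survivors = {seg.name for seg in compact(context, budget=5500)}
print(RULE in survivors)  # the rule is the stalest segment, so it is the one dropped
```

The rule is the smallest segment in the window, yet it is the one that goes, because staleness is the only signal the policy looks at.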


What We Measured

Our pipeline processes specifications, tickets, and reports across multi-task batches. We measured the artifact overhead across several workstreams (115 specs, 157 tickets, 154 reports):

Artifact Type        Average Size    Token Estimate
Specification        249 lines       ~1,000 tokens
Task ticket          362 lines       ~1,450 tokens
Completion report    137 lines       ~550 tokens
Per-task total       748 lines       ~3,000 tokens

A six-task batch consumes approximately 18,000 tokens in artifacts alone. Against a 200,000-token window, that's 9% of capacity (and 12% of the roughly 150,000 tokens left for execution once fixed overhead is subtracted) before any source code is read, any tests are run, or any implementation work begins.

The remaining budget breaks down roughly as follows:

200K total tokens
 -20K  system prompt + rules + configuration
 -15K  design document (read once, then referenced)
 -10K  pipeline state files (batch plan, status, workstream)
 -5K   tool schemas
=150K  available for execution

Per-task budget in a 6-task batch: ~25K each
  Read source files:  ~5K
  Write code:         ~5K
  Run tests + lints:  ~3K
  Status updates:     ~2K
  = ~15K actual work, ~10K buffer

The problem is visible in the budget: security constraints live in the 20K system prompt allocation, competing for hot-layer attention against 150K of growing execution state.
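
The arithmetic behind these figures is easy to reproduce. The sketch below uses only the numbers quoted above; the variable names are ours.

```python
WINDOW = 200_000

# Measured per-task artifact sizes (token estimates from the artifact table).
ARTIFACT_TOKENS = {"specification": 1_000, "task_ticket": 1_450, "completion_report": 550}

# Fixed allocations taken from the budget breakdown.
FIXED_OVERHEAD = {
    "system_prompt_rules_config": 20_000,
    "design_document": 15_000,
    "pipeline_state_files": 10_000,
    "tool_schemas": 5_000,
}

TASKS_PER_BATCH = 6

per_task_artifacts = sum(ARTIFACT_TOKENS.values())        # 3,000
batch_artifacts = per_task_artifacts * TASKS_PER_BATCH    # 18,000
execution_budget = WINDOW - sum(FIXED_OVERHEAD.values())  # 150,000
per_task_budget = execution_budget // TASKS_PER_BATCH     # 25,000

print(f"artifacts per batch: {batch_artifacts:,} tokens")
print(f"share of full window: {batch_artifacts / WINDOW:.0%}")
print(f"share of execution budget: {batch_artifacts / execution_budget:.0%}")
print(f"per-task budget: {per_task_budget:,} tokens")
```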


Diagram 1: The Durability Problem
How security constraints lose attention as context fills.
Early (Light Load)
┌─────────────┬──────┬───────────────────────────────────────────────┐
│ ■ System    │ ■    │                                               │
│ + Policy    │ Tool │         ░░░ Available ░░░░░░░░░░░░░░░░░░░░    │
│   20K       │ 5K   │               175K                            │
│ [ACTIVE]    │      │                                               │
└─────────────┴──────┴───────────────────────────────────────────────┘
 ▲ Policy in hot zone — full constraint awareness

                                    ▼

Midpoint (Moderate Load)
┌─────────────┬──────┬─────────────┬─────────────────────────────────┐
│ ■ System    │ ■    │ ■ Earlier   │                                 │
│ + Policy    │ Tool │ task        │     ░░░ Available ░░░░░░░░░░░░  │
│   20K       │ 5K   │ artifacts   │           ~140K                 │
│ [ACTIVE]    │      │  ~9K        │                                 │
└─────────────┴──────┴─────────────┴─────────────────────────────────┘
 ▲ Policy still referenced — compression has not yet reached it

                                    ▼

Late (Heavy → Saturated)
┌─────────────┬──────┬───────────────────────┬───────────────────────┐
│ ■ System    │ ■    │ ■ Tasks 1-4 artifacts │                       │
│ + Policy    │ Tool │ + accumulated code    │  ░░░ Available ░░░░░  │
│   20K       │ 5K   │    ~50K+              │       ~125K           │
│ [COLD]      │      │   [HOT]               │                       │
└─────────────┴──────┴───────────────────────┴───────────────────────┘
 ▲ Policy pushed to cold zone           ▲ Heavy load reached
   Compression treats policy as            Constraint references fade
   "not recently referenced"               from agent reasoning

┌────────────────────────────────────────────────────────────────────┐
│ Security policy tokens = FIXED       ← does not grow               │
│ Task artifact tokens   = GROWING     ← grows with every task       │
│                                                                    │
│ The agent didn't violate its constraints. It forgot them.          │
└────────────────────────────────────────────────────────────────────┘

Context Budget Thresholds

Across extended use, we observed recurring patterns as context accumulated:

Context Load              Observed Behavior
Light (under ~20%)        Full constraint awareness. Peak reasoning quality.
Moderate (~20–40%)        Compression begins. Early quality drift on complex tasks.
Heavy (~50–65%)           Constraint references fade from reasoning. Security boundary weakens.
Saturated (above ~80%)    Auto-compaction fires. Verbal constraints are lost entirely.

As sessions grow longer, the agent's behavior changes — not because the model is less capable, but because the instructions that define the security boundary are no longer in the portion of the context that receives full attention.
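
A harness can act on these bands only if it can tell which one a session is in. The helper below is a minimal sketch: `classify_context_load` is a hypothetical name, and because the observed bands leave gaps (roughly 40–50% and 65–80%), it assumes each gap belongs to the lower band until the next threshold is crossed.

```python
def classify_context_load(used_tokens: int, window_tokens: int = 200_000) -> str:
    """Map context utilization to the bands observed above.

    Assumption: the unobserved gaps (40-50% and 65-80%) are folded into the
    band below them, so a session stays in a band until the next one begins.
    """
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    load = used_tokens / window_tokens
    if load < 0.20:
        return "light"
    if load < 0.50:
        return "moderate"
    if load < 0.80:
        return "heavy"
    return "saturated"


for used in (30_000, 60_000, 110_000, 170_000):
    print(f"{used:>7,} tokens -> {classify_context_load(used)}")
```

One plausible use is to trigger policy re-injection or artifact folding as soon as a session leaves the light band, rather than waiting for auto-compaction.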

Industry data aligns with this observation. Research on the "Lost in the Middle" phenomenon demonstrates that information placed in the middle of long contexts is retrieved less reliably than information near the beginning or end. The "Maximum Effective Context Window", the portion of the advertised context that the model actually uses effectively, is consistently lower than the marketed number.


The Constraint Durability Hierarchy

Not all constraints are equally fragile. We observed a clear hierarchy of durability — and we believe this hierarchy reveals something fundamental about how permission systems should be designed for agents.

We've started calling this property Constraint Durability: the degree to which a security control survives the context pressure of extended agent operation.

Constraint Type                      Durability    Mechanism
Gateway-enforced tool filtering      Immune        Operates outside the agent's context entirely
System prompt / config file rules    Resilient     Re-injected at session start and after compaction
Prompt file instructions             Resilient     Loaded from disk at each iteration
Conversational constraints           Fragile       Treated as chat history; evicted during compression
Delegation context in reasoning      Fragile       Depends on whether the agent actively references it

The most important row is the last one. Permission boundaries that exist in the agent's reasoning chain — rather than being structurally enforced — degrade under context pressure. If the agent was told "you have access to Notion read-only, not write" during the session, and the session later compacts, that constraint may not survive.

Verbal constraints are the most fragile class. A directive like "don't modify file X" — set during conversation rather than in the system prompt — vanishes after compaction. This is documented behavior, not speculation. The agent's compression treats it as stale conversation history, not as a load-bearing instruction.

The design principle that emerges: the only durable permission boundary is one the agent doesn't need to remember.
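
One way to apply the hierarchy is to inventory where each constraint lives and flag the fragile ones. The sketch below encodes the table as a lookup. The location labels, the `fragile_constraints` helper, and the sample inventory are illustrative assumptions.

```python
from enum import Enum


class Durability(Enum):
    IMMUNE = "immune"
    RESILIENT = "resilient"
    FRAGILE = "fragile"


# Mapping taken from the hierarchy table; the keys are illustrative labels.
DURABILITY_BY_LOCATION = {
    "gateway_tool_filtering": Durability.IMMUNE,
    "system_prompt_or_config": Durability.RESILIENT,
    "prompt_file": Durability.RESILIENT,
    "conversation": Durability.FRAGILE,
    "delegation_in_reasoning": Durability.FRAGILE,
}


def fragile_constraints(inventory: dict[str, str]) -> list[str]:
    """Return the constraints whose location makes them fragile."""
    return sorted(
        rule
        for rule, location in inventory.items()
        if DURABILITY_BY_LOCATION[location] is Durability.FRAGILE
    )


inventory = {
    "don't modify file X": "conversation",
    "Notion is read-only": "delegation_in_reasoning",
    "stay inside the project directory": "system_prompt_or_config",
    "only authorized tools are listed": "gateway_tool_filtering",
}

print(fragile_constraints(inventory))
```

Anything this audit returns is a candidate for promotion to a config file or, better, to the gateway.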

Diagram 2The Constraint Durability HierarchyWhere a constraint lives determines whether it survives.
IMMUNE                    RESILIENT                              FRAGILE
◄───────────────────────────────────────────────────────────────────────►

┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│  INFRASTRUCTURE  │  │  SYSTEM PROMPT   │  │  CONVERSATIONAL  │  │  DELEGATION IN   │
│                  │  │  / CONFIG FILE   │  │                  │  │  REASONING       │
├──────────────────┤  ├──────────────────┤  ├──────────────────┤  ├──────────────────┤
│                  │  │                  │  │                  │  │                  │
│ Gateway-enforced │  │ Rules loaded at  │  │ "Don't modify    │  │ "You have read-  │
│ tool filtering   │  │ session start    │  │  file X"         │  │  only access to  │
│                  │  │ and re-injected  │  │                  │  │  Notion"         │
│ delegation-grant │  │ after each       │  │ Set during       │  │                  │
│ enforcement      │  │ compaction       │  │ conversation     │  │ Carried in the   │
│                  │  │                  │  │                  │  │ agent's chain    │
├──────────────────┤  ├──────────────────┤  ├──────────────────┤  ├──────────────────┤
│                  │  │                  │  │                  │  │                  │
│ Agent NEVER      │  │ Survives         │  │ Treated as       │  │ Depends on       │
│ carries it       │  │ compaction       │  │ stale chat       │  │ whether agent    │
│                  │  │                  │  │ history          │  │ actively         │
│ Cannot be        │  │ Vulnerable to    │  │                  │  │ references it    │
│ forgotten        │  │ lost-in-the-     │  │ Evicted during   │  │                  │
│                  │  │ middle effect    │  │ compression      │  │ Fades under      │
│                  │  │                  │  │                  │  │ context pressure │
├──────────────────┤  ├──────────────────┤  ├──────────────────┤  ├──────────────────┤
│                  │  │                  │  │                  │  │                  │
│  ✓ IMMUNE        │  │  ✓ RESILIENT     │  │  ✗ FRAGILE       │  │  ✗ FRAGILE       │
│                  │  │                  │  │                  │  │                  │
└──────────────────┘  └──────────────────┘  └──────────────────┘  └──────────────────┘

         ▲                                                              ▲
         │                                                              │
    Enforced by                                                    Enforced by
    infrastructure                                                 memory
    (outside context)                                              (inside context)

┌──────────────────────────────────────────────────────────────────────────┐
│  Design Principle:                                                       │
│  Security-critical constraints must be architecturally durable,          │
│  not contextually durable.                                               │
└──────────────────────────────────────────────────────────────────────────┘

The Resource Contention Model

The underlying mechanism is resource contention — the same class of problem that operating systems have dealt with for decades.

Consider an analogy: a firewall that keeps its rules in a fixed-size, in-memory flow table. Under memory pressure, the kernel evicts entries that have not matched recently to make room for active connections. If a deny rule is evicted this way, traffic that should be blocked passes through until the rule is reloaded. The firewall didn't fail. The memory management system made a rational decision that happened to have a security consequence.

The same dynamic plays out in agent context windows:

  • Security constraints compete for the same finite resource (context tokens) as task execution.
  • Task artifacts grow monotonically during execution; security constraints are static.
  • Compression algorithms optimize for task relevance, not security salience.
  • The result: progressive, silent erosion of the authorization boundary.

This is not a model capability failure. The model is not "getting dumber." It is a resource allocation failure where the compression policy does not distinguish between security-critical and security-irrelevant content.
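
The contention can be put in rough numbers. The sketch below reuses the budget figures from earlier and assumes, purely for illustration, that each task adds about 18K tokens (roughly 3K of artifacts plus 15K of working state) and that nothing is folded or evicted. It also treats the whole 20K system prompt allocation as policy, which overstates the true policy share.

```python
POLICY_TOKENS = 20_000       # system prompt + rules + configuration (fixed)
OTHER_FIXED_TOKENS = 30_000  # design document, pipeline state files, tool schemas
GROWTH_PER_TASK = 18_000     # assumed: ~3K artifacts + ~15K working state, nothing folded


def policy_share(tasks_completed: int) -> float:
    """Fraction of occupied context that is policy after `tasks_completed` tasks."""
    occupied = POLICY_TOKENS + OTHER_FIXED_TOKENS + GROWTH_PER_TASK * tasks_completed
    return POLICY_TOKENS / occupied


for n in range(0, 7, 2):
    print(f"after {n} tasks: policy is {policy_share(n):.0%} of occupied context")
```

Under these assumptions the policy's share of occupied context falls from 40% at session start to about 13% after six tasks, without a single policy token being removed.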

    Diagram 3: The Resource Contention Model
    Task artifacts grow. Security constraints don't. Something has to give.
      Context
      Tokens
      200K ┬───────────────────────────────────────────────────────┐
           │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   │
           │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░    │
           │░░░░░░░░░░░░░░  Available context  ░░░░░░░░░░░░░░░     │
           │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░        │
           │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░          │
           ├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
           │  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓   │
           │  ▓▓   HEAVY → SATURATED ZONE                     ▓▓   │
           │  ▓▓   Constraint references fade from reasoning  ▓▓   │
           │  ▓▓   Verbal constraints lost after compaction   ▓▓   │
           │  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓   │
           │██████                                        ████████ │
           │████████████                            ██████████████ │
           │██████████████████              ██████████████████████ │
           │██████████████████████████████████████████████████████ │
           │██  Task artifacts (specs, tickets, code, tests)  ████ │
           │██████████████████████████████████████████████████████ │
           ├───────────────────────────────────────────────────────┤
           │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒  │
           │▒▒  Security policy (FIXED — does not grow)  ▒▒▒▒▒▒▒▒  │
           │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒  │
       0K  ┴───────────────────────────────────────────────────────┘
           Early              Midpoint                Late
           ◄──────────── Session Duration ────────────────►
    
      ▒▒ Security policy   — fixed at session start
      ██ Task artifacts     — grows with every task
      ░░ Available context  — shrinks as artifacts accumulate
      ▓▓ Degradation zone   — constraints fade from reasoning
    
      The model is not getting dumber.
      The compression policy does not distinguish between
      security-critical and security-irrelevant content.

    Implications for Permission Architecture

    This analysis has direct implications for how we design permission systems for AI agents — and introduces a new architectural primitive: Constraint Durability.

    Context-dependent constraints are fragile

    Any permission boundary that relies on the agent "remembering" its constraints is vulnerable to context drift. The longer the session, the more tasks the agent executes, and the more artifacts accumulate, the more likely it is that the constraint will be compressed away.

    This includes:

    • Verbal instructions given during a session
    • Permission scope descriptions embedded in the system prompt (if the prompt is long and the constraint is in the middle)
    • Delegation context that the agent carries in its reasoning chain

    Infrastructure-enforced constraints are durable

    Permission boundaries that operate outside the agent's context window are immune to context drift. The agent cannot forget a constraint it never carried. Examples include:

    • Gateway-level tool filtering: the agent's tools/list call returns only the tools it's authorized to use. Tools outside its scope are invisible, not forbidden.
    • Delegation-grant enforcement: the gateway evaluates each tool call against the delegation grant. The agent doesn't need to know its boundaries — the infrastructure enforces them.
    • Permission Envelope Compilation: the effective authority is computed at the infrastructure layer from Policy ∩ Delegation ∩ Intent Envelope, not stored in the agent's prompt.
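
The first two mechanisms can be sketched in a few lines. This is a minimal in-process illustration, not a real gateway: `DelegationGrant`, `Gateway`, and the two Notion tools are hypothetical names, and a production gateway would sit on the network path between the agent and its tool servers.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class DelegationGrant:
    """Hypothetical grant: the set of tool names an agent may list and call."""
    agent_id: str
    allowed_tools: frozenset[str]


class Gateway:
    """Sits between the agent and its tools; the agent never sees the grant."""

    def __init__(self, tools: dict[str, Callable[..., Any]], grant: DelegationGrant):
        self._tools = tools
        self._grant = grant

    def list_tools(self) -> list[str]:
        # Out-of-scope tools are omitted, so they are invisible rather than forbidden.
        return sorted(name for name in self._tools if name in self._grant.allowed_tools)

    def call_tool(self, name: str, **kwargs: Any) -> Any:
        # Checked on every call, independent of anything in the agent's context.
        if name not in self._grant.allowed_tools or name not in self._tools:
            raise PermissionError(f"tool {name!r} is not available to {self._grant.agent_id}")
        return self._tools[name](**kwargs)


tools = {
    "notion_read": lambda page: f"contents of {page}",
    "notion_write": lambda page, text: f"wrote {len(text)} chars to {page}",
}
gateway = Gateway(tools, DelegationGrant("agent-7", frozenset({"notion_read"})))

print(gateway.list_tools())  # only the read tool is visible
print(gateway.call_tool("notion_read", page="roadmap"))
```

Nothing in this path depends on the agent's context. Even if the agent hallucinates a call to `notion_write` after compaction, the check in `call_tool` still runs.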

    Diagram 4: Where Should Your Constraints Live?
                            IN CONTEXT                IN INFRASTRUCTURE
                     ┌───────────────────────┬───────────────────────────┐
                     │                       │                           │
                     │   Conversational      │                           │
         FRAGILE     │   constraints         │        ─ ─ ─ ─ ─ ─        │
                     │                       │   (infrastructure is      │
                     │   "Don't modify       │    inherently durable)    │
                     │    file X"            │                           │
                     │                       │                           │
                     │   ✗ Lost after        │                           │
                     │   compaction          │                           │
                     │                       │                           │
                     ├───────────────────────┼───────────────────────────┤
                     │                       │                           │
                     │   Config file /       │   Gateway-enforced        │
         DURABLE     │   system prompt       │   tool filtering          │
                     │   rules               │                           │
                     │                       │   delegation-grant        │
                     │   ✓ Re-injected       │   enforcement             │
                     │   after compaction    │                           │
                     │                       │   ✓ Agent never           │
                     │                       │   carries it              │
                     │                       │                           │
                     └───────────────────────┴───────────────────────────┘
    
                  Goal: move constraints DOWN and to the RIGHT ↘
    
         The only durable permission boundary is one the agent
         doesn't need to remember.

    Context budgeting is a security control

    If security constraints live in the context window at all (which is currently unavoidable for behavioral rules), then context budget management is not optional — it is a security requirement.

    Three approaches from the research literature map directly to this problem:

    Context Folding — after each task completes, collapse its artifacts to a one-line outcome summary. This reclaims ~3,000 tokens per task, preserving headroom for security constraints. The research shows 10x reduction in active context versus leaving completed task details in place.
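
As a rough illustration of folding, the sketch below collapses a completed task's artifacts into a single summary line and compares token estimates before and after. `TaskRecord`, `fold`, and the four-characters-per-token heuristic are assumptions for the example, not the method from the Context-Folding research.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    task_id: str
    outcome: str          # e.g. "done, 14 tests passing"
    artifacts: list[str]  # full spec, ticket, and report text


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); an assumption, not a tokenizer."""
    return max(1, len(text) // 4)


def fold(task: TaskRecord) -> str:
    """Collapse a completed task to a one-line outcome summary."""
    return f"[{task.task_id}] {task.outcome}"


task = TaskRecord(
    task_id="task-1",
    outcome="done, 14 tests passing",
    artifacts=["spec " * 800, "ticket " * 800, "report " * 300],  # placeholder text
)

before = sum(estimate_tokens(a) for a in task.artifacts)
after = estimate_tokens(fold(task))
print(f"before folding: ~{before} tokens, after: ~{after} tokens")
```

The full artifacts stay on disk; only the summary line remains in context, so the agent can re-read a spec if a later task needs it.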

    Budget-Aware Planning — allocate a token budget for security constraints that is never compressed. Treat policy tokens the way an operating system treats kernel-reserved memory: the security budget is not available for task execution, regardless of memory pressure.
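
A minimal sketch of a reserved policy budget, assuming the compaction layer can tag segments: pinned segments are kept unconditionally and only the remainder of the budget is filled by recency. `Segment`, `pinned`, and `compact_with_reserve` are hypothetical names, and the token counts are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    tokens: int
    last_referenced_turn: int
    pinned: bool = False  # True for security policy: never eligible for eviction


def compact_with_reserve(segments: list[Segment], budget: int) -> list[Segment]:
    """Keep every pinned segment, then fill the remaining budget by recency."""
    pinned = [s for s in segments if s.pinned]
    reserved = sum(s.tokens for s in pinned)
    if reserved > budget:
        raise ValueError("pinned policy exceeds the context budget")
    kept, used = list(pinned), reserved
    evictable = sorted(
        (s for s in segments if not s.pinned),
        key=lambda s: s.last_referenced_turn,
        reverse=True,
    )
    for seg in evictable:
        if used + seg.tokens <= budget:
            kept.append(seg)
            used += seg.tokens
    return kept


context = [
    Segment("rule: stay inside the project directory", 200, 0, pinned=True),
    Segment("task 5 spec", 1000, 38),
    Segment("task 5 ticket", 1450, 40),
    Segment("task 5 test output", 3000, 41),
]

kept_names = [s.name for s in compact_with_reserve(context, budget=5500)]
print(kept_names)  # the pinned rule survives; the stalest task artifact is dropped
```

This keeps the rule in the window, but it does not guarantee the model attends to it, which is why structural enforcement remains the stronger option.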

    Structural Enforcement — move the constraint out of the context entirely. This is the most durable approach but requires infrastructure support. Gateway-level tool filtering, delegation-grant enforcement, and permission envelope compilation all fall into this category.


    Connection to Permission Envelope Compilation

    In a previous analysis, we described Permission Envelope Compilation — a runtime mechanism where the effective authority for an agent is derived from the intersection of policy, delegation, and intent:

    Effective Authority = Policy ∩ Delegation ∩ Intent Envelope

    PEC answered: How should authority be derived?

    Constraint Durability answers the next question: How do you make authority survive?

    The context drift analysis reveals why PEC's formula is necessary but not sufficient as a prompt-level construct. If the permission envelope is carried in the agent's context, it is subject to the same compression and eviction dynamics as any other instruction. The envelope must be enforced at the infrastructure layer — evaluated per-tool-call, not stored in the prompt.
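
Evaluated per tool call, the formula reduces to a set intersection computed outside the agent. The sketch below assumes permissions can be modeled as flat strings such as "notion:read"; the function names and the sample sets are illustrative, not the PEC implementation.

```python
def effective_authority(
    policy: frozenset[str],
    delegation: frozenset[str],
    intent_envelope: frozenset[str],
) -> frozenset[str]:
    """Effective Authority = Policy ∩ Delegation ∩ Intent Envelope."""
    return policy & delegation & intent_envelope


def authorize(
    action: str,
    policy: frozenset[str],
    delegation: frozenset[str],
    intent_envelope: frozenset[str],
) -> bool:
    """Evaluate one tool call against the envelope, recomputed on every call."""
    return action in effective_authority(policy, delegation, intent_envelope)


policy = frozenset({"notion:read", "notion:write", "repo:read"})
delegation = frozenset({"notion:read", "repo:read", "repo:write"})
intent = frozenset({"notion:read", "notion:write"})

print(sorted(effective_authority(policy, delegation, intent)))
print(authorize("notion:write", policy, delegation, intent))
```

Because the envelope is recomputed from its three inputs on each call, nothing about it needs to survive in the agent's context.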

    This produces a design principle:

    Security-critical constraints must be architecturally durable, not contextually durable.

    Don't tell the agent what it can't do. Make it so the agent can't.

    The framework progression becomes:

    Identity → Delegation → Compilation → Durability

    You don't just compile authority correctly. You ensure it survives.


    Constraint Durability as a Design Principle

    Everyone optimizing context windows for cost and latency is also making a security decision — whether they realize it or not.

    Every compression algorithm that decides "this instruction is less relevant than the current task" is deciding what the agent is effectively authorized to do. Every token budget that prioritizes task artifacts over policy constraints is widening the effective attack surface. Every session that runs long enough to trigger compaction is silently testing whether the security boundary survives.

    The attack surface isn't just what the agent can access.

    It's also what the agent can forget it shouldn't do.

    The only durable permission boundary is one the agent doesn't need to remember.

    Trustworthy agent systems therefore require two complementary properties:

  • Authority must be derived correctly.
  • Authority must survive over time.

    Compilation gives us the first. Durability gives us the second.

    Diagram 5: The DeepTrail Trust Chain
     ┌──────────┐       ┌──────────────┐       ┌──────────────┐       ┌──────────────┐
     │          │       │              │       │              │       │              │
     │ IDENTITY │──────▶│  DELEGATION  │──────▶│ COMPILATION  │──────▶│  DURABILITY  │
     │          │       │              │       │              │       │              │
     └────┬─────┘       └──────┬───────┘       └──────┬───────┘       └──────┬───────┘
          │                    │                      │                      │
          ▼                    ▼                      ▼                      ▼
     Who is this          What may it            How is authority        Does authority
     agent?               do on whose            derived at              survive over
                          behalf?                runtime?                time?
    
     Agent Identity       Coordination           Permission Envelope     Constraint
                          Integrity              Compilation             Durability

    This is the sixteenth post in an ongoing series on AI agent security architecture. Previous posts covered coordination integrity (how parallel agents diverge from each other), instruction robustness (how agents diverge from their specs across models), semantic irrevocability (why agent side effects can't be undone), governable execution (what Uber's agent architecture reveals), and permission envelope compilation (how to scope authority to intent at runtime). Constraint durability completes the trust chain: how to ensure the authority boundary survives.


    References