Binding
Binding is the problem of keeping track of what refers to what. A pronoun resolves against an antecedent, a variable use resolves against its definition, a conclusion depends on the premises that license it. The surface domains differ, but the shape is the same: connect a use to a nonlocal source while a closer, plausible-looking candidate competes for the link.
The difficulty is structural rather than lexical. A model can get local word statistics right and still bind to the wrong source, because the correct source is often farther away than a distractor. That is why binding is studied with minimal pairs: change one token, hold most of the surface statistics fixed, and see whether the resolution follows the structure or the proximity.
1. Source, use, distractor
Each binding case has three parts: a source that licenses the answer, a use that needs resolving, and a distractor that is closer or more salient but wrong. Figure 1 lays the three domains side by side. Toggling the distractor swaps the licensed link for the lure, which is the failure a good test is built to provoke.
Subject-verb agreement makes the structural cue concrete. In "The key to the cabinets is rusty," the verb agrees with "key," not the nearer plural "cabinets." A minimal pair changes only the head noun's number, "the key" versus "the keys," so the correct verb flips between "is" and "are" while the nearer noun stays the same. If the model prefers "are" whenever "cabinets" is adjacent, it followed proximity; if its choice tracks the head noun, it followed structure.
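The agreement test can be sketched as a small scoring harness. This is a minimal illustration, not a real evaluation: the two scorers are hypothetical stand-ins for a model's verb preference, the lexicon is a toy, and treating the first noun as the head only holds for sentences shaped like the example.

```python
from typing import Callable, List

# Toy lexicon (an assumption for illustration); a real evaluation would
# compare language-model scores for the two verb forms instead.
NOUN_NUMBER = {"key": "sg", "keys": "pl", "cabinet": "sg", "cabinets": "pl"}
VERB_NUMBER = {"is": "sg", "are": "pl"}

# Hypothetical scorer type: (prefix tokens, candidate verb) -> preference score.
Scorer = Callable[[List[str], str], float]


def nouns_in(prefix: List[str]) -> List[str]:
    """Return the nouns of the prefix in surface order."""
    return [tok for tok in prefix if tok in NOUN_NUMBER]


def proximity_scorer(prefix: List[str], verb: str) -> float:
    """Prefers the verb that agrees with the noun closest to the verb slot."""
    nearest = nouns_in(prefix)[-1]
    return 1.0 if NOUN_NUMBER[nearest] == VERB_NUMBER[verb] else 0.0


def structure_scorer(prefix: List[str], verb: str) -> float:
    """Prefers the verb that agrees with the head noun.

    In these toy sentences the head is simply the first noun.
    """
    head = nouns_in(prefix)[0]
    return 1.0 if NOUN_NUMBER[head] == VERB_NUMBER[verb] else 0.0


def prefers(scorer: Scorer, prefix: List[str], good: str, bad: str) -> bool:
    """True if the scorer ranks the grammatical verb above the other one."""
    return scorer(prefix, good) > scorer(prefix, bad)


def passes_minimal_pair(scorer: Scorer) -> bool:
    """Both members of the pair must be resolved correctly."""
    singular = "the key to the cabinets".split()
    plural = "the keys to the cabinets".split()
    return (prefers(scorer, singular, "is", "are")
            and prefers(scorer, plural, "are", "is"))
```

The proximity scorer gets the plural item right by accident, because the distractor happens to be plural too, and fails the singular item. Scoring the pair jointly is what separates it from the structure scorer.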
2. Three domains
The same source-use-distractor template recurs with different vocabularies. The examples below are schematic, but they make the shared shape concrete.
- Natural language
- Agreement, anaphora, and filler-gap dependencies all create source-use links that cross intervening words. The licensed source is set by grammar, and a nearer noun or name is the standard distractor.
- Code
- In `let total = 0; ... let count = item.count; ... return total`, the final use resolves to the accumulator while `count` sits nearby as a lure. The VarMisuse task tests whether a model tracks definitions and uses through program structure rather than text order.
- Logic
- From `A -> B`, `B -> C`, and `A`, the conclusion `C` depends on the chain through `B`. A premise such as `D -> C` is irrelevant unless `D` is established, so it functions as a distractor.
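The logic case can be checked mechanically. The sketch below is a deliberately simple forward chainer over single-premise rules; the function name `derive` and the rule encoding are assumptions made for illustration. It returns the rules that actually support a conclusion, so the distractor premise shows up as unused.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Rule = Tuple[str, str]  # (premise, conclusion), read as premise -> conclusion


def derive(facts: Set[str], rules: List[Rule], goal: str) -> Optional[FrozenSet[Rule]]:
    """Forward-chain from the facts and return the rules used to reach `goal`.

    Returns None if the goal cannot be derived. Each atom keeps the first
    derivation found, which is enough for this illustration.
    """
    support: Dict[str, FrozenSet[Rule]] = {fact: frozenset() for fact in facts}
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # A rule fires only if its premise is already established.
            if premise in support and conclusion not in support:
                support[conclusion] = support[premise] | {(premise, conclusion)}
                changed = True
    return support.get(goal)


rules = [("A", "B"), ("B", "C"), ("D", "C")]
used = derive({"A"}, rules, "C")
```

With only `A` established, `used` contains the two rules on the chain through `B` and omits `D -> C`. Calling `derive({"D"}, rules, "C")` instead makes `D -> C` the sole support, which matches the point that the premise is a distractor only while `D` is unestablished.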
3. Candidate mechanisms
- Induction-like copying
- A prefix match can route from a use to an earlier matching context and copy the following token or feature, the mechanism described on the Induction Heads page.
- Binding-ID features
- The model can mark a source and its later uses with a shared, abstract identifier that survives intervening tokens, so the link is carried by a feature rather than by position.
- Use-to-source attention
- A head can route directly from the use site to the licensed source, which is what a distractor closer in surface order is designed to disrupt.
- Scope-sensitive representations
- For code, a model can encode scope and syntax-tree position, so a use resolves through program structure rather than nearby text.
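The scope-sensitive option can be pictured with a small scope chain. This is a sketch of ordinary lexical scoping, not of any model's internals. It assumes, beyond what the snippet in the Code example states, that `count` is declared inside a nested block such as a loop body while `total` and the `return` sit in the enclosing function scope. The class and helper names are hypothetical.

```python
from typing import Dict, Optional, Set


class Scope:
    """A lexical scope that maps names to the line where they are defined."""

    def __init__(self, parent: Optional["Scope"] = None) -> None:
        self.parent = parent
        self.defs: Dict[str, int] = {}

    def define(self, name: str, line: int) -> None:
        self.defs[name] = line

    def resolve(self, name: str) -> Optional[int]:
        """Walk outward through enclosing scopes; None if the name is not visible."""
        scope: Optional[Scope] = self
        while scope is not None:
            if name in scope.defs:
                return scope.defs[name]
            scope = scope.parent
        return None

    def visible_names(self) -> Set[str]:
        """All names a use in this scope is allowed to refer to."""
        names: Set[str] = set()
        scope: Optional[Scope] = self
        while scope is not None:
            names.update(scope.defs)
            scope = scope.parent
        return names


def nearest_by_text(all_defs: Dict[str, int], use_line: int) -> str:
    """Text-order heuristic: pick the most recent definition before the use."""
    earlier = {name: line for name, line in all_defs.items() if line < use_line}
    return max(earlier, key=lambda name: earlier[name])


# Assumed layout: `total` at line 1 in the function scope,
# `count` at line 3 inside a nested block, `return` at line 5.
function_scope = Scope()
function_scope.define("total", 1)
loop_scope = Scope(parent=function_scope)
loop_scope.define("count", 3)
```

At the `return` site, `function_scope.visible_names()` contains only `total`, while `nearest_by_text({"total": 1, "count": 3}, 5)` picks `count`. The gap between those two answers is the kind of error a VarMisuse-style test is meant to expose.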
The binding-ID account has the most direct causal evidence. Feng and Steinhardt describe abstract binding vectors that tie an entity to its attributes; an intervention that swaps two entities' binding vectors swaps which attribute the model reports, which is the signature of a genuine binding variable rather than a positional cue. Dai and coauthors locate the same structure in a low-rank subspace, so the binding direction occupies a small part of the residual stream.
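The logic of the swap intervention can be shown on a toy. The sketch below is not the procedure from Feng and Steinhardt; it is a hand-built illustration in which each entity and attribute carries a made-up two-dimensional binding vector, and an answer is read out by matching vectors. All names and values are assumptions.

```python
from typing import Dict, Tuple

Vector = Tuple[float, ...]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def answer(query_entity: str,
           entity_ids: Dict[str, Vector],
           attribute_ids: Dict[str, Vector]) -> str:
    """Report the attribute whose binding vector best matches the entity's."""
    query_id = entity_ids[query_entity]
    return max(attribute_ids, key=lambda attr: dot(query_id, attribute_ids[attr]))


def swap_bindings(ids: Dict[str, Vector], a: str, b: str) -> Dict[str, Vector]:
    """Intervention: exchange the binding vectors of two items, nothing else."""
    swapped = dict(ids)
    swapped[a], swapped[b] = ids[b], ids[a]
    return swapped


# Toy context: "Alice lives in Paris. Bob lives in Rome."
# The binding vectors are invented orthogonal directions.
b0: Vector = (1.0, 0.0)
b1: Vector = (0.0, 1.0)
entity_ids = {"Alice": b0, "Bob": b1}
attribute_ids = {"Paris": b0, "Rome": b1}

before = answer("Alice", entity_ids, attribute_ids)
after = answer("Alice", swap_bindings(entity_ids, "Alice", "Bob"), attribute_ids)
```

Here `before` is `"Paris"` and `after` is `"Rome"`, even though the order of the entries is untouched. In this toy the answer follows the binding vector and ignores position, which is the pattern the intervention is designed to detect.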
4. Representation and routing together
Binding is not a single circuit because it combines two jobs. The model has to mark the source, a representation question about which feature or direction carries the identifier, and it has to deliver that mark to the use site, a routing question about which head reads it. A model can do one well and the other poorly, which is how a nearer distractor wins: the source is represented, but routing attends to proximity instead.
The lookback mechanism is the account that joins the two jobs into one algorithm: the binding ID is written as an address, carried forward as a pointer, and dereferenced by a retrieval head that looks back to the source. The mechanism a model actually uses varies by model and domain, so a result on agreement does not transfer automatically to code or logic; the shared template is what makes the comparison worth running. Where binding-ID interventions succeed, they show a model using an abstract variable, which is a stronger claim than a probe-level decodability result for the same dependency, because decodability alone does not show that the model relies on the information.
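The address, pointer, and dereference vocabulary can be made concrete with a toy token list. This sketch only illustrates the terms; it does not describe how any particular model implements them, and the class and function names are hypothetical. It also shows the failure described above, where the source is marked correctly but routing falls back on proximity.

```python
from typing import List, Optional


class Token:
    """A token with optional binding bookkeeping (illustrative fields only)."""

    def __init__(self, text: str,
                 address: Optional[int] = None,
                 pointer: Optional[int] = None) -> None:
        self.text = text
        self.address = address  # binding ID written at a candidate source
        self.pointer = pointer  # copy of a binding ID carried at a use site


def dereference(tokens: List[Token], use_index: int) -> Optional[str]:
    """Look back from the use and return the source whose address matches."""
    pointer = tokens[use_index].pointer
    if pointer is None:
        return None
    for tok in reversed(tokens[:use_index]):
        if tok.address == pointer:
            return tok.text
    return None


def nearest_source(tokens: List[Token], use_index: int) -> Optional[str]:
    """Proximity routing: return the closest earlier source, ignoring the ID."""
    for tok in reversed(tokens[:use_index]):
        if tok.address is not None:
            return tok.text
    return None


# "key" is the licensed source (address 0); "cabinets" is the distractor.
tokens = [
    Token("key", address=0),
    Token("to"),
    Token("cabinets", address=1),
    Token("<verb>", pointer=0),
]
```

With this setup `dereference(tokens, 3)` returns `"key"` and `nearest_source(tokens, 3)` returns `"cabinets"`. Both functions see the same correctly marked sources; only the routing rule differs.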
- Warstadt, Parrish, Liu, Mohananey, Peng, Wang, and Bowman (2020), "BLiMP", for linguistic minimal pairs.
- Allamanis, Brockschmidt, and Khademi (2018), "Learning to Represent Programs with Graphs", for VarMisuse and program-graph structure.
- Feng and Steinhardt (2024), "How do Language Models Bind Entities in Context?", for binding-ID interventions.
- Dai, Heinzerling, and Inui (2024), "Representational Analysis of Binding in Language Models", for low-rank binding subspaces.
- The Lookback Mechanism for how the binding ID is stored, carried, and dereferenced.
- Induction Heads for a copying mechanism that supports simple binding-like behavior.
- Dependency Trees & Structural Probes for another way to test nonlocal structure.