QK and OV Circuits

QK sets the attention weights. OV maps the weighted values into a residual-stream update.

The standard attention formula combines two operations inside one head. The query-key part sets the attention weights over source positions. The value-output part maps the selected source representations into the vector added to the residual stream. Elhage, Nanda, Olsson, and coauthors describe these as the QK and OV circuits.

An attention pattern shows where weight went. A mechanism claim also needs the write: what vector the head added, whether later components read it, and whether changing that write changes the behavior under study. A head can attend to the right token and write irrelevant information, or attend to a boring-looking token and write a specific feature.

Circuit	Equations	Role
QK	$q_t = x_t W_Q,\quad k_i = x_i W_K,\quad s_{ti} = q_t k_i^\top / \sqrt{d_k},\quad a_{ti} = \operatorname{softmax}(s_t)_i$	Scores source positions for destination token $t$.
OV	$v_i = x_i W_V,\quad z_t = \sum_i a_{ti} v_i,\quad \Delta x_t = z_t W_O$	Turns the weighted values into a residual-stream update.
Combined	$\Delta x_t = \sum_i a_{ti} x_i W_V W_O$	The attention weights and the written vector can be analyzed separately.
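The split in the table can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the shapes, the random weights, and the helper names `qk_pattern` and `ov_write` are illustrative assumptions, and no causal mask is applied, matching the formulas as written above.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

def qk_pattern(x, W_Q, W_K):
    """QK circuit: attention weights a[t, i] over source positions i."""
    q = x @ W_Q                      # one query row per destination position
    k = x @ W_K                      # one key row per source position
    d_k = W_Q.shape[1]
    return softmax(q @ k.T / np.sqrt(d_k))

def ov_write(x, a, W_V, W_O):
    """OV circuit: turn the weighted values into a residual-stream update."""
    z = a @ (x @ W_V)                # z_t = sum_i a[t, i] * v_i
    return z @ W_O                   # delta x_t = z_t W_O

# Toy sizes and random weights (assumptions for illustration only).
rng = np.random.default_rng(0)
n_pos, d_model, d_k = 5, 16, 4
x = rng.normal(size=(n_pos, d_model))
W_Q = rng.normal(size=(d_model, d_k))
W_K = rng.normal(size=(d_model, d_k))
W_V = rng.normal(size=(d_model, d_k))
W_O = rng.normal(size=(d_k, d_model))

a = qk_pattern(x, W_Q, W_K)
delta = ov_write(x, a, W_V, W_O)

# The "Combined" row: the same update with W_V W_O folded into one matrix.
combined = a @ x @ (W_V @ W_O)
assert np.allclose(delta, combined)
```

Because `a` is computed before `W_V` or `W_O` are touched, the pattern can be inspected, frozen, or swapped without changing the OV matrices, which is what makes the two parts separately analyzable.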

Figure 1 · QK weights, OV write. Interactive figure with controls for the destination token and the QK scale (inverse temperature).

1. QK sets the attention weights

For each destination position, a head forms a query vector. For each possible source position, it forms a key vector. The dot products between query and keys become attention scores. After softmax, the destination receives a weighted mixture of source values. This part of the head determines which source positions receive high weight for the current destination token.
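Substituting the definitions of $q_t$ and $k_i$ into the score shows that the attention scores depend on $W_Q$ and $W_K$ only through their product. The shorthand $W_{QK}$ is introduced here for illustration and is an assumption about notation, not a symbol defined elsewhere on this page.

$$
s_{ti} = \frac{q_t k_i^\top}{\sqrt{d_k}} = \frac{(x_t W_Q)(x_i W_K)^\top}{\sqrt{d_k}} = \frac{x_t \, W_Q W_K^\top \, x_i^\top}{\sqrt{d_k}} = \frac{x_t W_{QK} \, x_i^\top}{\sqrt{d_k}}, \qquad W_{QK} := W_Q W_K^\top
$$

Read this way, the QK circuit is a single bilinear form on pairs of residual-stream vectors: $W_{QK}$ has shape $d_{\text{model}} \times d_{\text{model}}$ but rank at most $d_k$.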

In an interpretable example, QK might implement "look to the previous token", "look to the matching earlier token", "look to the subject of this verb", or "look to the beginning-of-sequence token". Those descriptions are summaries of a pattern, not proof of a function. A head's causal role also depends on what the OV circuit does with the attended source.

2. OV writes the residual-stream update

The value and output matrices determine the vector added to the residual stream. That write can be read later by other heads, by MLPs, or by the final unembedding matrix. In a copying head, the OV circuit may write in a direction that promotes the source token. In a suppression head, it may write in a direction that lowers a token's logit. In a syntactic head, it may write information about sentence structure that later components read.

Whether a head copies can be read off the OV path directly. Compose it with the embedding and unembedding to get the token-to-token map $W_E W_V W_O W_U$, which scores how attending to source token $i$ changes the logit of output token $j$. A copying head has large positive entries on the diagonal of that matrix: attending to a token promotes the same token at the output. The framework summarizes the matrix by the fraction of its eigenvalues that are positive, a copying score for the OV path that does not depend on where any particular attention pattern happened to point. The two views are linked because the sum of the diagonal, the trace, equals the sum of the eigenvalues. A suppression head shows the opposite sign on the diagonal.
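A copying score of this kind can be sketched on toy matrices. Everything below is an illustrative assumption: the tied unembedding `W_U = W_E.T`, the hand-built copying and suppression heads, and the magnitude-weighted statistic in `copying_score` (sum of eigenvalues over sum of their absolute values), which is one way to express "how positive are the eigenvalues" and may differ from the exact statistic used in any given paper.

```python
import numpy as np

def copying_score(token_map):
    """Magnitude-weighted sign of the eigenvalues, in [-1, 1].

    +1 means all eigenvalue mass is positive (copying-like),
    -1 means all of it is negative (suppression-like).
    """
    eigvals = np.linalg.eigvals(token_map)
    return float(eigvals.real.sum() / np.abs(eigvals).sum())

rng = np.random.default_rng(0)
vocab, d_model, d_head = 20, 8, 4

W_E = rng.normal(size=(vocab, d_model))
W_U = W_E.T                          # tied unembedding: a toy simplification

# Hypothetical heads: W_V @ W_V.T is positive semidefinite, so the first
# head promotes whatever it reads; flipping the sign of W_O suppresses it.
W_V = rng.normal(size=(d_model, d_head))
W_O_copy = W_V.T
W_O_suppress = -W_V.T

copy_map = W_E @ W_V @ W_O_copy @ W_U          # vocab x vocab token map
suppress_map = W_E @ W_V @ W_O_suppress @ W_U

print("copy head     score:", round(copying_score(copy_map), 3))
print("suppress head score:", round(copying_score(suppress_map), 3))
print("copy diagonal all positive:", bool(np.all(np.diag(copy_map) > 0)))
```

Neither score uses an attention pattern: both are properties of the weights alone, so they describe what the head would write if it attended to a token, not where it attends on any particular input.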

Attention weights underdetermine the mechanism. A large attention weight shows which source position was read. It does not say what was read, whether the write mattered, or whether another component could compensate. QK and OV analysis separates routing evidence from write evidence.
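A small sketch makes the point concrete: three hypothetical heads share the same QK weights, and therefore the same attention pattern, but differ in their output matrix. The weights are random and the setup is an assumption for illustration, not a description of any trained model.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
n_pos, d_model, d_k = 4, 8, 2
x = rng.normal(size=(n_pos, d_model))

# Shared QK circuit: all three heads produce this one attention pattern.
W_Q = rng.normal(size=(d_model, d_k))
W_K = rng.normal(size=(d_model, d_k))
pattern = softmax((x @ W_Q) @ (x @ W_K).T / np.sqrt(d_k))

# Different OV circuits: same values, different output matrices.
W_V = rng.normal(size=(d_model, d_k))
W_O_a = rng.normal(size=(d_k, d_model))
W_O_b = -W_O_a                       # writes the opposite direction
W_O_c = np.zeros((d_k, d_model))     # writes nothing at all

write_a = pattern @ x @ W_V @ W_O_a
write_b = pattern @ x @ W_V @ W_O_b
write_c = pattern @ x @ W_V @ W_O_c

# Identical routing, opposite or absent writes.
assert np.allclose(write_b, -write_a)
assert np.allclose(write_c, 0.0)
```

An attention heatmap for these three heads would be pixel-for-pixel identical, yet one pushes the residual stream in a direction, one pushes it the opposite way, and one has no effect.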

3. Heads compose through QK and OV handoffs

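Heads do not act in isolation. A head in one layer can write a feature that becomes part of the key or query for a head in a later layer. In the induction-circuit account, an earlier previous-token head writes into each position information about the token that precedes it. A later induction head matches its query for the current token against those keys, attends to the position just after an earlier occurrence of that token, and copies the token it finds there. One component writes a direction; another component uses that direction for routing.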
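The write-then-route handoff can be shown with a hand-built toy. This is a sketch under strong assumptions: one-hot token embeddings, a residual stream with separate "current token" and "previous token" slots, a hand-set attention pattern for the first head, and a hypothetical `sharpness` gain on the second head's queries. None of these come from a trained model.

```python
import numpy as np

def causal_softmax(scores):
    """Softmax over source positions i <= t for each destination t."""
    n = scores.shape[0]
    visible = np.tril(np.ones((n, n), dtype=bool))
    masked = np.where(visible, scores, -np.inf)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

# Residual stream layout: dims 0-3 = current token, dims 4-7 = previous token.
BOS, A, B, C = range(4)
tokens = [BOS, A, B, C, A]
n_pos, vocab = len(tokens), 4
x0 = np.zeros((n_pos, 2 * vocab))
x0[np.arange(n_pos), tokens] = 1.0

# Head 1 (hand-set pattern): every position reads the one before it, and its
# OV matrix copies the current-token slot into the previous-token slot.
A1 = np.zeros((n_pos, n_pos))
A1[np.arange(n_pos), np.maximum(np.arange(n_pos) - 1, 0)] = 1.0
W_OV1 = np.zeros((2 * vocab, 2 * vocab))
W_OV1[np.arange(vocab), vocab + np.arange(vocab)] = 1.0
x1 = x0 + A1 @ x0 @ W_OV1            # residual stream after head 1's write

# Head 2: queries read the current-token slot, keys read the previous-token
# slot, so it looks for positions whose previous token equals the current one.
sharpness = 10.0                     # hypothetical gain to sharpen the softmax
W_Q2 = np.zeros((2 * vocab, vocab))
W_Q2[np.arange(vocab), np.arange(vocab)] = sharpness
W_K2 = np.zeros((2 * vocab, vocab))
W_K2[vocab + np.arange(vocab), np.arange(vocab)] = 1.0

def head2_pattern(x):
    return causal_softmax((x @ W_Q2) @ (x @ W_K2).T / np.sqrt(vocab))

with_write = head2_pattern(x1)       # head 1's write is present
without_write = head2_pattern(x0)    # head 1's write is removed

last = n_pos - 1
print("attended position with head 1:", int(with_write[last].argmax()))
print("pattern without head 1:", without_write[last])
```

With head 1's write in place, the final `A` attends to position 2, the `B` that followed the earlier `A`. With the write removed, head 2's keys are all zero and its attention from the last position is uniform: the routing existed only because an earlier head wrote the direction that the later head's keys read.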

QK-composition and OV-composition are different handoffs. In QK-composition, an earlier head writes a direction that changes a later head's queries or keys, so the later head attends somewhere different. In OV-composition, an earlier write becomes content that another head, MLP, or the unembedding reads. The framework paper splits the same idea three ways, into Q-, K-, and V-composition, and describes chained OV routes through several heads as virtual attention heads: the effective computation is spread across several ordinary heads.
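The virtual-head picture follows from the combined formula when both attention patterns are held fixed. As a sketch, stack the residual vectors $x_i$ as rows of $X$, write $A^{(h)}$ for the matrix of attention weights $a_{ti}$ of head $h$, and let $W_{OV}^{(h)} := W_V^{(h)} W_O^{(h)}$; these matrix symbols are shorthand assumed here for illustration. The part of head 2's output that comes from reading head 1's write is then:

$$
A^{(2)} \left( A^{(1)} X \, W_{OV}^{(1)} \right) W_{OV}^{(2)} = \left( A^{(2)} A^{(1)} \right) X \left( W_{OV}^{(1)} W_{OV}^{(2)} \right)
$$

The right-hand side has the same form as a single head's update: an attention pattern $A^{(2)} A^{(1)}$ and an OV map $W_{OV}^{(1)} W_{OV}^{(2)}$. That product is the virtual head. The identity covers only the OV route; when head 1's write also changes head 2's queries or keys, $A^{(2)}$ itself depends on head 1 and cannot be treated as fixed.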

Citations

Elhage, Nanda, Olsson, and coauthors (2021), "A Mathematical Framework for Transformer Circuits", for the QK/OV decomposition, composition terms, and virtual attention heads.
Olsson, Elhage, Nanda, and coauthors (2022), "In-context Learning and Induction Heads", for induction heads as a concrete QK/OV copying mechanism.
McDougall, Conmy, Rushing, McGrath, and Nanda (2023), "Copy Suppression", for an attention head whose OV circuit suppresses repeated-token predictions.
Wang, Variengien, Conmy, Shlegeris, and Steinhardt (2023), "Interpretability in the Wild", for causal circuit analysis of name-mover and backup name-mover heads.

Related pages

Attention for the weighted-sum and softmax view of attention.
Residual Stream & Directions for the vector workspace that OV writes into.

What next

Induction Heads (mechanism) for a named head type where QK and OV form a clear copying algorithm.
Attention Head Labels (labels) for where head labels stop short of mechanism claims.