Patch Explorer:
Interpreting Diffusion Models through Interaction

Imke Grabe1,2, Jaden Fiotto-Kaufman2, Rohit Gandikota2, David Bau2
1IT University of Copenhagen, 2Northeastern University

Links: MIV @ CVPR Workshop Paper · Source Code (Github) · MIV @ CVPR Workshop Poster · CVPR Art Gallery · Demo

Does Interaction via Diffusion Models' Internals support Interpretability?

Patch Explorer is an interactive interface for visualizing and manipulating patches as they are processed by the cross-attention heads of a diffusion model. Built on interventions via NNsight, the interface lets users inspect and manipulate individual attention heads across layers and timesteps. Interaction through the interface reveals that attention heads independently capture semantics in diffusion models, like a unicorn's horn. Beyond offering a way to analyze model behavior, Patch Explorer also lets users intervene to edit semantic associations within diffusion models, like adding a unicorn horn to a horse. Precise interventions further help clarify the role of individual diffusion timesteps. By providing a visualization tool with interactivity grounded in attention heads, we aim to shed light on their role in the generative process.



How do Diffusion Models Encode Semantic Concepts?

Latent diffusion models operate in a compressed latent space rather than directly in pixel space. The latent space is organized into patches, which are spatial units that correspond to regions in the output image. At every layer of the model's U-Net, here laid out horizontally, multiple attention heads, ordered vertically, perform the attention mechanism in parallel, and this repeats over several timesteps. Patch Explorer lets users intervene in the generation process by applying interventions to the patches as they are processed.
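To make the patch-grid idea concrete, here is a minimal sketch in plain Python (with hypothetical grid dimensions, not the model's actual sizes) of how a flattened patch index maps back to a cell in the spatial grid:

```python
# Hypothetical latent grid size for illustration only.
H, W = 64, 64

# Attention layers flatten the spatial latent into a sequence of H*W patches;
# each sequence index maps back to a (row, col) cell in the patch grid.
def patch_to_cell(idx, width=W):
    return divmod(idx, width)  # (row, col)

def cell_to_patch(row, col, width=W):
    return row * width + col

# Round trip: a grid cell and its flattened index identify the same patch.
assert patch_to_cell(cell_to_patch(10, 3)) == (10, 3)
```

This index-to-cell correspondence is what lets a patch grid in the interface target specific spatial regions of the output image.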

Interface components of Patch Explorer: After (1) providing a text prompt and a seed to generate (2) an image, users can inspect the (3) visualized patch grids of the individual attention heads. Magenta represents positive and cyan represents negative activation. A slider (4) lets users inspect timesteps for a selected range. For the selected timestep range, users can choose (5) an intervention to apply to selected patch grids.

Direct Manipulation of Cross-Attention Heads

At the core of diffusion models is the attention mechanism, which enables content-based interactions between different spatial locations. In the cross-attention layers of diffusion models, the K and V matrices are derived from text encodings, while Q comes from the image representation. We propose to target the input and output of cross-attention heads through direct manipulation.
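As an illustration of this split, the NumPy sketch below computes a single cross-attention head with toy, made-up dimensions: Q has one row per image patch, while K and V have one row per text token. This is a generic sketch of the mechanism, not the model's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_patches, n_tokens, d_head = 16, 8, 4  # toy sizes for illustration

# Q comes from the image representation; K and V come from the text encoding.
Q = rng.standard_normal((n_patches, d_head))
K = rng.standard_normal((n_tokens, d_head))
V = rng.standard_normal((n_tokens, d_head))

def cross_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])       # (n_patches, n_tokens)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # one output row per patch

out = cross_attention(Q, K, V)  # this per-patch output is what the patch grid visualizes
```

Because the head's output has one row per latent patch, it can be rendered directly as a spatial grid, which is the representation Patch Explorer exposes for interaction.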

Direct Manipulation of the cross-attention mechanism: A (1) Patch Grid offers a representation to spatially target the outputs of attention heads with interventions. The intervention (2) Scaling multiplies the output of the attention head by a given scalar, while (3) Encoding replaces the output for targeted patches with the output for an alternative text encoding provided by the user.
In that way, patch grids become a new interaction modality, letting users interfere with the internal states of the model:
Interacting with patch grids: The user chooses a patch grid by clicking on it, after which holding the shift key lets them ''draw'' by moving the mouse over patches to select them, marking them green. Clicking again allows users to quickly select many attention heads.
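The two interventions described above can be sketched as simple array operations on a head's per-patch output; the sizes, values, and selected patches below are illustrative, not the actual implementation:

```python
import numpy as np

# Toy head output: one d_head-dim vector per latent patch (hypothetical sizes).
n_patches, d_head = 16, 4
head_out = np.ones((n_patches, d_head))
alt_out = np.full((n_patches, d_head), 5.0)  # head output under an alternative prompt

selected = np.zeros(n_patches, dtype=bool)
selected[[2, 3, 6]] = True  # patches the user ''drew'' on the grid

def scale(head_out, selected, factor):
    """Scaling: multiply the head's output at selected patches by a scalar."""
    out = head_out.copy()
    out[selected] *= factor  # a factor of 0 ablates the head's contribution
    return out

def encode(head_out, alt_out, selected):
    """Encoding: replace selected patches with the alternative prompt's output."""
    out = head_out.copy()
    out[selected] = alt_out[selected]
    return out
```

Both interventions leave unselected patches untouched, which is what makes them spatially precise.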

Can we find Specific Visual Concepts through Interaction?

The interface lets users explore the role of cross-attention heads in the generation process. For example, we find that two attention heads are responsible for generating the horn on the head of a unicorn.

Inspecting attention heads: By adjusting the timestep slider, the user can inspect how the horn feature evolves over time at Layer 9, Heads 3 and 4.
By interacting with these attention heads, images can be altered, e.g. through scaling.
Applying interventions with Patch Explorer: In the (1) dropdown menu, the user selects an intervention, like Scaling, and applies it to the selected (2) timestep range by (3) selecting the desired patches.
Scaling the attention heads' output to 0 ablates their effect on the residual stream, making the unicorn's horn disappear. Increasing their effect, on the other hand, amplifies the feature, and not only for unicorn horns:
Scaling the two attention heads amplifies or removes the horn not only for unicorns, but also for other horned animals, confirming these heads' general role in generating horns.

The attention heads can be used to transfer the visual feature to other horse-like concepts. For example, for the prompt ''Pegasus'', a unicorn horn can be added by encoding ''unicorn'' into relevant patches. Additionally, we find that the Pegasus turns into a regular horse when scaling down the influence of patches at Layer 8, Head 7, which seems to be responsible for generating its wing.

Restricting interventions to specific timestep ranges shows how features are formed throughout the generation process, like the unicorn horn on the horse's head, or the Pegasus' wings.
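Such timestep gating can be sketched in a few lines; the denoising loop and step count below are hypothetical, standing in for wherever the hook is actually called during generation:

```python
# Sketch: apply an intervention only within a user-selected timestep range,
# as with Patch Explorer's timestep slider (hypothetical hook, not the real API).
def maybe_intervene(head_out, t, t_range, intervention):
    t_lo, t_hi = t_range
    if t_lo <= t <= t_hi:
        return intervention(head_out)
    return head_out

ablate = lambda x: 0.0 * x  # e.g. scale the head's contribution to zero

# Ablate the head only during the first 10 of, say, 50 denoising steps.
assert maybe_intervene(1.0, 5, (0, 9), ablate) == 0.0   # inside the range
assert maybe_intervene(1.0, 20, (0, 9), ablate) == 1.0  # outside: untouched
```

Sweeping `t_hi` upward, as in the examples above, reveals at which steps a feature like the horn or the wings is actually formed.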

Evolution of horn over timesteps: By encoding the prompt ''unicorn'' at the relevant attention heads for a growing number of timesteps, we can observe how a horn is added to a horse.
Evolution of wings over timesteps: By gradually increasing the contribution of the attention head that causes the Pegasus' wings, we can inspect how they are formed over timesteps.

For a detailed usage scenario with more examples, take a look at our paper linked above.

How to cite

The paper can be cited as follows.

bibliography

Imke Grabe, Jaden Fiotto-Kaufman, Rohit Gandikota, David Bau. "Patch Explorer: Interpreting Diffusion Models through Interaction." Mechanistic Interpretability for Vision at CVPR 2025 (Non-proceedings Track).

bibtex

@inproceedings{grabe2025patch,
  title={Patch Explorer: Interpreting Diffusion Models through Interaction},
  author={Imke Grabe and Jaden Fiotto-Kaufman and Rohit Gandikota and David Bau},
  booktitle={Mechanistic Interpretability for Vision at CVPR 2025 (Non-proceedings Track)},
  year={2025},
  url={https://openreview.net/forum?id=0n9wqVyHas}
}