Play with the models from "Building Better Activation Oracles"

Ask a question. The model generates a chain of thought; we record its internal activations while it thinks.
Select tokens. Click or drag across the chain of thought to pick which positions the oracle gets to peek at. The first sampled position is always selected.
Ask the oracle. Type a question about the model's reasoning (e.g. "is it confident?", "what answer is it heading toward?"). Both Adam Original and Ours answer using only the selected activations; neither sees the CoT text.
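
The selection step can be pictured with a small sketch. This is not the demo's actual code: the function names `select_positions` and `gather_activations`, the nested-list activation layout, and the choice to treat index 0 as the "first sampled position" are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def select_positions(num_tokens: int, spans: Sequence[Tuple[int, int]]) -> List[int]:
    """Return the sorted token positions covered by inclusive (start, end) spans.

    Assumption: position 0 stands in for the first sampled position,
    which is always included regardless of what the user selects.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    chosen = {0}  # the always-selected position
    for start, end in spans:
        lo, hi = min(start, end), max(start, end)     # a drag may run right to left
        lo, hi = max(lo, 0), min(hi, num_tokens - 1)  # clamp to the valid token range
        chosen.update(range(lo, hi + 1))
    return sorted(chosen)


def gather_activations(
    activations: Sequence[Sequence[float]], positions: Sequence[int]
) -> List[List[float]]:
    """Keep only the activation vectors at the chosen positions.

    The oracle would receive these vectors and nothing else (no CoT text).
    """
    return [list(activations[p]) for p in positions]


# Toy stand-in for recorded activations: 6 tokens, 2 dimensions each.
activations = [[float(t), float(t) + 0.5] for t in range(6)]

# The user drags from token 4 back to token 2.
positions = select_positions(len(activations), [(4, 2)])
oracle_input = gather_activations(activations, positions)
```

Here `positions` holds tokens 2 through 4 plus the always-selected position 0, and `oracle_input` contains only those four vectors. Using a set for the chosen positions keeps overlapping drags from producing duplicates.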

