Play with the models from "Building Better Activation Oracles"

Tip: select 3–5 activations for best results. With too few, the oracle gets confused or starts thinking; with too many, it becomes unclear which token you cared about.
The first CoT token is always injected (shown in teal, non-toggleable). This is a quirk of early training: the oracle always saw the first context position as an anchor, which presumably helps with grounding. Any positions you select are added on top of it.
  1. Ask a question. The model generates a chain of thought; we record its internal activations while it thinks.
  2. Select tokens. Click or drag across the chain of thought to pick which positions the oracle gets to peek at. The first sampled position is always on.
  3. Ask the oracle. Type a question about the model's reasoning (e.g. "is it confident?", "what answer is it heading toward?"). Both Adam Original and Ours answer using only the selected activations, with no access to the CoT text.
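
To make the selection step concrete, here is a minimal sketch of how the always-on first position might be merged with the clicked positions before activations are handed to the oracle. It is an illustration only, not the demo's actual code: the names `ANCHOR_POSITION`, `resolve_positions`, and `gather_activations` are hypothetical, and activations are modelled as plain lists of floats, one vector per CoT token.

```python
from typing import Iterable, List, Sequence

# Hypothetical name: index of the first CoT token, which is always included.
ANCHOR_POSITION = 0


def resolve_positions(selected: Iterable[int], num_tokens: int) -> List[int]:
    """Return sorted, de-duplicated positions with the anchor always included."""
    if num_tokens <= 0:
        raise ValueError("the chain of thought must contain at least one token")
    positions = {ANCHOR_POSITION}
    for pos in selected:
        if not 0 <= pos < num_tokens:
            raise IndexError(f"position {pos} is outside the CoT (length {num_tokens})")
        positions.add(pos)
    return sorted(positions)


def gather_activations(
    activations: Sequence[Sequence[float]], selected: Iterable[int]
) -> List[List[float]]:
    """Pick out the per-token activation vectors the oracle is allowed to see."""
    positions = resolve_positions(selected, len(activations))
    return [list(activations[p]) for p in positions]


# Toy data: 6 CoT tokens with 2-dimensional activations (assumed shape for illustration).
toy_activations = [[float(i), float(i) * 10.0] for i in range(6)]
picked = gather_activations(toy_activations, selected=[4, 2, 4])
print(picked)  # [[0.0, 0.0], [2.0, 20.0], [4.0, 40.0]]
```

In this sketch only the gathered vectors are returned; the CoT text never enters the function, mirroring the rule that the oracle answers from activations alone.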
Generate a CoT to populate the selectable activation text.

2. Chain of Thought — select tokens

Click or drag across highlighted tokens.
