Anthropic Interpretability Team

circuits, sparse autoencoder, monosemanticity reports

Anthropic's mechanistic interpretability research group, known for the circuits thread, monosemanticity findings, and sparse autoencoder papers on neural network internals.

@AnthropicAI, Wikipedia
#37 in AI Researchers Who Communicate
Authority score: 55/100 (Established)
1 list, #37 peak, 1 primary, 1 category
Appears alongside

Top neighbors of Anthropic Interpretability Team

8 people
Last updated: 10 May 2026