Archives
- 30 Apr Variational Autoencoder (VAE)
- 13 Apr How does refusal ablation work?
- 01 Mar Noobs guide to mechanistic interpretability
- 13 Jan Removing the Refusal Direction: How I Turned Param-1 Into an Uncensored Model Without Fine-Tuning
- 08 Jan Executing Toxicity: Mechanistic Localization of Toxic Behavior in a Fine-Tuned Transformer