# saev

saev is a framework for training and evaluating sparse autoencoders (SAEs) for vision transformers (ViTs), implemented in PyTorch.

## Installation

Installation is supported with uv. saev will likely work with pure pip, conda, etc., but I will not formally support it.

Clone this repository, then from the root directory:

```sh
uv run python -m saev --help
```

This will create a virtual environment and display the CLI help.
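
Spelled out end to end, the install looks something like the following; the repository URL here is my assumption, so substitute wherever you actually cloned from:

```sh
# URL is assumed for illustration; use the real repository location.
git clone https://github.com/OSU-NLP-Group/saev.git
cd saev
uv run python -m saev --help  # creates the venv and prints the CLI help
```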

## Quick Start

Save some activations to disk:

```sh
uv run scripts/launch.py shards \
  --shards-root $SCRATCH/saev/shards \
  --family clip \
  --ckpt ViT-B-32/openai \
  --layers 11 \
  --patches-per-ex 49 \
  --batch-size 256 \
  data:cifar10
```

Here, `--patches-per-ex 49` matches ViT-B/32 on 224×224 inputs: each image is split into 32×32 patches, giving (224/32)² = 49 patches per example. Read the guide for details.
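
After the command finishes, the shards directory holds layer-11 activations for every image. As a quick sanity check, you could memory-map the cache and inspect shapes; note that the file name and array layout below are assumptions for illustration, since the actual shard format is documented in the guide:

```python
import numpy as np

# Assumed layout: float32 activations of shape (n_imgs, n_patches, d_vit).
# CIFAR-10 has 50,000 training images; ViT-B/32 yields 49 patches of width 768.
acts = np.memmap(
    "shards/acts.bin",  # hypothetical file name
    dtype=np.float32,
    mode="r",
    shape=(50_000, 49, 768),
)
print(acts.shape, acts[0, 0, :4])  # peek at the first patch vector
```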

## Why saev?

There are plenty of alternative libraries for SAEs:
- Overcomplete, primarily developed by Thomas Fel.
However, saev has some benefits:
- saev is more of a framework than a library. SAEs train a relatively small neural network on an enormous number of activations; while you could compute those activations in a simple inference loop during training, efficient training requires caching them to disk first. This makes using saev a little more like Keras or PyTorch Lightning than Huggingface's Transformers or Datasets libraries (see the sketch after this list).
- saev offers lots of tools for interacting with sparse autoencoders after training, including interactive notebooks and evaluations.
- saev includes complete code from preprints in the `contrib/` directory, along with logbooks describing how the authors used and developed saev.
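
To make the framework point above concrete, here is a minimal sketch of the training pattern saev is organized around; every name in it is hypothetical, and it is not saev's actual API. The ViT runs once to produce a cache, and the small SAE then streams batches from disk instead of re-running the expensive model every epoch:

```python
import torch
import torch.nn as nn

class TinySAE(nn.Module):
    """A minimal sparse autoencoder: encode, rectify, decode."""
    def __init__(self, d_vit: int = 768, d_sae: int = 16_384):
        super().__init__()
        self.enc = nn.Linear(d_vit, d_sae)
        self.dec = nn.Linear(d_sae, d_vit)

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        f = torch.relu(self.enc(x))  # sparse feature activations
        return self.dec(f), f

# Hypothetical cache: an (n_patches_total, d_vit) float32 tensor written to disk
# once by a single pass over the ViT; mmap=True keeps it out of RAM.
acts = torch.load("shards/acts.pt", mmap=True)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(acts), batch_size=4096, shuffle=True
)

sae = TinySAE()
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
for (x,) in loader:
    x_hat, f = sae(x)
    # Reconstruction error plus an L1 penalty that encourages sparse features.
    loss = (x_hat - x).pow(2).mean() + 4e-4 * f.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of the disk cache is that the ViT forward pass dominates the cost; once activations are materialized, the small SAE trains quickly over many passes through the data.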