
saev


saev is a framework for training and evaluating sparse autoencoders (SAEs) for vision transformers (ViTs), implemented in PyTorch.

Installation

Installation is supported with uv. saev will likely work with plain pip, conda, etc., but I do not formally support them.

Clone this repository, then from the root directory:

```sh
uv run python -m saev --help
```

This will create a virtual environment and display the CLI help.

Quick Start

Save some activations to disk:

```sh
uv run scripts/launch.py shards \
  --shards-root /$SCRATCH/saev/shards \
  --family clip \
  --ckpt ViT-B-32/openai \
  --layers 11 \
  --patches-per-ex 49 \
  --batch-size 256 \
  data:cifar10
```
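To make the idea concrete, here is a rough sketch of what "saving activations to disk" amounts to: arrays of per-patch ViT activations written out in shards. The filenames, shapes, and directory layout below are made up for illustration (the shapes mirror `--batch-size 256` and `--patches-per-ex 49`, with a hypothetical model width of 768); see the guide for saev's actual shard format.

```python
# Illustrative sketch only -- NOT saev's real on-disk format.
import pathlib
import tempfile

import numpy as np

root = pathlib.Path(tempfile.mkdtemp())
n_examples, patches_per_ex, d_model = 256, 49, 768  # hypothetical sizes

for i in range(2):  # write two shards of fake "activations"
    acts = np.random.randn(n_examples, patches_per_ex, d_model).astype(np.float32)
    np.save(root / f"shard_{i:03d}.npy", acts)

# Training later streams these shards back instead of re-running the ViT.
loaded = np.load(root / "shard_000.npy")
print(loaded.shape)  # (256, 49, 768)
```

The point of the shard layout is that the expensive ViT forward passes happen once, up front, and training reads cheap sequential arrays afterward.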

Read the guide for details.

Why saev?

There are plenty of alternative libraries for training and evaluating SAEs. However, saev has some benefits:

  1. saev is more of a framework than a library. SAEs are relatively small neural networks, but training them requires enormous numbers of activations; while you can implement this with a simple inference loop, efficient training requires caching activations on disk. This makes using saev a little more like Keras or PyTorch Lightning than Hugging Face's Transformers or Datasets libraries.
  2. saev offers lots of tools for interacting with sparse autoencoders after training, including interactive notebooks and evaluations.
  3. saev includes complete code from preprints in the contrib/ directory, along with logbooks describing how the authors used and developed saev.
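To illustrate point 1, here is a minimal, hypothetical SAE training loop over cached activations. This is not saev's implementation; the dimensions, sparsity coefficient, and the random tensor standing in for activations loaded from shards are all placeholders. It shows why caching matters: the training loop touches only the cached activations, never the ViT.

```python
# Toy SAE sketch -- illustrative only, not saev's actual training code.
import torch
import torch.nn as nn

d_model, d_sae = 64, 256  # placeholder widths
cached = torch.randn(10_000, d_model)  # stands in for activations read from shards

enc = nn.Linear(d_model, d_sae)
dec = nn.Linear(d_sae, d_model)
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

for step in range(100):
    batch = cached[torch.randint(0, len(cached), (512,))]
    z = torch.relu(enc(batch))  # sparse latent code
    recon = dec(z)
    # Reconstruction error plus an L1 penalty that encourages sparsity.
    loss = (recon - batch).pow(2).mean() + 1e-3 * z.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because no ViT forward pass appears in the loop, each training step is just a linear-layer round trip, which is what makes disk caching worth the up-front cost.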