Datapoint Initialization
Datapoint initialization is an SAE weight initialization strategy, independently proposed by Anthropic and Pierre Peigne, for improving SAE training.
Conceptually, we initialize each decoder column to look like a real datapoint, so every latent starts with a patch of input space where it "wins" and gets some gradient. Here's the algorithm:
- Select \(n\) random data points from your training data.
- Compute the mean \(\mu\) and zero-center the data: \(x_0 = x - \mu\).
- Linearly blend each zero-centered datapoint with Kaiming initialization: \(w = p \cdot (x - \mu) + (1 - p) \cdot r\), where \(p \in [0, 1]\) is your blend weight and \(r\) is a randomly sampled Kaiming initialization vector.
- Initialize \(W_\text{enc}\) as a concatenation of \(n\) blended vectors.
- Initialize \(W_\text{dec}\) as \(W_\text{enc}^T\).
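The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation, and it makes several assumptions: the function name `datapoint_init` and its parameters are hypothetical, \(\mu\) is taken as the mean of the whole data array passed in, "Kaiming initialization" is taken to mean He-uniform with bound \(\sqrt{6 / d}\) where \(d\) is the input dimension, and \(W_\text{enc}\) is assumed to have shape `(d_model, n_latents)` with one blended vector per column.

```python
import numpy as np


def datapoint_init(data, n_latents, p=0.8, seed=0):
    """Sketch of datapoint initialization (hypothetical helper).

    data: array of shape (n_samples, d_model).
    Returns (W_enc, W_dec) with shapes (d_model, n_latents) and
    (n_latents, d_model). The shape convention is an assumption.
    """
    rng = np.random.default_rng(seed)
    n_samples, d_model = data.shape

    # Step 1: select n random datapoints (without replacement when possible).
    idx = rng.choice(n_samples, size=n_latents, replace=n_latents > n_samples)

    # Step 2: zero-center using the mean of the data (assumption: the mean
    # is computed over everything in `data`, not only the selected points).
    mu = data.mean(axis=0)
    x0 = data[idx] - mu  # (n_latents, d_model)

    # Step 3: blend with a He-uniform ("Kaiming") random vector.
    # Assumption: fan_in = d_model and ReLU gain, giving bound sqrt(6 / fan_in).
    bound = np.sqrt(6.0 / d_model)
    r = rng.uniform(-bound, bound, size=(n_latents, d_model))
    w = p * x0 + (1.0 - p) * r

    # Step 4: stack the blended vectors as the columns of W_enc.
    W_enc = w.T  # (d_model, n_latents)

    # Step 5: the decoder starts as the transpose of the encoder.
    W_dec = W_enc.T.copy()  # (n_latents, d_model)
    return W_enc, W_dec
```

With `p=1.0` every blended vector is exactly a zero-centered datapoint, and with `p=0.0` the result reduces to plain He-uniform initialization, which makes the two extremes easy to sanity-check.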
Anthropic suggests \(p = 0.8\) for SAEs and \(p = 0.4\) for "weakly causal crosscoders". I interpret this to mean that there is no universally appropriate \(p\).