Module saev.data.config

Classes

class Activations (shard_root: str = './shards',
patches: Literal['cls', 'image', 'all'] = 'patches',
layer: Union[int, Literal['all']] = 'all',
clamp: float = 100000.0,
n_random_samples: int = 524288,
scale_mean: bool | str = True,
scale_norm: bool | str = True)

Configuration for loading activation data from disk.

@beartype.beartype
@dataclasses.dataclass(frozen=True)
class Activations:
    """
    Configuration for loading activation data from disk.
    """

    shard_root: str = os.path.join(".", "shards")
    """Directory with .bin shards and a metadata.json file."""
    patches: typing.Literal["cls", "image", "all"] = "patches"
    """Which kinds of patches to use."""
    layer: int | typing.Literal["all"] = "all"
    """Which ViT layer(s) to read from disk. ``-2`` selects the second-to-last layer. ``"all"`` enumerates every recorded layer, and ``"meanpool"`` averages activations across layers."""
    clamp: float = 1e5
    """Maximum absolute value for activations; activations are clamped to [-clamp, clamp]."""
    n_random_samples: int = 2**19
    """Number of random samples used to calculate approximate dataset means at startup."""
    scale_mean: bool | str = True
    """Whether to subtract approximate dataset means from examples. If a string, manually load from the filepath."""
    scale_norm: bool | str = True
    """Whether to scale average dataset norm to sqrt(d_vit). If a string, manually load from the filepath."""
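Because the dataclass is declared with frozen=True, instances cannot be modified after construction; variants are normally derived with dataclasses.replace. The following is a minimal sketch of that pattern. ActivationsSketch is a hypothetical stand-in that mirrors only a few of the fields above so the example runs without saev installed, and its default for patches is an assumption, not the library's.

```python
import dataclasses
import typing


@dataclasses.dataclass(frozen=True)
class ActivationsSketch:
    """Hypothetical stand-in mirroring a few fields of Activations."""

    shard_root: str = "./shards"
    patches: typing.Literal["cls", "image", "all"] = "image"  # assumed default
    layer: typing.Union[int, typing.Literal["all"]] = "all"
    clamp: float = 1e5


# Build a base config, then derive a variant without mutating the original.
base = ActivationsSketch(shard_root="/data/shards")
second_to_last = dataclasses.replace(base, layer=-2)

# Direct assignment on a frozen dataclass raises FrozenInstanceError.
try:
    base.layer = -2
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False
```

Deriving variants this way keeps one base configuration shared across experiments while each run overrides only the fields it needs.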

Class variables

var clamp : float

Maximum absolute value for activations; activations are clamped to [-clamp, clamp].
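As an illustration of the clamping described above, here is a minimal pure-Python sketch. The helper name clamp_activations is hypothetical; how the library applies the clamp internally (most likely on tensors) is not shown in this page.

```python
def clamp_activations(values: list[float], clamp: float = 1e5) -> list[float]:
    """Clip each activation to the closed interval [-clamp, clamp]."""
    return [max(-clamp, min(clamp, v)) for v in values]


# Values outside the interval are pulled to the nearest bound;
# values inside it pass through unchanged.
clamped = clamp_activations([-2e5, -3.0, 0.5, 7e6])
```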

var layer : Union[int, Literal['all']]

Which ViT layer(s) to read from disk. -2 selects the second-to-last layer. "all" enumerates every recorded layer, and "meanpool" averages activations across layers.

var n_random_samples : int

Number of random samples used to calculate approximate dataset means at startup.

var patches : Literal['cls', 'image', 'all']

Which kinds of patches to use.

var scale_mean : bool | str

Whether to subtract approximate dataset means from examples. If a string, manually load from the filepath.
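To make the interaction between n_random_samples and scale_mean concrete, here is a rough sketch of estimating a per-dimension mean from a random subset and subtracting it. The function names, the seeding, and the sampling without replacement are all assumptions for illustration; the library's actual sampling procedure may differ.

```python
import random


def approximate_mean(examples, n_random_samples, seed=0):
    """Estimate the per-dimension mean from at most n_random_samples examples."""
    rng = random.Random(seed)
    k = min(n_random_samples, len(examples))
    sample = rng.sample(examples, k)
    d = len(sample[0])
    return [sum(x[i] for x in sample) / k for i in range(d)]


def subtract_mean(example, mean):
    """Center one example by subtracting the estimated mean."""
    return [x - m for x, m in zip(example, mean)]


examples = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
# With fewer examples than n_random_samples, every example is used.
mean = approximate_mean(examples, n_random_samples=2**19)
centered = [subtract_mean(x, mean) for x in examples]
```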

var scale_norm : bool | str

Whether to scale average dataset norm to sqrt(d_vit). If a string, manually load from the filepath.
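One way to read "scale average dataset norm to sqrt(d_vit)" is as a single scalar multiplier applied to every example. The sketch below assumes exactly that, and uses the L2 norm; the helper name norm_scale is hypothetical, and whether the library computes the norm before or after mean subtraction is not stated here.

```python
import math


def norm_scale(examples):
    """Return the scalar that makes the average L2 norm equal sqrt(d_vit)."""
    d_vit = len(examples[0])
    norms = [math.sqrt(sum(v * v for v in x)) for x in examples]
    mean_norm = sum(norms) / len(norms)
    return math.sqrt(d_vit) / mean_norm


examples = [[3.0, 4.0], [6.0, 8.0]]  # L2 norms 5 and 10, average 7.5
scale = norm_scale(examples)
scaled = [[v * scale for v in x] for x in examples]

# After scaling, the average norm matches sqrt(d_vit) up to rounding.
new_mean_norm = sum(math.sqrt(sum(v * v for v in x)) for x in scaled) / len(scaled)
```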

var shard_root : str

Directory with .bin shards and a metadata.json file.
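A quick sanity check on a shard directory might look like the sketch below, which only verifies the layout described above: a metadata.json file alongside .bin shards. The helper find_shards and the shard filenames are hypothetical, and the contents of metadata.json are deliberately not interpreted.

```python
import pathlib
import tempfile


def find_shards(shard_root: str) -> list[pathlib.Path]:
    """Return the .bin shards under shard_root, sorted by filename."""
    root = pathlib.Path(shard_root)
    if not (root / "metadata.json").is_file():
        raise FileNotFoundError(f"no metadata.json in {root}")
    return sorted(root.glob("*.bin"))


# Demonstrate on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "metadata.json").write_text("{}")
    for name in ("acts000001.bin", "acts000000.bin"):  # hypothetical names
        (root / name).write_bytes(b"")
    shard_names = [p.name for p in find_shards(tmp)]
```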