Module saev.nn.objectives
Functions
def get_objective(cfg: Vanilla | Matryoshka) ‑> Objective
def mean_squared_err(x_hat: jaxtyping.Float[Tensor, '*batch d'],
x: jaxtyping.Float[Tensor, '*batch d'],
norm: bool = False) ‑> jaxtyping.Float[Tensor, '*batch d']
def ref_mean_squared_err(x_hat: jaxtyping.Float[Tensor, '*d'],
x: jaxtyping.Float[Tensor, '*d'],
norm: bool = False) ‑> jaxtyping.Float[Tensor, '*d']
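The signatures above do not show the function bodies. As a rough sketch only, an elementwise squared error with an optional normalisation might look like the following. The name `sketch_squared_err` is hypothetical, and the meaning given to `norm` here (dividing by the L2 norm of `x` along the last dimension) is an assumption, not something this page confirms.

```python
import torch


def sketch_squared_err(
    x_hat: torch.Tensor, x: torch.Tensor, norm: bool = False
) -> torch.Tensor:
    """Hypothetical stand-in: elementwise squared error, same shape as the inputs."""
    err = (x_hat - x.float()) ** 2
    if norm:
        # Assumption: scale each row by the L2 norm of the target vector.
        scale = x.float().pow(2).sum(dim=-1, keepdim=True).sqrt()
        err = err / scale.clamp_min(1e-8)
    return err


x = torch.tensor([[3.0, 4.0]])
x_hat = torch.tensor([[2.0, 6.0]])
per_element = sketch_squared_err(x_hat, x)            # shape (1, 2)
normalised = sketch_squared_err(x_hat, x, norm=True)  # same shape, scaled by 1 / ||x||
```

The result keeps the input shape, which matches the `'*batch d'` return annotation; a caller that wants a scalar would reduce it afterwards, for example with `.mean()`.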
Classes
class Loss
The loss term for an autoencoder training batch.
@jaxtyped(typechecker=beartype.beartype)
@dataclasses.dataclass(frozen=True, slots=True)
class Loss:
    """The loss term for an autoencoder training batch."""

    @property
    def loss(self) -> Float[Tensor, ""]:
        """Total loss."""
        raise NotImplementedError()

    def metrics(self) -> dict[str, object]:
        raise NotImplementedError()
Subclasses
- MatryoshkaLoss
- VanillaLoss
Instance variables
prop loss : jaxtyping.Float[Tensor, '']
Total loss.
@property
def loss(self) -> Float[Tensor, ""]:
    """Total loss."""
    raise NotImplementedError()
Methods
def metrics(self) ‑> dict[str, object]
class MatryoshkaLoss
The composite loss terms for a training batch.
@jaxtyped(typechecker=beartype.beartype)
@dataclasses.dataclass(frozen=True, slots=True)
class MatryoshkaLoss(Loss):
    """The composite loss terms for a training batch."""

    @property
    def loss(self) -> Float[Tensor, ""]:
        raise NotImplementedError()
Ancestors
- Loss
Inherited members
class MatryoshkaObjective (cfg: Matryoshka)
Torch module for calculating the matryoshka loss for an SAE.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
@jaxtyped(typechecker=beartype.beartype)
class MatryoshkaObjective(Objective):
    """Torch module for calculating the matryoshka loss for an SAE."""

    def __init__(self, cfg: config.Matryoshka):
        super().__init__()
        self.cfg = cfg

    def forward(self) -> "MatryoshkaLoss":
        # Placeholder: the matryoshka loss is not implemented yet.
        raise NotImplementedError()
Ancestors
- Objective
- torch.nn.modules.module.Module
Inherited members
class Objective (*args, **kwargs)
Base class for all neural network modules.
Your models should also subclass this class.
Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
Submodules assigned in this way will be registered, and will have their parameters converted too when you call to(), etc.
Note
As per the example above, an __init__() call to the parent class must be made before assignment on the child.
training (bool): whether this module is in training or evaluation mode.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
@jaxtyped(typechecker=beartype.beartype)
class Objective(torch.nn.Module):
    def forward(
        self,
        x: Float[Tensor, "batch d_model"],
        f_x: Float[Tensor, "batch d_sae"],
        x_hat: Float[Tensor, "batch d_model"],
    ) -> Loss:
        raise NotImplementedError()
Ancestors
- torch.nn.modules.module.Module
Subclasses
- MatryoshkaObjective
- VanillaObjective
Methods
def forward(self,
x: jaxtyping.Float[Tensor, 'batch d_model'],
f_x: jaxtyping.Float[Tensor, 'batch d_sae'],
x_hat: jaxtyping.Float[Tensor, 'batch d_model']) ‑> Loss
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
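The difference described in the note can be seen with a forward hook. This is a minimal sketch; `Doubler` is a hypothetical toy module used only for illustration and is not part of this package.

```python
import torch


class Doubler(torch.nn.Module):
    """Hypothetical toy module used only to demonstrate hook behaviour."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 2 * x


calls = []
module = Doubler()
# The hook records every output produced through module(...).
handle = module.register_forward_hook(lambda mod, args, out: calls.append(out))

x = torch.ones(3)
module(x)          # goes through __call__, so the hook fires
module.forward(x)  # bypasses __call__, so the hook does not fire
handle.remove()

hook_calls = len(calls)  # only the module(x) call was recorded
```

The same applies to an objective: calling `objective(x, f_x, x_hat)` rather than `objective.forward(x, f_x, x_hat)` keeps any registered hooks active.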
class VanillaLoss (mse: jaxtyping.Float[Tensor, ''],
sparsity: jaxtyping.Float[Tensor, ''],
l0: jaxtyping.Float[Tensor, ''],
l1: jaxtyping.Float[Tensor, ''])
The vanilla loss terms for a training batch.
@jaxtyped(typechecker=beartype.beartype)
@dataclasses.dataclass(frozen=True, slots=True)
class VanillaLoss(Loss):
    """The vanilla loss terms for a training batch."""

    mse: Float[Tensor, ""]
    """Reconstruction loss (mean squared error)."""
    sparsity: Float[Tensor, ""]
    """Sparsity loss, typically lambda * L1."""
    l0: Float[Tensor, ""]
    """L0 magnitude of hidden activations."""
    l1: Float[Tensor, ""]
    """L1 magnitude of hidden activations."""

    @property
    def loss(self) -> Float[Tensor, ""]:
        """Total loss."""
        return self.mse + self.sparsity

    def metrics(self) -> dict[str, object]:
        return {
            "loss": self.loss.item(),
            "mse": self.mse.item(),
            "l0": self.l0.item(),
            "l1": self.l1.item(),
            "sparsity": self.sparsity.item(),
        }
Ancestors
- Loss
Instance variables
var l0 : jaxtyping.Float[Tensor, '']
L0 magnitude of hidden activations.
var l1 : jaxtyping.Float[Tensor, '']
L1 magnitude of hidden activations.
var mse : jaxtyping.Float[Tensor, '']
Reconstruction loss (mean squared error).
var sparsity : jaxtyping.Float[Tensor, '']
Sparsity loss, typically lambda * L1.
Methods
def metrics(self) ‑> dict[str, object]
Inherited members
class VanillaObjective (cfg: Vanilla)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
@jaxtyped(typechecker=beartype.beartype)
class VanillaObjective(Objective):
    def __init__(self, cfg: config.Vanilla):
        super().__init__()
        self.cfg = cfg

    def forward(
        self,
        x: Float[Tensor, "batch d_model"],
        f_x: Float[Tensor, "batch d_sae"],
        x_hat: Float[Tensor, "batch d_model"],
    ) -> VanillaLoss:
        # Some values of x and x_hat can be very large. We can calculate a safe MSE
        mse_loss = mean_squared_err(x_hat, x)
        mse_loss = mse_loss.mean()
        # Mean number of active (positive) latents per example.
        l0 = (f_x > 0).float().sum(axis=1).mean(axis=0)
        # Mean sum of activations per example; equals the L1 norm when f_x is non-negative.
        l1 = f_x.sum(axis=1).mean(axis=0)
        sparsity_loss = self.cfg.sparsity_coeff * l1
        return VanillaLoss(mse_loss, sparsity_loss, l0, l1)
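The arithmetic in `forward` can be traced on a tiny batch. This is a sketch under stated assumptions: the coefficient `0.1` is an arbitrary value standing in for `cfg.sparsity_coeff`, and a plain squared error replaces `mean_squared_err`, whose body is not shown on this page.

```python
import torch

sparsity_coeff = 0.1  # assumed value; the real one comes from cfg.sparsity_coeff

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])               # (batch=2, d_model=2)
x_hat = torch.tensor([[1.0, 0.0], [3.0, 2.0]])           # reconstruction of x
f_x = torch.tensor([[0.0, 2.0, 0.0], [1.0, 0.0, 3.0]])   # (batch=2, d_sae=3), non-negative

# Plain squared error stands in for mean_squared_err(x_hat, x).
mse = ((x_hat - x) ** 2).mean()                  # 2.0
l0 = (f_x > 0).float().sum(dim=1).mean(dim=0)    # 1.5 active latents per example
l1 = f_x.sum(dim=1).mean(dim=0)                  # 3.0
sparsity = sparsity_coeff * l1
total = mse + sparsity
```

Because `l1` is a plain sum rather than a sum of absolute values, it only matches the L1 norm when the activations are non-negative, as they would be after a ReLU.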
Ancestors
- Objective
- torch.nn.modules.module.Module
Inherited members