febhd_clustering.Encoder
class febhd_clustering.Encoder(features: int, dim: int = 4000)
Bases: object
The nonlinear encoder class maps data nonlinearly to high dimensional space. To do this, it uses two randomly generated tensors:
- \(B\): the (dim, features) sized random basis hypervectors, drawn from a standard normal distribution.
- \(b\): an additional (dim,) sized base, drawn from a uniform distribution over \([0, 2\pi]\).
The hypervector \(H \in \mathbb{R}^D\) of a sample \(X \in \mathbb{R}^f\), where \(D\) is dim and \(f\) is features, is:
\[H_i = \cos(X \cdot B_i + b_i) \sin(X \cdot B_i)\]
- Parameters
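features (int) – Number of features in each input data point.
dim (int) – Number of dimensions of the high dimensional space. Defaults to 4000.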
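The formula above can be sketched directly in PyTorch. This is a minimal illustration under stated assumptions, not the library's actual implementation: `encode_sketch` is a hypothetical helper, and `B` and `b` are sampled here by hand to match the distributions described above.

```python
import math
import torch

def encode_sketch(x, B, b):
    """Hypothetical helper: apply H = cos(x.B + b) * sin(x.B) to a batch.

    x: (n, features) samples, B: (dim, features) basis, b: (dim,) base.
    """
    proj = x @ B.T  # (n, dim); entry (i, j) is the dot product x_i . B_j
    return torch.cos(proj + b) * torch.sin(proj)

torch.manual_seed(0)
features, dim = 5, 4000

# Assumed sampling, following the description above.
B = torch.randn(dim, features)       # standard normal basis hypervectors
b = torch.rand(dim) * 2 * math.pi    # uniform base (torch.rand samples [0, 1))

x = torch.randn(3, features)         # three illustrative samples
H = encode_sketch(x, B, b)           # one hypervector of length dim per sample
```

Each row of `H` is the hypervector of the matching row of `x`, and every entry lies in [-1, 1] because it is a product of a cosine and a sine.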
__call__(x: torch.Tensor)
Encodes each data point in x to high dimensional space. The encoded representation of the (n?, features) samples in \(x\) is the (n?, dim) matrix \(H\):
\[H_{ij} = \cos(x_i \cdot B_j + b_j) \sin(x_i \cdot B_j)\]
Note
This encoder is very sensitive to data preprocessing. Try normalizing each input to unit norm, or standardizing each feature to have mean 0 and standard deviation 1/sqrt(features) (scaling).
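The two preprocessing options in the note can be sketched as follows. Both `unit_norm` and `scale_features` are hypothetical helpers written for illustration, not part of febhd_clustering, and the `eps` guard against division by zero is an added assumption.

```python
import torch

def unit_norm(x, eps=1e-12):
    """Normalizing: rescale each sample (row) to have Euclidean norm 1."""
    norms = x.norm(dim=1, keepdim=True).clamp(min=eps)
    return x / norms

def scale_features(x, eps=1e-12):
    """Scaling: give each feature (column) mean 0 and std 1/sqrt(features)."""
    features = x.shape[1]
    mean = x.mean(dim=0, keepdim=True)
    std = x.std(dim=0, keepdim=True).clamp(min=eps)
    return (x - mean) / (std * features ** 0.5)

torch.manual_seed(0)
x = torch.randn(100, 8) * 5.0 + 3.0  # illustrative data, far from unit scale

x_norm = unit_norm(x)            # every row now has norm 1
x_scaled = scale_features(x)     # every column has mean 0, std 1/sqrt(8)
```

Either result could then be passed to the encoder in place of the raw `x`. With the scaling option, the statistics would typically be computed on the training data and reused for new samples.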
- Parameters
x (torch.Tensor) – The original data points to encode. Must have size (n?, features).
- Returns
The high dimensional representation of each of the n? data points in x, computed with the equation given above. It has size (n?, dim).
- Return type