State-Space Models

Beyond Mamba SSMs: Parallel Kalman Filters as Scalable Primitives for Language Modelling

We show that Kalman filters can be reparameterized for efficient parallel training and introduce GAUSS, a more expressive yet equally scalable state-space layer that outperforms …

Vaisakh Shaj