Mamba stacks mixer layers, which are the equivalent of attention layers. The main logic of Mamba is held in the MambaMixer class.
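As a rough illustration of what "stacking mixer layers" means, the following minimal sketch chains a few toy blocks, each wrapping a sequence-mixing step in a residual connection. It is not the actual MambaMixer: the names `ToyMixer`, `ToyBlock`, and `run_stack` are hypothetical, the mixer is reduced to a scalar linear recurrence with fixed coefficients, and normalization, gating, convolution, and input-dependent parameters are all omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyMixer:
    """Hypothetical stand-in for a mixer layer (NOT the real MambaMixer).

    Mixes information along the sequence with a scalar linear recurrence:
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    """
    a: float  # how much of the previous state is kept
    b: float  # how strongly the current input enters the state
    c: float  # how the state is read out

    def __call__(self, xs: List[float]) -> List[float]:
        h = 0.0
        ys = []
        for x in xs:
            h = self.a * h + self.b * x  # state update
            ys.append(self.c * h)        # readout
        return ys


@dataclass
class ToyBlock:
    """One layer of the stack: mixer output added back to its input."""
    mixer: ToyMixer

    def __call__(self, xs: List[float]) -> List[float]:
        mixed = self.mixer(xs)
        return [x + m for x, m in zip(xs, mixed)]  # residual connection


def run_stack(blocks: List[ToyBlock], xs: List[float]) -> List[float]:
    """Apply the blocks one after another, as a layer stack would."""
    for block in blocks:
        xs = block(xs)
    return xs


if __name__ == "__main__":
    stack = [ToyBlock(ToyMixer(a=0.5, b=1.0, c=1.0)) for _ in range(2)]
    print(run_stack(stack, [1.0, 0.0, 0.0]))
```

Each output position depends only on the current and earlier inputs, which is the role the mixer plays in place of causal attention. The real layer operates on vectors per position rather than single floats.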
Foundation models, now powering almost all of the interesting applications in deep