Olmo Hybrid
7B open model mixing transformers and linear RNNs

Hunter's comment
Hybrid language models, architectures that mix transformer attention with linear recurrent layers, have been gaining momentum across the field, with recent efforts including Samba, Nemotron-H, Qwen3-Next, Kimi Linear, and Qwen 3.5. By combining transformers' ability to recall precise details from earlier in a sequence with recurrent layers' efficiency at tracking evolving state, hybrids promise to be both more capable and cheaper to run at long context lengths.
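
To make the idea concrete, here is a minimal PyTorch sketch of a hybrid block stack. The gated linear recurrence, the 3-recurrent-to-1-attention interleave, and all class names are illustrative assumptions for this post, not Olmo Hybrid's actual layers or configuration; real implementations also replace the Python time loop with a fused parallel-scan kernel.

```python
import torch
import torch.nn as nn

class LinearRecurrentLayer(nn.Module):
    """Toy gated linear recurrence: h_t = a_t * h_{t-1} + b_t * x_t.
    Runs in O(T) time with a fixed-size state, unlike attention's O(T^2)."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, 2 * dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        a, b = torch.sigmoid(self.gate(x)).chunk(2, dim=-1)
        h = torch.zeros_like(x[:, 0])
        outs = []
        for t in range(x.size(1)):  # sequential scan for clarity; kernels parallelize this
            h = a[:, t] * h + b[:, t] * x[:, t]
            outs.append(h)
        return torch.stack(outs, dim=1)

class AttentionLayer(nn.Module):
    """Standard multi-head self-attention, kept for precise long-range recall."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x, need_weights=False)
        return out

class HybridBlockStack(nn.Module):
    """Interleaves recurrent and attention layers (here 3:1, an illustrative ratio)."""
    def __init__(self, dim: int, depth: int, attn_every: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            AttentionLayer(dim) if (i + 1) % attn_every == 0 else LinearRecurrentLayer(dim)
            for i in range(depth)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(dim) for _ in range(depth))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for norm, layer in zip(self.norms, self.layers):
            x = x + layer(norm(x))  # pre-norm residual, as in most modern LMs
        return x

model = HybridBlockStack(dim=64, depth=8)
print(model(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 64])
```

The recurrent layers carry a constant-size state through the sequence, so their cost grows linearly with context length, while the occasional attention layer can still look back at every earlier position. That split is exactly the capability/cost trade-off the hybrid design targets.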
Link
https://allenai.org/blog/olmohybrid?ref=producthunt
