<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Mixture of Experts on The Coders Blog</title><link>https://thecodersblog.com/tag/mixture-of-experts/</link><description>Recent content in Mixture of Experts on The Coders Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 07 May 2026 11:51:42 +0000</lastBuildDate><atom:link href="https://thecodersblog.com/tag/mixture-of-experts/index.xml" rel="self" type="application/rss+xml"/><item><title>ZAYA1-8B: Efficient Large Language Models with MoE</title><link>https://thecodersblog.com/zaya1-8b-a-new-moe-llm-2026/</link><pubDate>Thu, 07 May 2026 11:51:42 +0000</pubDate><guid>https://thecodersblog.com/zaya1-8b-a-new-moe-llm-2026/</guid><description>&lt;p&gt;Forget scaling up parameter counts; the future of LLMs is about &lt;em&gt;intelligence density&lt;/em&gt;, and ZAYA1-8B is the latest, and perhaps most compelling, testament to this shift. Zyphra’s new 8.4 billion total parameter model, with a mere 760 million active parameters per token, doesn&amp;rsquo;t just tread water – it sprints ahead in crucial areas, particularly mathematical and coding reasoning. This isn&amp;rsquo;t just another incremental improvement; it’s a statement piece that challenges the established dogma of &amp;ldquo;bigger is always better.&amp;rdquo;&lt;/p&gt;</description></item></channel></rss>