Gemma 4 26B: Google Drops a MoE Monster With 4B Active Params
Google DeepMind released Gemma 4 today. The 26B A4B MoE scores 1441 on LMArena while burning just 4B active parameters. Here's what that means for local inference.
Mistral Small 4 is a 119B MoE model with only 6B active parameters. I ran it on AMD Strix Halo hardware and got real numbers.
I ran a 15-test tool-calling benchmark against every local model on my Ryzen AI Max+ 395. The results were not what I expected.
A weekend spent benchmarking every promising local AI model on consumer hardware. Here's what actually works.
I benchmarked 17 local LLMs across 13 dimensions with 39 tests. The results destroyed my assumptions about model size.
Real benchmarks of OpenAI's open-weight 120B MoE model running on a Ryzen AI Max+ 395 with 128GB unified memory. No cloud, no A100s, just bare metal.