2026-03-07
Benchmarks
2026-03-05
Which Local LLMs Can Actually Use Tools?
I ran a 15-test tool-calling benchmark against every local model on my Ryzen AI Max+ 395. The results were not what I expected.
2026-03-01
I Tested 10 AI Models So You Don't Have To
A weekend spent benchmarking every promising local AI model on consumer hardware. Here's what actually works.
2026-02-26
Bigger Isn't Better: How a 9GB Model Beat 120B Parameters
I benchmarked 17 local LLMs across 13 dimensions with 39 tests. The results destroyed my assumptions about model size.
2026-02-23
GPT-OSS 120B: First Benchmarks on Consumer AMD Hardware
Real benchmarks of OpenAI's open-weight 120B MoE model running on a Ryzen AI Max+ 395 with 128GB unified memory. No cloud, no A100s, just bare metal.