S01E03: Harness

Human in the Loop

What turns a model into something useful

Dec 16, 2025

The co-host gets a French accent this week, courtesy of Mistral Large 3—a granular mixture-of-experts model from the European lab that keeps punching above its weight. But the real subject is the harness: the scaffolding that turns a language model into something that can act. Mark and the co-host dig into the “sandwich architecture” of voice agents (speech-to-text → LLM → text-to-speech), why it makes conversations feel like tennis matches, and the “criminally overlooked” practice of evals. A UC Berkeley paper provides the reality check: 68% of deployed agents need human intervention within ten steps, 70% use off-the-shelf models, and 74% depend on human evaluation. The hype says autonomous agents are coming. The data says we’re still building harnesses.