
Claude Opus 4.6 is my daily driver for building software. Not because it wins every benchmark. Because the tooling around it is years ahead of everything else.

GPT-5.4 Pro sits in a class by itself for complex academic work. And Gemini 3.1 Deep Think posts impressive benchmark numbers. But when I’m actually building, the model is only part of the stack. The harness around it matters just as much.

Claude Code gives me skills that automate multi-step workflows, MCP servers that connect directly to Figma, ClickUp, and my database, hooks that fire on events, and subagent orchestration that lets me parallelize research across the codebase while I keep building. I’m not copy-pasting between tools. The agent reads context, makes changes, and runs verification in one loop.

GPT-5.4 is genuinely strong. Gemini’s raw reasoning keeps improving. But Google keeps proving that a powerful model inside a weak harness just hallucinates with more confidence. Tool-use reliability, reference handling, and staying inside the scaffolding still trip Gemini up in production workflows.

The model race is real, but the orchestration layer is what’s keeping me shipping. Right now Claude’s ecosystem is the most complete developer toolchain by a wide margin. That could change. Curious what others are using for daily production work and whether the tooling or the model drives your choice.
