George Hotz, who rose to fame as a teenager by jailbreaking the iPhone and later reverse-engineered the PlayStation 3, has issued a blunt warning about the mass rollout of AI coding agents: it could be “one of the most costly mistakes in the field’s history.”
In a new blog post titled "The Eternal Sloptember," Hotz argues that agent-driven coding systems don’t actually “program” in a reliable way. After six months of hands-on experiments—using agents to extend Tinygrad (his open-source deep learning framework) and to reverse-engineer the firmware of a USB–PCIe chip—he says the pattern is consistent: agents accelerate early progress, then hand developers a brittle, messy product that never quite gets finished. “The agent frontloads all the progress,” he writes. “You pull the lever and hope the finishing work gets done. It never quite does.” His indictment is blunt: “Agents cannot program, and it’s taking longer and longer to realize that they can’t.” And worse, he adds, the failures are subtle: “The output is broken, but in a way that’s getting harder and harder to detect. Which is exactly what you’d expect from an increasingly accurate statistical model.”
Why this matters now
Hotz’s post lands amid a sharp industry split. Five days earlier, Andrej Karpathy, one of the most visible AI researchers, announced his move to Anthropic’s pre-training team, calling the next few years “especially formative” for large models. Karpathy and Anthropic’s leadership have publicly embraced agentic workflows: Anthropic CEO Dario Amodei said at Davos that some engineers there already let models generate code and merely review the results. Microsoft, too, bet heavily on agents when it converted GitHub Copilot into a full agentic system in 2025, with CEO Satya Nadella framing the shift as on par with the move to cloud computing.
Hotz is on the opposite side of that debate. He aligns himself with the so-called LeCun/Marcus perspective—Yann LeCun and Gary Marcus being prominent skeptics who view large language models largely as sophisticated pattern-matchers, not true reasoners. Hotz warns that when companies push agents across entire engineering organizations, the effect on average code quality will be negative: high performers will still catch and correct agent errors because they have tight feedback loops, while lower performers—supercharged by agents to deliver many more patches and PRs—won’t. The result, he predicts, is “a golden era for buckets and buckets of slop, and a dark age for gems of quality.”
Anticipating the defense that this is just fear of replacement, Hotz pushes back. He cites automated tools like Google’s AFL (American Fuzzy Lop), which found many bugs without prompting existential angst among programmers, and notes how chess and Go grew in popularity after AI dominance. His real worry is organizational: widespread adoption can mask a steady decline in code quality under the cover of increased velocity. He even speculates the marketing push might be partly a sales tactic: “I almost think this is some kind of psyop to sell agents. Fear of loss is one of the only ways to make big companies move.”
Concrete stakes for crypto engineers
For crypto and blockchain projects, Hotz’s critique should ring especially loud. Smart contracts are unforgiving: subtle bugs in deployed code can mean irreversible loss of funds. If agents become the default way to generate contract code, the risk is not just buggy features but systemic, hard-to-detect vulnerabilities spreading across DeFi, NFT systems, and layer-2 infrastructure, especially if teams rely on agents without rigorous reviews, testing, and formal verification.
What to watch
- Tool adoption vs. process: Agents can speed prototyping, but teams must maintain strict reviews, audits, and formal methods where appropriate.
- Visibility of failures: Expect more subtle, statistical errors that pass quick tests but fail in edge conditions.
- Organizational incentives: Monitor whether companies’ drive for velocity is trumping quality controls.
- Sector-specific risk: In crypto, prioritize audits and on-chain safety checks before trusting agent-produced code.
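To make the "passes quick tests but fails in edge conditions" point concrete, here is a minimal Python sketch. It is not taken from Hotz's post; the function names (`split_evenly_naive`, `split_evenly`, `find_counterexample`) and the payout-splitting scenario are hypothetical, chosen only for illustration. The naive draft passes the obvious test, while a simple randomized check of a conservation invariant (the shares must add up to the total) exposes the flaw.

```python
import random


def split_evenly_naive(total: int, n: int) -> list[int]:
    """Hypothetical draft that looks right and passes the obvious test.

    It silently drops the remainder whenever total is not divisible by n.
    """
    return [total // n] * n


def split_evenly(total: int, n: int) -> list[int]:
    """Corrected version: hand out the remainder one unit at a time."""
    base, extra = divmod(total, n)
    return [base + 1 if i < extra else base for i in range(n)]


def find_counterexample(fn, trials: int = 1000, seed: int = 0):
    """Randomly search for inputs where the shares do not sum to the total.

    Returns a (total, n) pair that violates the invariant, or None if no
    violation was found within the given number of trials.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        total = rng.randint(0, 10_000)
        n = rng.randint(1, 50)
        if sum(fn(total, n)) != total:
            return (total, n)
    return None


if __name__ == "__main__":
    # The "quick test" a hurried reviewer might run: it passes.
    assert split_evenly_naive(100, 4) == [25, 25, 25, 25]

    # The invariant check tells a different story.
    print("naive counterexample:", find_counterexample(split_evenly_naive))
    print("fixed counterexample:", find_counterexample(split_evenly))
```

Checking an invariant over many random inputs, rather than a handful of hand-picked cases, is one inexpensive way to catch this class of error. It complements audits and formal verification for code that handles funds and does not replace them.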
Contextual counterpoints
Not everyone agrees with Hotz. Karpathy, previously skeptical of agents, has publicly changed course after recent model improvements and joined Anthropic on May 19, 2026. Anthropic engineers’ practice of reviewing model output rather than writing every line themselves is the practical argument in favor of agent workflows. Hotz says he tried the same hands-off approach and consistently found himself reverting to manual fixes.
Bottom line
Hotz’s warning is a call for caution rather than Luddism: AI agents are powerful, but their outputs are statistical approximations—not substitutes for careful engineering judgment. For crypto teams, where the cost of defects is uniquely high, the post is a timely reminder that velocity without vigilant review and verification can convert innovation into contagion.