The Interpolation Trap

Name: The Interpolation Trap
Author: Jeremy Howard, MLST

29 highlights

2026-roadmap-reflection software-design agentic-philosophy-traces agentic-req-ux agentic-concepts agentic-coding agentic-product-philosophy bigideas-concepts

Highlights & Annotations

Jeremy Howard—deep learning pioneer, creator of ULMFiT and fast.ai, Kaggle grandmaster—sits down for a wide-ranging conversation that cuts through the euphoria surrounding AI-powered coding. His argument is uncomfortable and empirically grounded: LLMs are extraordinarily good at coding, the mechanical act of translating specifications into syntax, but they are not good at software engineering, the discipline of designing systems that humans can understand, maintain, and evolve. This distinction, which most people conflate, is the fault line along which careers and companies will fracture. Howard reveals that the psychology of AI coding closely mirrors gambling addiction—complete with illusions of control, intermittent reinforcement, and losses disguised as wins—while the actual productivity data shows only a “tiny uptick” in what people are shipping. The conversation spans the philosophical foundations of machine understanding, the history of transfer learning that Howard helped create, the case for interactive programming environments, and a deeply personal mission to stop people working in ways that make them dumber.

Ref. 6499-A

And yet his central claim is devastating: the gap between what AI coding tools appear to deliver and what they actually deliver is not a minor discrepancy. It is a chasm—one concealed by the same psychological mechanisms that make slot machines addictive. The appearance of productivity, the feeling of velocity, the occasional genuine win—these create a subjective experience of transformation while the objective data tells a different story.

Ref. F4CF-B

“The thing about AI-based coding is that it is like a slot machine in that you have an illusion of control. You get to craft your prompt and your list of MCPs and your skills and whatever, and then in the end you pull the lever.”

Ref. DB9B-C

But Howard goes further than mere skepticism. His argument has a precise shape: LLMs are extraordinarily good at coding (the act of translating a specification into syntax) and fundamentally incapable of software engineering (the discipline of designing systems that can be understood, maintained, extended, and composed into larger wholes). This distinction, which most executives and many engineers collapse into a single category, is the load-bearing wall of the entire analysis. Everything else follows from it.

Ref. 4813-D

The alarm is this: the default behavior of AI coding tools is to make people dumber. Not as a side effect, but as the natural attractor state of a system that removes friction from a process where friction is the mechanism of learning. Organizations that chase AI-powered productivity metrics without understanding this dynamic are, in Howard’s assessment, setting themselves up to be destroyed—not by AI, but by the accumulated technical debt and human capability debt that AI-assisted coding leaves in its wake.

Ref. 858A-E

This is where the story turns from celebration to caution. Howard observes, multiple times daily in his R&D work, a phenomenon that anyone using LLMs for novel tasks will recognize: the model goes from being “incredibly clever to worse than stupid, not understanding the most basic fundamental premises about how the world works.” The transition is sudden and complete. One moment the model is producing brilliant output; the next, it has fallen outside the training data distribution and its responses become nonsensical.

Ref. A982-F

“I see it multiple times every day where the LLM goes from being incredibly clever to worse than stupid—not understanding the most basic fundamental premises about how the world works. And then there is no point having that discussion any further because you have lost it at that point.” This is the voice of someone who works at the edge of the training distribution daily and has calibrated intuitions about where it breaks.

Ref. 75D5-G

With this framework in place, Howard’s analysis of AI coding becomes precise. Coding, the act of translating a specification into working syntax, is in his formulation a style transfer problem. You take a specification of what needs to happen, find the parts of the training data that match related problems, interpolate between them, map the result to the syntax of the target language, and produce code. This is something language models are excellent at, because it operates almost entirely within the training distribution.

Ref. 60EA-H

Fred Brooks wrote “No Silver Bullet” decades ago, responding to an almost identical wave of enthusiasm about fourth-generation languages that would make software engineers obsolete. He predicted a maximum thirty-percent improvement in productivity. Howard invokes Brooks not for historical color but because the essential argument has not changed: the vast majority of work in software engineering is not typing code.

Ref. F9D2-I

Howard is explicit about his own experience: perhaps ninety percent of his code is now typed by a language model. But it has not made him proportionally more productive, “because that was never the slow bit.” The slow bit—the hard bit—is everything else: understanding the domain deeply enough to know what to build, designing abstractions that compose correctly, recognizing when a surface-level similarity conceals a fundamental difference, knowing which pieces to make and how big they should be, testing that pieces behave as intended, and building everything up in incremental steps where each step can be verified.

Ref. 66F5-J
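
The “never the slow bit” point can be made concrete with a back-of-the-envelope calculation in the spirit of Amdahl's law. This is only a sketch: the function name and the numbers are illustrative assumptions, not figures from Howard or Brooks. If typing code is a small share of the total work, then speeding up only that share moves the overall result very little.

```python
def overall_speedup(typing_fraction: float, typing_speedup: float) -> float:
    """Amdahl-style estimate: only the typing share of the work gets faster.

    typing_fraction: share of total effort spent typing code (assumed, 0..1).
    typing_speedup:  how many times faster the typing becomes (assumed, > 0).
    """
    if not 0.0 <= typing_fraction <= 1.0:
        raise ValueError("typing_fraction must be between 0 and 1")
    if typing_speedup <= 0.0:
        raise ValueError("typing_speedup must be positive")
    # Time left after the speedup, as a fraction of the original time.
    remaining_time = (1.0 - typing_fraction) + typing_fraction / typing_speedup
    return 1.0 / remaining_time


# Hypothetical inputs: typing is 20% of the job and becomes 10x faster.
print(round(overall_speedup(0.20, 10.0), 2))  # prints 1.22
```

Under these assumed numbers, even an infinitely fast typist could not push the overall speedup past 1.25x, because the other 80 percent of the work is untouched.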

This distinction has teeth. When Howard tried to get an LLM to design a solution to something that had not been designed many times before, the result was “horrible—because what it actually every time gives me is the design of something that looks on its surface a bit similar.” This is precisely the training-distribution boundary manifesting in practice. The model finds the nearest interpolation point in its training data and presents it as a solution. For novel design problems, the nearest interpolation point is often a disaster disguised as relevance.

Ref. 2BEA-K

Howard strengthens this claim with a specific observation: every major piece of AI-generated software that has been examined closely—the Cursor browser, the Anthropic compiler—turns out to be an obvious copy of things that already exist. When Chris Lattner examined the compiler source code, he found it replicated his own idiosyncratic decisions. When you look at the AI-generated browser, you find the same patterns. These are not examples of software engineering; they are examples of very sophisticated coding. The distinction is not academic. It determines whether the output can survive contact with the real world—whether it can be maintained when requirements change, debugged when edge cases appear, and evolved when the domain shifts.

Ref. 6CF8-L

The software engineering skills that matter—knowing how to recognize what the right pieces are, how to design them, how to compose them into larger wholes—normally require decades of experience. Howard reckons he got “pretty good at it after maybe twenty years.” This is the skill that is now more important than it has ever been, and it is precisely the skill that AI coding tools do nothing to develop.

Ref. D414-M

“They are really bad at software engineering. And I think that is possibly always going to be true. There is no current empirical data to suggest that LLMs are gaining any competency at software engineering.”

Ref. C8E2-N

The IPyKernel case reveals a new category of technical debt: comprehension debt. Unlike traditional technical debt, which accumulates through known shortcuts, comprehension debt accumulates invisibly. The code works (tests pass, features ship), but no human possesses the mental model required to maintain, debug, or evolve it. This is not a temporary state that will resolve with better AI. It is the natural consequence of outsourcing the cognitive work of understanding to a system that does not itself understand.

Ref. F192-O

César Hidalgo’s framework: knowledge cannot be exchanged like a commodity. It must be built through experience with friction—mistakes, corrections, the feeling of reality pushing back. When AI removes this friction, it does not transfer the knowledge; it prevents the knowledge from forming. The organization appears productive (outputs are being produced) while the actual knowledge substrate is degrading. This is the enfeeblement thesis in its sharpest form.

Ref. F00A-P

Illusion of control. You craft your prompt. You curate your MCPs. You tune your skills. You set up your context. It feels like you are making decisions that influence the outcome. But in the end, you pull the lever—you submit the prompt—and something stochastic comes back. The feeling of control is real; the actual control is not.

Ref. 27A7-Q

Second, Anthropic’s own study on learning with Claude Code, which Howard says “contradicted Dario completely.” The study found that there was a small group of users who asked conceptual questions, maintained engagement with the underlying problem, and showed a gradient of learning. But most people did not. Most people entered an “autopilot mode” where the AI removed so much friction from the process that no learning occurred. The default attractor state—the natural tendency of the system—was enfeeblement.

Ref. F7BA-R

The interviewer identifies a pattern worth naming: the pernicious erosion of control. It begins innocuously, with ten percent AI-generated code. Six months later, a PR comes in and sixty percent is AI-generated. You slowly become disconnected from your own codebase. The erosion is pernicious precisely because each incremental step feels manageable. No single PR crosses a threshold. But the cumulative effect is that the team loses the ability to understand, maintain, and evolve its own product.

Ref. 216A-S

The intercept is where you are now—your current productivity, your current capability, your current output rate. The slope is how fast you are growing. AI coding tools, in Howard’s analysis, optimize aggressively for the intercept. They make you produce more right now, at the cost of the learning that would make you produce better in the future. They increase output while flattening the slope.

Ref. AAE6-T
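
The intercept-versus-slope framing can be read as a simple linear model, output = intercept + slope × time. The sketch below assumes that linear form and uses made-up numbers purely for illustration; nothing in it is measured data. It shows how a team with a lower starting point but a positive slope eventually overtakes a team that starts higher and stays flat.

```python
from typing import Optional, Tuple

Team = Tuple[float, float]  # (intercept, slope per month), both hypothetical


def output_at(intercept: float, slope: float, months: float) -> float:
    """Linear model of output: where you start plus how fast you grow."""
    return intercept + slope * months


def crossover_month(fast_start: Team, steady_learner: Team) -> Optional[float]:
    """Month at which the learner's output catches the fast starter's.

    Returns None when the learner's slope is not higher, because then
    the lines never cross in the learner's favor.
    """
    (b_fast, m_fast), (b_learn, m_learn) = fast_start, steady_learner
    if m_learn <= m_fast:
        return None
    # Solve b_fast + m_fast * t = b_learn + m_learn * t for t.
    return max(0.0, (b_fast - b_learn) / (m_learn - m_fast))


# Hypothetical teams: one optimizes the intercept, one optimizes the slope.
autopilot_team = (10.0, 0.0)   # high output today, no growth
learning_team = (6.0, 0.5)     # lower output today, steady growth
print(crossover_month(autopilot_team, learning_team))  # prints 8.0
```

After the crossover point the gap keeps widening, which is the sense in which optimizing only the intercept trades future capability for present output.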

“Almost the only thing I care about is how much your personal human capabilities are growing. I do not actually care how many PRs you are doing. If you focus on just driving out results at the limit of whatever AI can do right now, you are only caring about the intercept. It is basically a path to obsolescence—for the company and the people who are in it.”

Ref. 370E-U

This is a radical position for a CEO to take. Howard tells his staff that he does not care about their PR count or feature velocity. He cares about whether their personal capabilities are growing. The reasoning is straightforward: if the team’s slope is high—if people are developing deeper understanding, better design intuitions, more robust engineering judgment—then the company will outperform competitors regardless of short-term output differences. If the slope is zero or negative—if people are becoming dependent on AI tools without developing underlying capabilities—then increasing output today is borrowing against

Ref. 262B-V

The slope principle connects to a body of research on learning that Howard knows well. Desirable difficulty, a concept from educational psychology, holds that learning happens most effectively when it is hard. Not impossible, but effortful. The classic example is spaced repetition: the algorithm schedules flashcards at the moment just before you would forget, ensuring that every review session requires significant cognitive effort.

Ref. 022C-W
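
The spaced-repetition idea above can be sketched in a few lines. This is a deliberately simplified model, not the scheduler of any particular flashcard tool: it assumes an exponential forgetting curve, a hypothetical "stability" value per card, and an arbitrary growth factor after each successful recall. The point it illustrates is scheduling the next review for the moment predicted recall drops to a chosen threshold.

```python
import math


def retention(days_since_review: float, stability_days: float) -> float:
    """Assumed forgetting curve: recall probability decays exponentially."""
    return math.exp(-days_since_review / stability_days)


def next_review_in_days(stability_days: float, target_retention: float = 0.9) -> float:
    """Days to wait until predicted recall falls to target_retention."""
    if not 0.0 < target_retention < 1.0:
        raise ValueError("target_retention must be strictly between 0 and 1")
    # Invert retention(t) = target for t.
    return -stability_days * math.log(target_retention)


def update_stability(stability_days: float, recalled: bool, growth: float = 2.0) -> float:
    """Hypothetical update rule: a successful recall makes the memory
    more stable, while a lapse shrinks stability (never below one day)."""
    if recalled:
        return stability_days * growth
    return max(1.0, stability_days / growth)


# Intervals lengthen as long as each effortful review succeeds.
stability = 1.0
for review_number in range(1, 5):
    wait = next_review_in_days(stability)
    print(f"review {review_number}: wait {wait:.2f} days")
    stability = update_stability(stability, recalled=True)
```

A lower target_retention pushes each review closer to the point of forgetting, which makes every session harder and, on the desirable-difficulty view, more valuable.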

And here is where it gets genuinely interesting, and genuinely troubling. Howard now has a critical piece of his company’s infrastructure that nobody understands. Not the AI that wrote it. Not Howard himself, who guided the process but did not develop deep comprehension of the five-thousand-line, multi-threaded, event-driven codebase. Nobody.

Ref. 51A5-X

Cosplaying Intelligence

Ref. AD9F-Y

Creativity and Its Limits

Ref. 8A7F-Z

Intermittent reinforcement. Sometimes you get cherry-cherry-cherry. A feature appears. It works. The dopamine hits. Next time, you adjust your prompt slightly, pull again. Pull again. The occasional win keeps you at the machine. This is the same variable-ratio reinforcement schedule that makes slot machines the most addictive form of gambling.

Ref. 37AE-B
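
A variable-ratio schedule is easy to simulate, and seeing the numbers helps explain why it is so sticky. The sketch below is an illustrative toy, not a model of any real tool: it assumes each "pull" (each prompt submission) wins independently with a fixed, made-up probability, which is one common way to approximate a variable-ratio schedule. It records how many pulls separate one win from the next.

```python
import random
from typing import List


def pulls_between_wins(win_probability: float, total_pulls: int, seed: int = 0) -> List[int]:
    """Simulate pulls that each win with the same probability and
    return the number of pulls it took to reach each successive win."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    gaps: List[int] = []
    since_last_win = 0
    for _ in range(total_pulls):
        since_last_win += 1
        if rng.random() < win_probability:
            gaps.append(since_last_win)
            since_last_win = 0
    return gaps


# Hypothetical: one prompt in five "just works".
gaps = pulls_between_wins(win_probability=0.2, total_pulls=200, seed=42)
if gaps:
    print("wins:", len(gaps))
    print("shortest wait:", min(gaps), "longest wait:", max(gaps))
```

The waits between wins vary unpredictably around their average, so the next pull always might be the one that pays off, and that unpredictability is what the slot-machine comparison rests on.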

The Self-Driving Car Analogy

Ref. EE0C-C