TUNDRA // NEXUS
LOC: SRV1304246 | Mission Control
We Are Changing Our Developer Productivity Experiment Design
🟢 READ | ⏱ 8 min | 💡 9/10 | 🎯 Researchers, engineering leaders, anyone citing AI productivity data
TL;DR
METR's follow-up to its controversial 2025 study (which found AI caused a 19% slowdown) is now itself compromised: developers refuse to participate if they can't use AI, selectively submit only AI-unfriendly tasks to the AI-disallowed condition, and some quit mid-study when assigned to no-AI tasks. The new data hints at speedup (−18% time estimate for the original cohort, i.e., faster with AI), but METR explicitly calls this a lower bound and warns the methodology is broken. The self-selection IS the finding.
Signal
- 30–50% of developers told METR they were actively choosing NOT to submit tasks they didn't want to do without AI, a systematic selection bias that invalidates the controlled comparison (see the sketch after this list)
- One developer completed zero tasks assigned to the AI-disallowed condition: a researcher's nightmare, but a cultural reality check
- Developer quotes are gold: "It's like trying to get across the city walking when all of a sudden I was more used to taking an Uber." That's dependency, not preference
- Pay reduction ($150/hr → $50/hr) added confounding selection effects on top of AI dependency
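To see why this breaks the comparison even when randomization itself stays clean, here is a minimal simulation of the selection mechanism (all numbers are hypothetical illustrations, not METR's data): if developers only submit tasks they would tolerate doing without AI, the submitted pool underrepresents AI-friendly work, and the measured speedup understates the true average effect, which is the sense in which METR frames its estimate as a lower bound.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100_000                                              # candidate tasks (hypothetical)
base_hours = rng.lognormal(mean=2.0, sigma=0.5, size=N)  # task time without AI
ai_savings = rng.uniform(0.0, 0.5, size=N)               # fraction of time AI saves per task

ai_arm = rng.random(N) < 0.5                             # clean random assignment to conditions

time_with_ai = base_hours * (1 - ai_savings)
time_without = base_hours

def measured_effect(submitted):
    """Percent time change (AI arm vs. no-AI arm), over submitted tasks only."""
    t_ai = time_with_ai[ai_arm & submitted].mean()
    t_no = time_without[~ai_arm & submitted].mean()
    return (t_ai / t_no - 1) * 100

everything = np.ones(N, dtype=bool)
print(f"full compliance:      {measured_effect(everything):+.1f}%")  # ~ -25%

# Selection: developers withhold tasks they would hate doing without AI,
# so only low-AI-benefit tasks ever enter the study at all.
tolerable_without_ai = ai_savings < 0.15
print(f"selective submission: {measured_effect(tolerable_without_ai):+.1f}%")  # ~ -7.5%
```

The point: the estimate remains internally consistent for the tasks actually submitted; what breaks is the generalization to all work, so the headline number reads as a floor on the real speedup rather than a point estimate.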
What They're NOT Telling You
METR deserves credit for publishing this honestly rather than burying the methodological failure. The broader implication is underemphasized in the post: we may never be able to run valid controlled experiments on AI productivity, because the "control" condition (no AI) is now too alien. This isn't just a study design problem; it's a signal that AI integration has crossed an irreversibility threshold.
Trust Check
Factuality ✓ (primary source, METR's own data) | Author Authority ✓ (METR, leading AI safety/evaluation org) | Actionability ✓ (stop citing the 2025 slowdown paper as current evidence)