The Bottleneck Moved. Your Operating Model Didn't.

Software engineering as an industry was built around one scarcity: developer throughput. Most companies are still optimizing around it while AI is dissolving it.

May 28, 2026

Man presenting charts on a large screen. — Photo by Vitaly Gariev on Unsplash

The Slide That Said Nothing

A few weeks ago a fellow engineering manager showed me a slide from a leadership review in their company. The slide was about engineering productivity, year-over-year. On the surface it looked great:

Pull request count up.
AI tooling adoption up.
Lines of code per engineer up.

The slide was meant to be reassuring. The implicit story was: we invested in AI, the metrics moved, we are winning.

Looking deeper though, what the slide actually showed was a company running its old operating model harder, with new tools strapped on. Same sprints. Same story points. Same RFC review cycles. Same architecture review boards. Same quarterly planning. They swapped petrol for kerosene and were proud the engine was burning more of it.

That same conversation is repeating across my network right now, in different forms. Founders, CTOs, VPs of Engineering — all of them have rolled out AI tooling, all of them are tracking adoption rates, none of them have stopped to ask the more uncomfortable question: what is this operating model actually optimizing for, and is that thing still the bottleneck?

The honest answer is no. And almost no one is acting like it.

The Constraint That Built the Industry

Software engineering as an industry was built around one scarcity: developer throughput. Code was expensive to produce. Running systems at scale required meticulousness on the engineers’ part. Hiring was a multi-month exercise.

Headcount was strategy (2020 hiring frenzy, anyone?), because companies were valued partly on how many engineers they had. The logic went: engineers produced code, code became features and intellectual property, and intellectual property was one of the main factors driving valuations up. “We have 200 engineers” was a moat.

Now look at the rituals we built to manage that constraint:

Scrum
Story points
Sprint planning
Standups
Velocity charts
RFCs
Multi-stage code reviews
Architecture review boards
Quarterly roadmaps with X pre-committed initiatives
Six-week design phases for two-week implementation phases

Every artifact, every ritual, every approval gate exists for one reason: to manage the scarcity of expensive, slow human IP production. You did upstream design because mistakes were expensive. You did sprint planning because you needed to allocate scarce capacity carefully. You did multi-stage code review because every line of code was a long-term liability you couldn’t easily rewrite. You committed quarterly because once you committed an engineering team for a quarter, you couldn’t easily redirect them.

The factory line analogy is apt. We built a production line where the workers were expensive and the work was hard, and we optimized accordingly. Conservative roadmaps. Heavy upstream planning. Few experiments. Each bet had to be high-confidence because each bet was costly.

This was a rational modus operandi for the constraint. But now, the constraint is disappearing.

a group of people working in a factory — Photo by TruckRun on Unsplash

What’s Actually Scarce Now

AI and LLMs have collapsed the cost and time of producing code. Yes, token budgets are currently subsidized. Yes, the unit economics will normalize. But the structural direction is unmistakable.

The data is no longer ambiguous. A quarter of YC’s W25 batch had 95% of their code written by AI, per YC’s own partners (LeadDev, March 2025). 41% of all code is now AI-generated, totalling 256 billion lines in 2024 alone (InfoWorld, May 2025). Cursor scaled from $1M to $100M ARR in 12 months with a tiny team (Sacra); Midjourney hit $200M ARR with fewer than 50 employees (CB Insights). Shopify now requires every team to demonstrate why the work cannot be done with AI before approving additional headcount — a working-principle change in the operating model itself, per CEO Tobi Lütke’s public memo on X (April 2025). Gartner projects that by 2030, 80% of organizations will evolve large engineering teams into more nimble AI-augmented teams (Gartner, October 2025).

You can argue with the magnitude, but not with the direction.

The pillar that the entire industry rested on — that code production is the bottleneck — has shifted. What’s scarce now is different in kind:

Of these, judgment and conviction are the most differentiating, and also the hardest to develop. You can’t bootcamp your way to judgment or pair-program your way to conviction. They come from years of being close to users, watching bets succeed or fail, and building the pattern recognition that lets you say “no, not that, this” with enough authority that the team listens.

Engineering excellence is technical skill applied with deep product understanding. AI didn’t invent that thesis. AI just made it the only thing that matters.

a woman showing a tablet to another woman — Photo by Cova Software on Unsplash

The Bolt-On Trap

I’ll be direct: I’ve sat inside this trap. I’ve seen my team go through API design reviews that are waterfall-y — days or weeks of upfront design before anyone writes code. I’ve shipped APIs where we did all the right things: thorough design, all the approvals, fast iteration on the design itself. Then we got blocked at the release end by tooling we hadn’t sped up. The bottleneck moved and we never looked there.

PRD reviews are the same shape. I’ve sat in plenty of reviews. PRDs get written faster and arrive with more data than they used to. AI is genuinely useful here. But the process didn’t change. PRD → review → eng design → review → implement → ship. The artifact got faster; the workflow didn’t.

Another annoyance is dogfooding. I’ve sat in sessions where one or two people are in the driver’s seat and others watch — a gladiator pit where the audience places bets on what breaks next. Meanwhile the actual unlock is right there: agents that dogfood end-to-end, produce structured bug reports, file issues themselves, and let the team review findings async.

The tools exist. We’re still running the old ritual.

I call this The Bolt-On Trap. I’ve watched it happen from inside the rooms. Most engineering folk I talk to are running some version of it too. You take your existing operating model — the sprints, the RFCs, the review boards, the quarterly planning, the layered approval gates — and you bolt AI onto it. Engineers get Copilot, Codex or Claude Code. The IDE gets smarter, or completely replaced by the command line. Boilerplate gets faster. Tests scaffold themselves.

And then the same artifacts and rituals consume the same calendar time. The two-week sprint is still two weeks. The RFC still takes a week to get reviewed. The architecture board still meets every Thursday. The quarterly roadmap is still locked in October for January delivery. The eight pre-committed initiatives are still pre-committed. The cross-team handoffs are still synchronous and still serialized.

You’ve sped up the cheapest part of the work — typing — and left untouched the parts that actually take time: coordination, approval, decision latency, cross-team handoffs.

The data confirms exactly this. Faros AI’s analysis of 10,000+ developers across 1,255 teams found teams with high AI adoption merge 98% more PRs, but PR review time increased 91%, bugs per developer rose 9%, and PR size grew 154% (Faros AI, April 2026). The 2025 DORA report (nearly 5,000 respondents, the most rigorous annual study in engineering productivity) concluded plainly: “AI does not automatically improve software delivery performance” (InfoQ, March 2026). 90% of CEOs report AI has had no measurable impact on productivity (NBER/Fortune, February 2026). Only 1 in 50 AI investments delivers transformational value (Gartner/HBR, February 2026).

We’re injecting AI into operating models designed around a constraint that no longer exists. The bottleneck didn’t disappear; it just moved upstream and downstream of code generation, where you weren’t measuring.
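To make that concrete, here is a rough sketch of why speeding up only the coding stage barely moves total lead time. The stage names and durations are hypothetical numbers I picked for illustration, not measurements from any team, and the model assumes stages run strictly one after another.

```python
# Hypothetical stage durations in working days for one feature.
# These numbers are illustrative assumptions, not measurements.
STAGES = {
    "design_review": 5.0,
    "coding": 4.0,
    "code_review": 3.0,
    "release": 3.0,
}


def lead_time(stages: dict[str, float]) -> float:
    """Total lead time when stages run one after another."""
    return sum(stages.values())


def speed_up(stages: dict[str, float], stage: str, factor: float) -> dict[str, float]:
    """Return a copy of `stages` with one stage made `factor` times faster."""
    faster = dict(stages)
    faster[stage] = stages[stage] / factor
    return faster


before = lead_time(STAGES)
after = lead_time(speed_up(STAGES, "coding", 4.0))
overall_speedup = before / after

print(f"lead time before: {before:.1f} days")    # 15.0 days
print(f"lead time after:  {after:.1f} days")     # 12.0 days
print(f"overall speedup:  {overall_speedup:.2f}x")  # 1.25x
```

Under these assumed numbers, making coding four times faster yields only a 1.25x improvement end to end, because the review and release stages still take the same calendar time.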

three men laughing while looking in the laptop inside room — Photo by Priscilla Du Preez 🇨🇦 on Unsplash

The Model That Was Always Right (And Now Finally Works)

Here’s the part I find most interesting: the new model isn’t actually new. Hypothesis-driven development. Rapid prototyping. Ship-and-learn. Small bets, fast iteration, kill quickly. This is the startup playbook. It’s been the theoretically optimal model for two decades. Every product textbook describes it. Every YC partner has written about it. Every postmortem of a failed corporate product blames its absence.

It only worked at startups because they were forced into it. They had four months of runway. They couldn’t afford to commit a 12-engineer team to a six-month bet. So they built tiny things, tested them, killed what didn’t work, and iterated.

Bigger companies couldn’t operate this way despite having infinitely more resources. Each experiment cost real engineering cycles, so each experiment had to be high-confidence. They ran fewer, bigger, more conservative experiments — and lost ground to startups taking ten times the swings.

AI changes the economics of execution, not just the speed. When the cost of being wrong drops dramatically, you can run riskier experiments, test weirder hypotheses, be braver in what you try. The permission structure changes. You’re finally economically justified to operate the way product theory always said you should.
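One way to see the shift is a back-of-the-envelope expected-value calculation. This is a minimal sketch with made-up numbers: the success probability, payoff, and costs below are assumptions in arbitrary units, chosen only to show how the same long-shot bet flips from irrational to rational when execution gets cheaper.

```python
def expected_value(p_success: float, payoff: float, cost: float) -> float:
    """Expected net value of one bet: chance of success times payoff, minus cost."""
    return p_success * payoff - cost


# Hypothetical long-shot experiment: 10% chance of a payoff worth 100 units.
P_SUCCESS = 0.10
PAYOFF = 100.0

# Same bet, two assumed execution costs (before and after cheap code generation).
expensive = expected_value(P_SUCCESS, PAYOFF, cost=20.0)
cheap = expected_value(P_SUCCESS, PAYOFF, cost=2.0)

print(f"expensive execution: {expensive:+.1f}")  # -10.0
print(f"cheap execution:     {cheap:+.1f}")      # +8.0
```

Nothing about the idea changed between the two lines. Only the cost of trying it did, and that alone moves the bet from negative to positive expected value.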

Electric power reached factories in the 1880s, but productivity didn’t jump. For decades, plants swapped the steam engine for a single electric motor and kept everything else — the same overhead shaft, the same vertical layout, machines arranged by power proximity rather than workflow. The factories got electrified, and nothing changed.

The productivity revolution didn’t arrive until the 1920s — forty years later — once factories were redesigned around what electricity actually enabled: distributed power, one motor per machine, single-story layouts optimized by workflow. Economist Paul David documented this lag in “The Dynamo and the Computer” — the gap between technology arriving and operating models adapting was four decades, because the operating model needed to be rebuilt from scratch.

We’re at the equivalent moment with AI. Most companies have replaced the steam engine with an electric motor and are turning the same overhead shaft. The operating model is steam-era. The technology is post-electricity. The gap is everything.

The companies that close that gap fastest will compound advantage; the rest will look productive on the dashboard while losing share to competitors they didn’t take seriously yet. Most factories that didn’t rewire didn’t get a forty-year grace period; they got bought, beaten, or shut down by the ones that did.

What Rewiring Actually Looks Like

If you accept that the constraint has shifted, the implication is concrete: decentralize.

Not shrink. Decentralize. Many small autonomous teams, each with deep domain ownership, each running the startup playbook independently. Each unit close enough to its users that the engineers know what to build without three meetings of upstream alignment. Each unit empowered to ship, learn, and iterate without architectural review boards bottlenecking every decision.

Five diagnostic questions, each with a prescription. If most of them describe your org, you’re in the Bolt-On Trap. The work is to rebuild the operating model around the new constraint.

1. Are you locking quarterly roadmaps with eight pre-committed initiatives?

If yes, you’re building for the old constraint. Collapse the planning horizon. Move to two-week or four-week horizons with explicit “kill or continue” gates. Make it cheap to be wrong. The point is to commit to experiments, not to outcomes you can’t yet justify.

Most quarterly roadmaps are speculative anyway; they just hide the speculation behind formality and give middle management something to obsess over. A four-week experiment cycle with an honest kill gate is more rigorous than a quarterly roadmap that nobody amends when reality diverges.

2. Can your engineers articulate the top three user pains in their domain without a PM in the room?

If not, AI tooling will accelerate your team’s product drift, not fix it. Push product depth into every team. The new scarcity is people who know how to leverage the new tools to ship code and know enough about the user to direct the tools.

3. Are your RFCs still taking two weeks while code generation takes two days?

If yes, the approval cost now exceeds the work cost. Cut approval layers ruthlessly. Every approval gate that existed to protect against expensive engineering mistakes needs to be re-justified. RFCs that took two weeks should take two days, or be replaced with prototypes. Architecture review boards that meet weekly should be replaced with documented domain ownership and a “you must escalate if X” rule. Let the code do the talking instead of theoretical documents.

The goal: a single autonomous team should be able to ship without permission from any other team for the vast majority of work. Communication pathways grow quadratically with team size: a 5-person team has 10 pathways, a 15-person team has 105 (Technori, April 2026).

Every approval layer multiplies the pathway count again. The fix is fewer required gates.
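The pathway counts quoted above follow the standard pairwise formula n(n-1)/2, which grows with the square of team size. A quick sketch to check the arithmetic (the function name and the 50-person example are mine, added for illustration):

```python
def pathways(team_size: int) -> int:
    """Number of distinct person-to-person pairs in a team: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


for n in (5, 15, 50):
    print(f"{n:>2} people -> {pathways(n):>4} pathways")
# 5 people -> 10, 15 people -> 105, 50 people -> 1225
```

Tripling the team from 5 to 15 multiplies the pathways by more than ten, which is why small autonomous units stay cheap to coordinate.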

4. Does every meaningful change require both a “frontend team” and a “backend team” to ship?

If yes, your structure is fighting your product. Reorganize around domain, not function. Conway’s Law isn’t optional. A “checkout team” and an “onboarding team” can run startup playbooks independently. A “frontend team” and a “backend team” can’t — every meaningful change requires both.

The 2024 DORA report found that internal developer platforms produced an 8% individual productivity gain — but came with an 8% throughput decrease and 14% stability decrease unless the platform team operated with a product mindset (Accelerate State of DevOps Report 2024 — DORA). Domain-aligned teams beat functionally aligned teams once code is no longer scarce. We’re transitioning from optimizing for code reuse to optimizing for product velocity per autonomous unit.

5. Can every team point to a metric that moved because of work they shipped this quarter?

If not, you’re measuring activity, not impact. Measure outcomes, not output. Stop tracking PR count, story points, AI adoption rate. Start tracking shipped product changes that moved a metric. PR count is the steam-era metric. Product impact per autonomous unit is the post-electricity one.

Big companies that rewire get the best of both worlds: startup-style agility per unit plus the scale advantages of being big. The math has changed; some of the competitors eating their lunch will be three engineers with $10M ARR.

white and red do not enter signage — Photo by Kyle Glenn on Unsplash

Why Most Companies Won’t Do This

The reasons companies give for not rewiring sound real. Three come up most — and most don’t hold.

“Our domain is regulated; we can’t move fast.”

Regulated industries (fintech, healthcare, infrastructure) have real constraints. But the constraint is on what you ship, not on how you decide what to ship. Compliance review applies to releases, not experiments. The autonomous-unit model doesn’t mean shipping unsafe code; it means deciding what to build, prototyping it, and validating it against users before the formal release pipeline starts. The regulated parts of the workflow are narrow and come with controls; the rest has no excuse.

“We’ll lose institutional knowledge if we decentralize.”

This is real, but the inverse is worse: centralized knowledge sitting in approval gates and architecture review boards becomes a velocity tax that prevents new knowledge from being created. Decentralized teams don’t lose it if you invest in the documentation, internal platforms, and cross-team forums that let knowledge flow without serializing it through approvals. The 86% of organizations that believe platform engineering is essential to realizing AI’s business value (Platform Engineering, 2026) are getting at exactly this — the platform replaces the gate. Build the knowledge into the substrate, not the people who sit in review meetings.

“This will threaten our enterprise customers. They expect process maturity.”

Enterprise customers expect outcomes — uptime, security, reliability, predictable releases. They don’t care whether you produce them through quarterly roadmap meetings or two-week experiment cycles. They’ll notice if your reliability slips; they won’t notice that you killed your weekly architecture review board.

These are mostly post-hoc rationalizations for protecting the existing org chart. The real reason rewiring is hard is political.

Layered approval gates create roles for the people who run them. Quarterly planning gives executives an artifact to point at. Architecture review boards exist partly because senior engineers built careers reviewing architecture. When you decentralize, you flatten; when you flatten, you remove rungs from the career ladder; and the people standing on those rungs push back. It takes conviction and guts to make the change.

The shift is happening anyway. In late 2024, Amazon’s CEO Andy Jassy directed every org to increase the ratio of individual contributors to managers by at least 15% by the end of Q1 2025 (Fortune, November 2024).

Meta’s new applied AI engineering org is being built with up to 50 employees per manager (The Decoder) — 5-7x flatter than typical orgs. Across small and mid-sized businesses, ICs per manager roughly doubled between 2019 and 2024 (Gusto, 2025).

This won’t be cyclical. It’s permanent.

The Bolt-On Trap is the path of least resistance — everyone keeps their job description, and the dashboards tell a positive story for a while. The companies that win the next decade are the ones with the courage to reorganize before the market forces them to. Most won’t; they’ll wait until their startup competitors are eating them, and by then the gap will be too wide to close. The Block headlines, the Oracle restructurings, the Amazon flattening are the visible early signs most leaders are still treating as someone else’s problem.

The forty-year lag in factory electrification didn’t end with everyone catching up. It ended with most factories closing.

The Question You Should Be Asking

The thesis underneath all of this is simple: the way we work needs to be reevaluated from first principles. Start with one question and let everything follow from it.

If execution is no longer the bottleneck, what should our operating model actually look like?

Don’t answer it generically. Answer it for your team. For your stage. For your domain. Walk through every ritual, every artifact, every approval gate, every planning horizon, and ask whether it’s solving for the old scarcity or the new one. Most of it is solving for the old one. That’s the work.

The bottleneck moved years ago for a few companies. It’s moving for everyone now. The question isn’t whether you’ll rewire around it — it’s whether you do it before or after your competitors prove it works.

If you’re scaling an engineering team and trying to figure out what AI actually means for how your org should be structured, I share practical frameworks every week. Subscribe to Stratechgist or connect with me on LinkedIn where I write about building product-minded engineering teams that actually ship — from someone right in the trenches.

If your team is hitting the bolt-on trap and you want to talk through how to rewire without breaking what’s working, DM me on LinkedIn. Always happy to compare notes.
