Software isn’t a tool. It’s the workforce
Companies still running pilots are being outpaced by those running parallel experiments at machine scale.
Startup investment is reaching record levels while market intelligence reveals a brutal divide: companies orchestrating parallel experimentation systematically outperform those stuck in pilot mode. Mercor generates $4.5 million revenue per employee. Microsoft manages $1.8 million. Cursor delivers $3.2 million per employee. Meta achieves $2.2 million.
Cross-sector analysis reveals a consistent strategic miscalculation:
Organizations perfecting validation processes while missing momentum opportunities
Leaders optimizing for safety while agent-enabled competitors capture exponential advantages
Executive teams investing in governance sophistication while experimentation bandwidth creates market separation
The Experimentation Paradox:
Pilot sophistication ↑ = Market momentum ↓
Validation rigor ↑ = Competitive edge ↓
Organizations have 90 days to build experimentation infrastructure or surrender market advantage to competitors who understand that accumulated learning determines competitive survival.
Why validation theater destroys competitive advantage
McKinsey analysis reveals the crisis: 78 percent of companies have deployed generative capabilities, yet 80 percent report no earnings contribution. Only six percent achieve meaningful results, and those high performers share one defining characteristic: they redesigned workflows rather than layering capabilities onto existing processes, and are three times more likely than their peers to do so.
Consider the traditional seed-stage trajectory. Companies run five to ten major experiments over their funding runway. Each attempt consumes weeks of planning, development cycles, coordination overhead.
That era ended.
The mathematics transformed completely. A single experiment with a five percent breakthrough probability is a long shot, but across ten attempts the chance of at least one breakthrough climbs to roughly 40 percent. One hundred parallel experiments push the probability of discovering a strategic advantage past 99 percent.
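A minimal sketch of that arithmetic, assuming the experiments are independent and each carries the same five percent breakthrough probability:

```python
# Probability of at least one breakthrough across n independent experiments,
# each succeeding with probability p: 1 - (1 - p)**n.
def breakthrough_odds(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

for n in (1, 10, 100):
    print(f"{n:>3} experiments at p = 0.05 -> {breakthrough_odds(0.05, n):.0%}")
# prints roughly: 5%, 40%, 99%
```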
Gartner projects 40 percent of enterprise applications will integrate autonomous capabilities by end of 2026, up from less than five percent today. The same research shows 40 percent of current initiatives will be canceled by end of 2027 due to escalating costs or unclear value.
The day ClassPass discovered their assumptions were completely wrong
ClassPass ran a robust Voice of Customer program. They understood which inquiries could be deflected. Their team calculated expected deflection rates before deploying Decagon's autonomous support system. They set conservative targets based on historical performance and industry benchmarks.
Operations prepared for modest improvements. Maybe 15 percent increase in deflection. Perhaps 20 percent if everything went well. They had backup plans for the learning curve, contingencies for edge cases requiring human intervention.
Launch day arrived.
Deflection rates came in 10x higher than their most optimistic projections.
Not 10 percent higher. Ten times what they'd conservatively estimated. The customer service team stared at dashboards showing metrics that shouldn't have been possible. Tickets that would normally route to human agents were being resolved automatically. Complex inquiries about bookings, cancellations, account issues—all handled without escalation.
The entire customer service model transformed overnight. Their chat program scaled to 24/7 while maintaining critical CX metrics. The assumptions they'd carefully validated through traditional analysis proved completely wrong. Not because their analysis was bad—their methodology was sound, their data accurate, their projections reasonable.
But linear thinking cannot predict exponential capability.
"Though we already had a robust Voice of the Customer program and an understanding of customer inquiries we thought we could deflect, we saw 10x higher deflection at launch than we anticipated," the team reported. "And while ticket deflection was our primary goal, Decagon has also allowed us to scale our Chat program to 24/7 while hitting critical CX metrics at the same time."
This wasn't an outlier. Flashfood automatically resolves over 90 percent of issues. Curology drove massive cost savings.
Bessemer analysis shows "Supernovas" reaching $40 million average annual recurring revenue in year one, then $125 million in year two. Revenue per employee hits $1.13 million, four to five times above typical benchmarks. Boston Consulting Group documented the economics: a consumer packaged goods company reduced costs 95 percent and improved speed 50x; a global bank cut costs 10x; a biopharma company achieved 25 percent cycle time reduction and 35 percent efficiency gains; an IT department increased productivity 40 percent.
PwC surveyed 300 executives: 79 percent have adopted agent capabilities, 66 percent report increased productivity, 73 percent believe how they manage these capabilities will determine competitive outcomes over the next 12 months.
The transformation that validation approaches cannot deliver
Gigi Levy-Weiss, general partner at NFX, identified what most executives miss. "The old model was software enhancing people. The new model is people orchestrating agents."
Software ceased being a tool humans operate. It became the workforce itself. Customer support runs on agents resolving 90 percent of issues. Growth experiments run continuously—specialized agents generating copy, targeting segments, testing mechanisms across all platforms.
Infrastructure handles coordination while humans identify opportunities and interpret results. But moving from validation to parallel experimentation requires solving five distinct barriers that keep organizations trapped in pilot mode.
Framework 1: "If coordinating three experiments requires weekly meetings, coordinating 30 requires unsustainable management burden"
This assumption underlies every rejected proposal for scaling experimentation velocity. Executive teams understand the mathematics—running 100 tests generates better results than running 10. Yet they dismiss volume-based approaches as impractical because they believe coordination overhead scales linearly with experiment count.
The mental model was accurate when humans performed coordination work. It's completely wrong now.

Monday.com runs dozens of specialized agents in production—one automatically updates systems whenever marketing events happen, another continuously collects release information enabling product decisions. Neither requires human coordination. Operations teams don't schedule meetings for these efforts. Infrastructure handles coordination automatically.
ServiceNow didn't write a coordination strategy before reducing case handling time 52 percent. They simply started running experiments faster than their coordination processes could track. When agents evaluate results automatically, when learning from one test influences related efforts without manual knowledge transfer, when reversible decisions don't require approval—coordination overhead disconnects from experiment volume.
Their teams allocate bandwidth without elaborate frameworks: 70 percent refining what works, 20 percent testing adjacent ideas, 10 percent exploring contrarian approaches. They track experiments per week and learning per experiment rather than success rates.
Breaking bandwidth constraints isn't about working faster. It's about removing human coordination as bottleneck.
Framework 2: Why Lenovo stopped buying better tools and started building digital teams
Most organizations evaluate new software by asking "Will this make our team more productive?" That question leads to incremental gains: better CRM systems, enhanced analytics platforms, faster development tools.
Lenovo's engineering teams lived this pattern. Development processes worked. Improvements came incrementally through better tooling. The trajectory was predictable, sustainable, uninspiring.
Then they shifted from describing how work should be done to specifying what results they needed. Code quality improved 15 percent. Development speed increased 15 percent. Not through superior tools—through treating software as autonomous team members with clear performance expectations.
Customer support made the same transition. Response times dropped 90 percent because agents operated autonomously—handling troubleshooting, processing orders, managing subscriptions without requiring validation for each action. Nobody wrote protocols for when agents could make decisions versus when they needed human approval. They just removed approval layers and let agents handle standard operations, escalating genuine exceptions.
Moderna merged their technology and HR departments, making explicit that workforce management encompasses both human and digital members. Deloitte's Zora platform targets 25 percent cost reduction and 40 percent productivity increase by implementing the same model.
Companies escaping tool optimization ask different procurement questions. Not "How will this improve team productivity?" but "What functions can this perform autonomously?" They test agent capabilities before deployment the same way they would interview human candidates. Performance reviews track the same metrics over time.
Tool optimization delivers incremental gains. Workforce development enables exponential capacity.
Framework 3: The mathematics of breakthrough probability
Five percent breakthrough probability across 10 sequential attempts: 1 - 0.95^10 ≈ 40 percent chance of at least one success
Five percent breakthrough probability across 100 parallel attempts: 1 - 0.95^100 ≈ 99 percent chance of at least one success
These numbers destroy the logic behind careful bet selection. Companies running five carefully chosen tests while competitors run 50 diverse experiments aren't competing in the same performance category. Yet executives continue investing heavily in selection methodology—market research, competitive analysis, customer validation.
Boston Consulting Group's consumer packaged goods case achieved 50x speed improvement not by perfecting which experiments to run but by running experiments at scale impossible through traditional coordination. The biopharma company reducing cycle time 25 percent wasn't optimizing bet selection—they were running more tests faster.
When you can test 100 approaches simultaneously, expected value calculations shift completely. Allocating 10 percent of bandwidth to contrarian hypotheses that conventional selection would eliminate becomes rational strategy. Ideas that seemed too risky when you could only afford three experiments become intelligent bets when experimentation costs approach zero.
Volume mathematics transforms what constitutes intelligent risk. Companies trapped in careful selection methodology keep optimizing for picking winners. Their competitors discover positioning through statistical probability.
Framework 4: High agency over attention to detail
Parag Agrawal doesn't hire for reliability. His teams at Parallel don't optimize for "attention to detail" or "ability to follow established processes." He gives every engineer extreme license to experiment. They don't standardize how people do things; everyone charts their own approach. Everything changes constantly. They exploit opportunities rather than picking lanes and sticking to them.
This approach seems chaotic to companies optimized for process execution. Job descriptions at most organizations emphasize track records of consistent delivery. Interview questions probe how candidates handled specific situations, looking for evidence they can replicate proven approaches.
That philosophy made sense when work meant following procedures. It becomes a critical limitation when work means orchestrating resources at exponential scale.
In agent-orchestrated environments, the few people you hire determine everything. They need to think like someone with a 1,000-person team at their disposal. High agency becomes foundational—taking responsibility and acting without waiting for instructions. Multi-domain fluency matters more than specialized expertise because orchestration requires understanding marketing, operations, and engineering simultaneously.
Builder's instinct separates effective orchestrators from reliable executors. They make things with their own hands, fix what breaks, prototype what's missing. When agents fail, models drift, systems break—they treat disruption as normal operating condition rather than crisis.
Low ego combined with high self-worth creates the necessary psychological profile. They separate themselves from outcomes, treating failure as data rather than identity threat. They share information including failures, making everything public rather than protecting reputation.
During interviews, present workflow scenarios and watch whether candidates sense bottlenecks without explicit prompting. The best orchestrators treat obsolete mental models as liabilities they discard instantly when conditions change.
Framework 5: When bold experiments become the rational choice
Testing a contrarian hypothesis costs $100. Success probability: one percent. Potential payoff: $100,000. Expected value: $1,000—ten times the cost.
Linear experimenters can't pursue these opportunities because limited bandwidth forces allocation toward higher-probability conventional bets. Parallel experimenters test 100 contrarian approaches simultaneously, knowing the portfolio as a whole is likely to produce at least one payoff even though any single bet rarely does.
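A hedged sketch of that arithmetic, using the $100 cost, one percent success probability, and $100,000 payoff above, and treating the 100 bets as independent:

```python
# Expected value of one contrarian experiment, plus the chance a portfolio of
# 100 such bets produces at least one payoff, under the assumed numbers above.
cost, p_success, payoff, n_bets = 100, 0.01, 100_000, 100

expected_value = p_success * payoff            # $1,000 vs. a $100 cost
expected_wins = p_success * n_bets             # about one win per 100 bets
at_least_one = 1 - (1 - p_success) ** n_bets   # ~63% chance of any payoff

print(f"EV per test: ${expected_value:,.0f} against a ${cost} cost")
print(f"Expected wins across {n_bets} tests: {expected_wins:.1f}")
print(f"P(at least one win): {at_least_one:.0%}")
```

The portfolio does not guarantee a hit, but the expected value stays positive and the odds of at least one payoff are strong enough to make the allocation rational.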
Yet companies treat ambitious experiments as inherently risky. Bold ideas require extensive justification, multiple approval layers, careful risk mitigation planning. Conservative approaches pass through with minimal scrutiny.
This framework made sense when each experiment required significant capital and constrained bandwidth forced selective allocation.

The economics inverted completely.
Ganesh Gopalan, CEO of Gnani.ai, observed the shift: "While early adoption was about experimentation without clear business metrics, businesses now have a sharp focus on ROI and end outcomes." But the transformation isn't toward more conservative thinking—it's toward understanding that volume mathematics make ambitious positioning rational when experimentation costs approach zero.
Companies discovering breakthrough positioning don't write processes for identifying contrarian opportunities. They simply mandate that teams propose one "unreasonable" experiment weekly. These get tracked separately from core optimization efforts, measuring learning accumulation rather than individual success rates.
Look for problems everyone acknowledges but nobody tackles because the required work seemed disproportionate. Find workflows requiring extensive human coordination where automation appeared impossible. Identify markets where conventional wisdom says "that will never work" based on constraints that parallel experimentation eliminates.
Parallel experimentation transforms competitive outcomes
Momentum infrastructure requires the same resources as linear validation, simply allocated toward experiment volume rather than approval sophistication. Organizations implementing bandwidth expansion, workforce development, probability leverage, orchestration capability, and risk recalibration consistently outperform validation-dependent competitors.
Market leaders establish separation through capabilities that traditional optimization cannot replicate. Within 90 days, early movers establish competitive advantages that linear competitors cannot overcome through process improvement alone.
The choice determines competitive survival. The window closes. The consequences are permanent.