Setareh Lotfi
Dispatches

Seasonal letters on whatever has kept me up at night.

Spring, 2026

On the Art of Outsourcing
Your Own Nervous System

A dispatch on enterprise AI deployment, consulting-grade security failures,
and the structural gap between capability and governance.

The views expressed here are my own and do not represent those of any current or former employer.

A disclosure, before we get into it: I spent a non-trivial portion of my career adjacent to the world of enterprise AI deployments, and I still managed to be surprised by how fast this unraveled. In my defense, I assumed the firms advising Fortune 500 companies on digital transformation would have, at minimum, locked their own doors. A quaint assumption, it turns out.

In March, a cybersecurity startup called CodeWall turned an autonomous AI agent loose on McKinsey’s internal AI platform, Lilli. Within two hours, the agent had full read-write access to the production database. 46.5 million chat messages. 728,000 files. Strategy documents, M&A briefs, client engagements, all sitting in plaintext like a diary left open at a dinner party. The vulnerability was almost comically basic: API endpoints with no authentication, JSON field names concatenated directly into SQL. Not some exotic zero-day. A SQL injection. The kind of thing you learn to prevent in the same semester you learn what SQL is[1].

Two days later, CodeWall did the same thing to BCG. Their X Portal, the one dedicated to analytics and data science and, one assumes, a certain implied competence in handling data, had an endpoint that accepted raw database queries with zero authentication. No API key. No session token. Nothing. Behind that open door: 3.17 trillion rows of workforce analytics, compensation benchmarks, employee movement records, M&A intelligence. 131 terabytes of the kind of proprietary data that is, quite literally, what clients pay BCG to protect. One imagines the sales pitch did not include “and we’ll leave it accessible to anyone with a URL.”

Both firms followed the same pattern. Build an AI-powered data platform. Expose an API. Forget to lock the door. Charge handsomely for the privilege.


Here’s what I find interesting, though. The conversation that followed was almost entirely about cybersecurity. Patch the vulnerabilities, hire more red teams, implement better access controls. All necessary, all correct, all missing the actual point with the quiet determination of someone rearranging deck chairs on a vessel whose structural integrity is, shall we say, no longer hypothetical.

The actual point is this: deploying AI agents inside large enterprises is not a technology problem. It’s a leadership problem. And the current model for getting it done, the one that the major AI labs and their consulting partners have converged on, is structurally broken in a way that better firewalls won’t fix.


The dominant approach right now is the Forward Deployed Engineer. Palantir popularized the concept, and now everyone’s doing it with the enthusiasm of a sector that has discovered a new billable unit. OpenAI has FDEs. Anthropic is training Accenture’s 30,000-person army of what they’re politely calling “reinvention deployed engineers,” which is the kind of title that could only emerge from a branding committee that had been in session too long[2]. Deloitte announced their own FDE practice. Job postings for the role grew 800% in 2025.

The FDE model, as most labs are running it, is fundamentally a point-solution machine. An engineer shows up, duct-tapes a model into your CRM, gets it producing outputs that look impressive in the quarterly review, and moves on to the next client. There’s no architectural layer underneath. No unified governance framework. No coherent plan for what happens when you have forty agents running across six departments and someone needs to understand who has access to what and why. The McKinsey hack happened precisely because nobody was thinking at that level. They built Lilli as a product. They did not build the organizational architecture to keep it safe. They built, if you will, a nervous system without a skull[3].

And here’s the thing that makes this worse. The adoption itself isn’t happening through rigorous technical evaluation. It’s happening through culture. Ara Kharazian made this point recently about Anthropic specifically: they went from 4% to 24.4% of business AI adoption in a single year, winning 70% of head-to-head matchups against OpenAI among first-time buyers. Not because Claude benchmarks better. Not because the pricing is lower. Because Anthropic became cool. The “AI guy on your team” evangelized it, and the rest of the org followed. Kharazian compares it to iMessage’s green and blue bubbles: model selection as identity signal, not technical decision[4].

This is where I think the bigger labs, with their growing FDE arms, are getting it fundamentally wrong. They’re selling deployment. They’re not selling transformation. They’re benefiting from a cultural adoption curve that moves faster than any enterprise security review can keep up with. And there’s a meaningful difference between “we chose this because it’s cool” and “we chose this because we understand what it can do to our systems,” one that shows up in your threat surface about eighteen months after launch.


And then there’s Bain, who’d like you to believe they’re the adults in the room. The adults, mind you, not the ones who got hacked. A low bar, but they’ve positioned themselves on the correct side of it, and they’d like credit.

Their expanded partnership with Palantir, announced in late March, is explicitly framed around “end-to-end delivery,” from strategic planning through to operationalization. They’ve published work on what they’re calling “AgentOps,” which extends traditional MLOps into managing autonomous systems as a discipline: governance, monitoring, tool registries, orchestration flows, rollback mechanisms. It sounds great in a whitepaper. The language is correct. The frameworks are tidy. They even cite their own research showing that 80% of generative AI use cases meet or exceed expectations in pilot, but only 23% of companies can tie those initiatives to measurable revenue. One admires the candor of citing your own industry’s failure rate as a selling point.

Here’s my problem with this. Bain’s answer to the gap between pilot and production is, essentially, more Bain. More frameworks, more governance layers, more partnership announcements, more consultants with “AI” in their title. They’ve correctly diagnosed that legacy architectures were designed for stateless transactions and that agentic AI does something fundamentally different. It discovers capabilities dynamically, shares context across boundaries, invokes tools and executes actions in sequences nobody predicted at design time. All true. But their solution is to wrap that complexity in a consulting engagement that still runs on the same incentive structure as every other consulting engagement: billable hours, long-term retainers, and a dependency model that keeps the client coming back. The disease prescribing itself as the cure, if you like[5].

And the Palantir partnership tells you everything you need to know. Palantir’s entire business model is forward-deployed engineers embedded inside your organization, running Palantir’s platform, maintaining Palantir’s integrations. Bain adding a strategy layer on top of that doesn’t make it end-to-end. It makes it two layers of dependency instead of one. When something goes wrong, and the McKinsey and BCG hacks show us that things will go wrong, the client is now navigating a blame chain that runs through their own IT team, through Bain’s consultants, through Palantir’s FDEs, and back. Nobody owns the failure because everybody technically owns a piece of it. It’s a very sophisticated game of musical chairs, and when the music stops, the client is the one left standing[6].

You can’t bolt agentic AI onto a legacy architecture and expect the seams to hold. The McKinsey and BCG hacks are what it looks like when the seams don’t hold. But you also can’t outsource the seams to two different firms and call it “integrated.”


But here’s the part that nobody in consulting wants to say out loud, and I suspect it’s the part that matters most.

Even if you get the architecture right, even if you hire Bain and Palantir and build your AgentOps practice and implement every recommendation in every whitepaper, you are still a large enterprise. You still have a CISO who needs to approve every data pipeline. You still have compliance teams in four jurisdictions who need to review every agent’s access permissions. You still have procurement cycles that take nine months and security reviews that take six. You still have a board that read about the McKinsey hack and is now asking very pointed questions about your own AI roadmap while simultaneously asking why you’re not “moving faster on AI.” The duality of the modern boardroom[7].

This is the structural disadvantage that no amount of consulting can fully resolve. A lean startup can ship an AI agent to production in a week because their “compliance review” is the CTO checking Slack on a Saturday. A Fortune 500 company runs the same deployment through twelve committees and a third-party audit. By the time they ship, the model is two generations old and the threat landscape has moved.

And the vendor selling you the model? They’re not waiting for you. Anthropic went from $1 billion in annualized revenue to $14 billion in roughly fourteen months. Ten-x annual growth, three years running. Eight of the Fortune 10 are Claude customers. Over 500 companies now spend more than a million dollars a year on their API. As Ben Thompson noted in Stratechery, Anthropic’s enterprise business has reached “escape velocity,” and they’re literally turning away revenue because they don’t have enough compute to serve the demand. Your compliance team is still reviewing last quarter’s pilot.

The speed gap is real, and it’s widening. Not because large enterprises are incompetent. Because the regulatory and governance overhead that comes with handling real data at scale, client data, employee data, financial data, is genuinely enormous. And it’s about to get much more complicated as AI agents start doing things that nobody explicitly authorized, which is sort of the entire point of making them autonomous. The companies building these models are growing at a pace that has no precedent in B2B software. The companies deploying them are governed by processes designed for a world that moved slower. That’s not a gap. It’s a chasm. And it’s where the next McKinsey-scale breach is already forming.


Which brings us to the question that every enterprise executive should be losing sleep over, and that I suspect very few of them have even properly formulated: how, exactly, do you keep up?

The model capabilities aren’t just improving. They’re improving on a release cadence that makes enterprise planning cycles look paleontological. In the first three months of 2026 alone, we got GPT-5.4, Claude Opus 4.6, and Gemini 3.1. Claude Opus 4.5 was resolving four out of five GitHub issues autonomously. Then Opus 4.6 arrived with a million-token context window and, by early accounts, the strongest coding capabilities of any commercial model. Three months. That’s less time than it takes most enterprises to onboard a new vendor.

So who inside these organizations is actually tracking this? Who is evaluating whether the agent you deployed in January on Opus 4.5 should be migrated to 4.6, and what that migration means for your security posture, your prompt architecture, your compliance certifications, your entire governance framework? The answer, in most enterprises, is nobody. Or, more precisely, it’s “the AI guy on your team,” the same person who chose the model because it was cool, who is now implicitly responsible for an organizational capability that touches every department and every data source in the company. That person probably reports to a VP of Engineering who reports to a CTO who reports to a CEO who mentioned AI seventeen times in the last earnings call but couldn’t tell you the difference between an agent and an API.

The fashionable answer to this problem is the “AI Center of Excellence,” which is corporate speak for “a committee that makes PowerPoints about AI while the actual AI deployment happens in the departments that got tired of waiting.” The track record is not encouraging. The majority of enterprise AI projects fail, and most of those failures trace back to leadership, not technology. Projects with sustained CEO involvement succeed at dramatically higher rates than those that lose C-suite sponsorship early. The AI Center of Excellence, in practice, tends to function less as a center and more as a polite fiction that allows the C-suite to claim governance over something they don’t understand and haven’t resourced.

But the current approach of “move slowly and outsource the hard parts” isn’t working either. It’s producing AI platforms with unauthenticated endpoints, governance frameworks that exist on slides but not in production, and a widening gap between what the models can do and what enterprises are equipped to manage. Every quarter, the models get more capable. Every quarter, the distance between “what this agent could do” and “what we’ve authorized this agent to do” grows. And every quarter, the attack surface expands in ways that the last security review didn’t anticipate, because the last security review was written for a model that is now, by the standards of this industry, practically vintage.

The real question isn’t whether enterprises need an AI Center of Excellence. It’s whether they need something more radical: a permanent, senior, technically literate leadership function that sits at the intersection of product, security, and strategy, that has actual authority over deployment decisions, and that is staffed by people who can tell the difference between a model upgrade and a threat surface expansion. Not a committee. Not a center. A capability. One that evolves at something approaching the pace of the technology it’s supposed to govern, which, I grant you, is rather a lot to ask of any organization that still requires three signatures to change a DNS record.


What’s quietly interesting is that the most compelling attempt to build this missing infrastructure isn’t coming from the consulting world or the labs. It’s coming from someone who watched the problem being created from the inside.

AIUC, the Artificial Intelligence Underwriting Company. Founded by Rune Kvist, who was Anthropic’s first product and commercial hire, alongside a former McKinsey partner from the insurance practice. Backed by Nat Friedman with $15 million in seed funding. Think about that for a second. Kvist had a front-row seat to the fastest cultural adoption curve in enterprise software. He watched Anthropic go from 4% to nearly a quarter of business AI adoption in a year. He saw the evangelism-driven adoption pipeline from the inside, the one where the “AI guy on your team” drives the purchasing decision and nobody asks the hard architectural questions until it’s too late. And what he built when he left wasn’t another AI product. It was the trust infrastructure that even Anthropic, the company that literally markets itself on safety, wasn’t providing for its own enterprise customers. Which tells you something, if you’re willing to hear it[8].

Their argument, which Kvist laid out in a conversation with the Cosmos Institute, is that the missing piece in enterprise AI deployment isn’t better technology or better consulting. It’s confidence infrastructure. Standards you can audit against. Insurance that prices the actual risk. A certification process that means something beyond a logo on a slide deck.

They’ve built AIUC-1, the first AI agent standard, developed with input from over 100 Fortune 500 CISOs. It covers six enterprise risk domains: security, safety, reliability, accountability, data privacy, and societal risk. Independent auditors test whether your agents can be jailbroken, whether they hallucinate, whether they leak data, whether they do things nobody authorized. ElevenLabs was the first to get certified. The consortium backing the standard includes Anthropic, Microsoft, Cisco, Databricks, and Salesforce.

What Kvist understood, probably from watching Anthropic’s enterprise sales process from the inside, is that the real bottleneck isn’t building the agent. It’s getting the CISO to sign off. It’s getting legal to approve the data access. It’s getting the board comfortable that deploying this thing won’t end up as a front-page story about how your company’s client data was sitting behind an unauthenticated endpoint. The McKinsey hack didn’t happen because McKinsey couldn’t build AI. It happened because nobody had built the trust infrastructure that would have forced the right questions before launch.

AIUC’s bet is essentially Benjamin Franklin’s: that insurance, standards, and audits can solve coordination problems that regulation alone can’t. Franklin didn’t prevent fires by passing laws. He made fire prevention profitable. Kvist is trying to do the same thing for AI agents, making security profitable by tying it to insurability, which ties it to the thing enterprises actually care about: the ability to deploy without existential risk to the business.

It’s early. Whether AIUC will actually solve the structural problem is genuinely unclear. But they’re building from the right premise: that what’s missing isn’t a better model or a better consulting framework, but the institutional infrastructure that makes deployment trustworthy at scale.


So what’s actually needed here? I think it’s something that sits uncomfortably between the consulting world and the tech world and that neither is particularly well-equipped to deliver alone. Which is, if you think about it, the whole problem.

It’s a leadership transformation. Not a digital transformation (that phrase has been strip-mined of all meaning and now serves primarily as a PowerPoint title and a LinkedIn hashtag), but a genuine restructuring of how enterprise leadership thinks about AI as an organizational capability rather than a technology deployment. The CISO and the Chief AI Officer and the head of product need to be in the same room making architectural decisions together, not reviewing each other’s work after the fact. The governance model needs to be designed alongside the agent architecture, not layered on top of it as an afterthought. And the people making these decisions need to understand, at a practical level, what an autonomous agent can actually do inside their systems, because the McKinsey breach happened partly because nobody with that understanding was asking the right questions about Lilli’s API surface.

What might actually work is something nobody in consulting or tech is incentivized to build: an internal capability that the enterprise owns outright, staffed by people who understand both the technical architecture and the organizational politics, and who don’t leave when the engagement ends. That’s not a product you can buy. It’s not a partnership you can announce. It’s the slow, unglamorous work of developing leadership that can think in systems, not just in deployments. And I suspect most enterprises won’t do it until they have their own CodeWall moment, by which point the lesson will have been very, very expensive.

I’ll be watching. From a safe distance, naturally.

From the desk of,

— S.L.