Anthropic called a systemic MCP supply chain flaw "expected behavior" as OX Security disclosed 150M downloads and 200K vulnerable servers. In the same week, OWASP shipped the first AI vulnerability scoring system, NIST rationed CVE enrichment under a 263% surge, and the UK AISI published its first government-grade Mythos cyber benchmark. Here is what this week changes for AI compliance programs.
Safe AI Academy · April 19, 2026
There is a particular phrase that shows up in security advisories when a vendor does not want to own a problem. "Working as intended." "By design." "Not a vulnerability." This week, Anthropic added a new entry to that canon. When OX Security disclosed what they called the Mother of All AI Supply Chains, a systemic architectural flaw in how the Model Context Protocol's STDIO transport handles trust, Anthropic declined to fix it and called it "expected behavior."
I read that, and then I read it again, because I was not sure I had parsed it correctly. Expected behavior. For 150 million downloads of affected libraries, 200,000-plus vulnerable servers in the wild, and 10-plus Critical or High severity CVEs documented in the disclosure. That is the response.
I will be honest, I am an Anthropic fan. I have said so publicly, and I still think they are one of the most thoughtful labs in the industry. That is exactly why this response is worth pulling apart in public, because the implications for how we govern AI tooling are significant.
The MCP Bill Is Actually a Lot Bigger Than the Disclosure
Let me talk about the architectural issue first, then the supporting cast.
MCP's STDIO transport assumes the local process running the server is trusted. That made sense when MCP was primarily a way for Claude Desktop to call a local Python script. It makes considerably less sense now that MCP is the de facto integration protocol for AI agents, running everywhere from developer laptops to cloud workloads to SaaS backends. Once you deploy MCP at scale, "local process is trusted" quietly becomes "anything that can write a file to the box is trusted," which is not a threat model. It is a surrender.
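To make that trust boundary concrete: in a typical STDIO setup, a config file on disk names the executable the client will spawn with the agent's privileges, so whoever can write that file owns the agent. One deployment-side mitigation is to pin server binaries by hash before launch. A minimal sketch, with a hypothetical config layout and function names; nothing here is part of the MCP spec itself:

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: str) -> str:
    """Hash the server executable so we pin the binary, not just its path."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def resolve_server(config_path: str, pinned: dict[str, str]) -> list[str]:
    """Read an MCP-style STDIO server config and refuse to launch anything
    whose binary hash is not in the deployment's pin list.

    Under the protocol's own trust model, whatever this config names gets
    spawned with the agent's privileges -- the config file IS the trust
    boundary, and anyone who can write it inherits the agent.
    """
    cfg = json.loads(Path(config_path).read_text())
    command = cfg["command"]
    if pinned.get(command) != sha256_of(command):
        raise PermissionError(f"unpinned or tampered MCP server: {command}")
    return [command, *cfg.get("args", [])]
```

The point of the hash check is that "a file appeared on the box" stops being equivalent to "a trusted server exists"; tampering with either the config or the binary fails closed.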
SecurityWeek framed it bluntly under the headline "Anthropic won't own MCP design flaw putting 200K servers at risk": a by-design flaw at the core of MCP could enable widespread AI supply chain attacks. This is not a CVE you patch. It is a protocol that needs rethinking, and the spec owner has said, respectfully, that is not their job.
That on its own would be a story. It becomes something bigger when you stack it next to the rest of the week's MCP news.
GreyNoise published honeypot telemetry showing 91,403 attack sessions targeting exposed LLM endpoints between October 2025 and January 2026, with MCP servers called out specifically as the newest and least-secured entry point in self-hosted LLM infrastructure. A single exposed MCP endpoint, per their analysis, becomes a bridge to the entire internal environment. 91,403 sessions in four months is not probing. That is sustained reconnaissance at industrial scale.
Then the CVEs landed in clusters.
Capsule Security disclosed CVE-2026-21520, which they dubbed "ShareLeak," an indirect prompt injection in Microsoft Copilot Studio. A SharePoint form comment field could inject a fake system-role message into the agent's context window, override its instructions, and exfiltrate customer data from connected SharePoint Lists through Outlook. Microsoft patched it in January but did not disclose it publicly until April. Capsule's sister disclosure hit Salesforce Agentforce with a parallel flaw they called "PipeLeak." Salesforce has not assigned a CVE. They have, to their credit, now enabled Human-in-the-Loop confirmation by default for email-based agentic actions. Dark Reading's coverage put the two disclosures in the same frame, which is the right way to read them.
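The generic defence against this attack class is to treat every externally writable field as inert data before it reaches the context window: strip anything that could masquerade as a role-tagged message, then fence the content explicitly. A minimal sketch, with hypothetical delimiters and a hypothetical `quarantine_untrusted` helper; real agent frameworks delimit roles differently:

```python
import re

# Lines that try to pose as a role-tagged message (e.g. "system: ...").
ROLE_MARKER = re.compile(r"(?im)^\s*(system|assistant|developer)\s*:")

def quarantine_untrusted(field_text: str) -> str:
    """Neutralize role-marker lookalikes in untrusted content (a form
    comment, a list cell) and wrap it as explicitly quoted data so the
    model never sees it as a bare instruction. Illustrative only."""
    sanitized = ROLE_MARKER.sub("[stripped-role-marker]:", field_text)
    return f"<untrusted-user-content>\n{sanitized}\n</untrusted-user-content>"
```

This does not make injection impossible, but it removes the cheapest trick in the ShareLeak pattern: smuggling a string that the context-assembly layer mistakes for a privileged message.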
GitLab's advisory for CVE-2026-29783 documents a shell expansion RCE in the GitHub Copilot CLI, where crafted bash parameter transformation operators let an attacker hide commands from the user before the agent runs them. Microsoft's April Patch Tuesday fixed 167 flaws, including two zero-days and one info-disclosure bug in GitHub Copilot and Visual Studio that specifically leaks Model Context Protocol contents. ZDI's write-up called out the MCP component explicitly. Whatever else you can say about this month, auditors now have a CVE that directly attributes an info disclosure to MCP plumbing on Microsoft's own stack.
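The advisory does not publish the exact payload, but one well-known bash pattern that fits the description uses the `@P` transformation, which re-expands a variable's value as a prompt string (where command substitution runs again under the default `promptvars` option), combined with ANSI-C quoting so the raw command text never contains `$(` at all. A sketch of the review gap, with a hypothetical `naive_review` check standing in for whatever the agent shows the user; this may not be the CVE's precise mechanism:

```python
import re

def naive_review(command: str) -> bool:
    """Hypothetical pre-execution check of the kind an agent might apply
    before displaying a command: flag the obvious command-substitution
    syntax and nothing else. Returns True if the command 'looks safe'."""
    return not re.search(r"\$\(|`", command)

# $'\x24(id)' uses ANSI-C quoting to assemble the string "$(id)" at runtime,
# and ${x@P} re-expands that value as a bash prompt string, where command
# substitution runs again. The command text contains neither "$(" nor a
# backtick, so a substring scan approves what bash would actually execute.
hidden = "x=$'\\x24(id)'; echo \"${x@P}\""
```

The lesson generalizes: any reviewer that inspects the command string rather than the post-expansion execution plan can be made to approve something other than what runs.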
The way I see it, the aggregate picture is pretty clear. MCP has become enterprise plumbing in less than eighteen months, and the plumbing was never pressure-tested for that job. The STDIO trust model was fine for a protocol being used between two cooperating processes. It is actively dangerous now that MCP servers are being deployed across production agents with access to SharePoint, Salesforce, local shells, and cloud APIs. When Anthropic says "expected behavior," they are right in the narrow technical sense and deeply wrong about the operational reality. And the adversaries, judging by GreyNoise's numbers, figured that out months ago.
The Scoring System Finally Arrives, and Then the Record Book Breaks
This week OWASP shipped AIVSS, the first formal vulnerability scoring system built for AI. Why is this a big deal? Because a CVSS 7.5 for a classical buffer overflow and a CVSS 7.5 for a prompt injection against a multi-agent system are not the same risk. They never were. The multi-agent system has autonomous amplification that the buffer overflow lacks. CVSS was never built for "what happens when the exploited component can decide what to do next." AIVSS is the first formal attempt to capture that delta, and the fact that it attracted 1,900-plus public comments means the industry actually cares about the answer.
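To see the delta in toy form: the following is emphatically not the AIVSS formula, just an illustration of why a base score plus agentic amplification factors (hypothetical `autonomy` and `tool_reach` inputs) separates two findings that CVSS treats as identical:

```python
def illustrative_agentic_score(cvss_base: float, autonomy: float,
                               tool_reach: float) -> float:
    """NOT the AIVSS formula -- a toy model of the amplification argument.
    autonomy and tool_reach are made-up 0..1 factors for how freely the
    exploited component can act on its own and how many downstream systems
    its tools can touch. Capped at 10 like any CVSS-style scale."""
    amplification = 1.0 + 0.5 * autonomy + 0.5 * tool_reach
    return min(10.0, round(cvss_base * amplification, 1))
```

A classical buffer overflow (no autonomy, no tool reach) keeps its 7.5; the same base score on a prompt injection against an agent with broad tool access saturates the scale. That gap is the thing CVSS cannot express and AIVSS exists to score.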
And the exploitation timeline is collapsing in lockstep. SANS, CSA, Un-Prompted, and the OWASP GenAI Security Project released an emergency "Mythos-Ready" strategy briefing built in a single weekend by 60 plus contributors and reviewed by 250 plus CISOs. The headline statistic is the one that should reshape patch management SLAs industry-wide: mean time from disclosure to exploitation is now less than one day, down from 2.3 years in 2019. CSA Labs published the practitioner brief with a 13-item risk register mapped to four frameworks. I have read a lot of industry "emergency" briefings over the years. Most of them are marketing. This one is not. The delta from 2.3 years to under 24 hours is the kind of shift that makes every "30-day critical patch" SLA in your security policy obsolete.
Which is a good segue, because Microsoft published "Incident Response for AI: Same Fire, Different Fuel" this week, the first major cloud provider to ship a dedicated AI IR methodology. The title is the thesis. Traditional IR playbooks still apply for compromise of the underlying infrastructure. What is new is the telemetry you need for the model layer itself: prompt traces, tool-call histories, policy-evaluation records, model-provenance artifacts. If you do not have those, your AI incident response is guessing. Microsoft just formalized the target state, and that becomes a reference point auditors will use inside a year.
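What that telemetry can look like in practice: a minimal sketch of a tool-call trace record, with field names that are my own illustration rather than any schema from Microsoft's methodology. The one design choice worth copying is hashing arguments instead of logging them raw, so the trace survives an investigation without itself becoming a data leak:

```python
import hashlib
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class ToolCallTrace:
    """One model-layer telemetry record. Field names are illustrative."""
    agent_id: str
    tool_name: str
    arguments_hash: str    # hash of the args, so raw sensitive data stays out of logs
    policy_decision: str   # "allow" or "deny", from whatever policy engine ran
    model_version: str     # provenance: which model issued the call
    timestamp: float = field(default_factory=time.time)

def record_tool_call(agent_id: str, tool_name: str, arguments: dict,
                     decision: str, model_version: str) -> str:
    """Serialize one trace line, ready to ship to a SIEM."""
    digest = hashlib.sha256(
        json.dumps(arguments, sort_keys=True).encode()
    ).hexdigest()
    trace = ToolCallTrace(agent_id, tool_name, digest, decision, model_version)
    return json.dumps(asdict(trace))
```

If records like these exist for every tool call, an AI incident responder can reconstruct what the agent did and why it was allowed to; without them, as the paper says, you are guessing.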
Mythos Becomes an Institution
Last week I wrote about the Bessent and Powell call. This week the story expanded into something qualitatively different, and I want to flag the pieces that matter for how compliance programs should calibrate.
And Anthropic shipped Claude Opus 4.7 to GA with automated cyber-use detection and blocking built in, alongside differential reduction of cyber capabilities during training and a Cyber Verification Program. That is a structural change worth pausing on. For years, model safety has been a post-training guardrail exercise: RLHF, constitutional AI, classifier filters on outputs. 4.7 is the first time a major frontier model has shipped with capability suppression applied at training time specifically for cyber. It is still a guardrail, but it is baked in at a layer that used to be considered off-limits to safety tuning. Dario Amodei also met White House Chief of Staff Susie Wiles on April 17, resolving the Pentagon standoff from March. Government-wide Mythos deployment is now planned across the intelligence community, CISA, Treasury, and Energy.
The way I see it, this is what the institutionalization of an AI capability story looks like. A capability gets flagged. The regulators treat it as systemic. Standards bodies publish emergency guidance. A government benchmark quantifies the threat. Commercial vendors build products against it. The vendor itself ships structural mitigations. And the procurement clause arrives to capture the audit trail. That entire cycle used to take years. This one compressed into about four weeks.
The Privacy Layer Is Quietly Failing, and Nobody Is Insured For It
Let me close with three smaller items that, stitched together, describe the operational reality most compliance teams are about to walk into.
Rochester Institute of Technology released AudAgent, a privacy audit tool for AI agents. Their finding, and I want you to really hear this: agents powered by Claude, Gemini, and DeepSeek failed to refuse handling Social Security numbers. GPT-4o correctly refused. Let me put it this way. If you have deployed an agent on any of those first three models into a workflow that might see SSNs (customer support, HR onboarding, benefits administration, healthcare intake), you just learned that the model-layer privacy refusal you assumed existed does not. That is the kind of finding that ends up in a Q3 data protection impact assessment and forces a mid-quarter redesign of an agent pipeline.
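If the model will not refuse, the pipeline has to. A minimal deployment-side gate, assuming only the hyphenated US SSN format; a real DPIA-driven control would cover unhyphenated forms, other national identifiers, and structured fields, and the function name here is hypothetical:

```python
import re

# Hyphenated US SSN shape only (123-45-6789); deliberately narrow.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def gate_pii(text: str) -> tuple[bool, str]:
    """Deployment-level refusal for SSN-shaped data, since the model-layer
    refusal cannot be assumed. Returns (blocked, redacted_text): callers
    can halt the agent action and still log a safe redacted copy."""
    if SSN_PATTERN.search(text):
        return True, SSN_PATTERN.sub("[REDACTED-SSN]", text)
    return False, text
```

The broader principle: a privacy control you can unit-test in your own pipeline beats a refusal behavior you are inferring from a vendor's model card.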
Then the insurance side. Fitch Ratings became the first major rating agency to formally assess the cyber insurance impact of AI vulnerability discovery, warning that AI-driven vulnerabilities will probably outnumber patches in the short term. When a rating agency publishes that thesis, it is not a security observation; it is a capital markets observation. Cyber insurance premiums are about to move, and the coverage carve-outs for AI-related incidents are about to tighten. Compliance teams that do not already have evidence of AI-specific control coverage are going to feel this at renewal.
And on the agent privacy and consumer-trust front, the Doe v. Perplexity AI class action alleges secret data sharing with Meta and Google through hidden trackers, including in Incognito mode. Lasso Security's independent red team of Perplexity's BrowseSafe found a 36% compromise rate against standard malicious attacks, attacks that BrowseSafe itself had marked as safe. The flagship product of a company's newly launched secure intelligence institute failed more than a third of the time in independent testing. That is not a rounding error, and the consumer AI trust story is going to be litigated for the next two years.
At the end of the day, "expected behavior" is the vendor telling you the protocol is not the problem. It is also the auditor telling you the control has to live somewhere, because the protocol is not going to fix itself. If MCP is going to be the integration fabric for agentic AI (and it is), then we are going to have to build the trust boundaries that Anthropic declined to build at the spec level, ourselves, at the deployment level. That means MCP server identity, least-privilege scoping, continuous telemetry on tool calls, kill switches on agent actions, and decision-level logs that actually survive an incident investigation.
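Those controls are buildable today. A minimal sketch of the gateway shape, with hypothetical class and method names, covering three of the items above: least-privilege tool scoping, a kill switch, and a decision log that outlives the agent:

```python
class AgentGateway:
    """Sits in front of whatever executes MCP tool calls. Every call is
    checked against an explicit allowlist, every decision is logged, and
    one flag stops everything. Illustrative names throughout."""

    def __init__(self, allowed_tools: set[str]):
        self.allowed_tools = allowed_tools   # least-privilege scope per deployment
        self.killed = False                  # global kill switch
        self.decision_log: list[dict] = []   # decision-level audit trail

    def authorize(self, agent_id: str, tool: str) -> bool:
        decision = (not self.killed) and tool in self.allowed_tools
        # Log the decision itself, not just the action: this is the record
        # an incident investigation actually needs.
        self.decision_log.append({
            "agent": agent_id,
            "tool": tool,
            "decision": "allow" if decision else "deny",
        })
        return decision

    def kill(self) -> None:
        """Halt all agent actions until a human re-enables the gateway."""
        self.killed = True
```

None of this is sophisticated, and that is the point: the trust boundary the protocol declined to draw is a few dozen lines at the deployment layer, plus the discipline to put every tool call behind them.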