OpenAI is preparing to release an artificial intelligence model with advanced hacking capabilities, but it won't be available to everyone. The company plans to distribute it only to a carefully vetted group of organizations, following a similar strategy announced this week by rival Anthropic.
The cautious approach signals a watershed moment in AI development: the technology has become powerful enough that its creators are now actively restricting access to their own tools.
A Controlled Release Strategy
Anthropic kicked off the trend Tuesday when it announced plans to limit distribution of Mythos, a model capable of discovering cybersecurity vulnerabilities with unusual sophistication. The company said it would only provide access to hand-picked technology and security firms.
OpenAI is now following suit with its own advanced cyber-capable model. The company already operates a "Trusted Access for Cyber" pilot program launched in February, following the rollout of GPT-5.3-Codex, its most capable hacking-focused reasoning model. OpenAI sweetened the invitation by offering $10 million in API credits to approved participants for defensive security research.
The pattern mimics how the software industry has handled dangerous security vulnerabilities for decades: controlled disclosure to trusted parties before broader release, if at all.
Why This Matters
Government officials and security leaders have spent the past year warning about AI models that could fall into malicious hands. The stakes are existential: in skilled actors' hands, advanced AI could autonomously disrupt water treatment systems, power grids, or financial networks. Those capabilities appear to have arrived.
Rob T. Lee, chief AI officer at the SANS Institute, sees the restricted release as logical but ultimately temporary. "You can't stop models from doing code enumeration or finding flaws in older codebases," he said. "That capability exists now."
His concern is echoed across the security establishment. Wendi Whitmore, chief security intelligence officer at Palo Alto Networks, told Axios that competitors will likely develop comparable models within weeks or months. Adam Meyers, senior vice president of counter adversary operations at CrowdStrike, called Mythos' capabilities a "wake-up call" for the industry.
The underlying tension is stark: restricting access to exploit-writing capability makes more sense than restricting access to vulnerability-detection capability, according to Stanislav Fort, CEO of security firm Aisle. Once a model can write novel exploit code, the damage potential jumps dramatically.
The Limits of Gatekeeping
Research published Wednesday by Aisle researchers suggests the window for control may already be closing. Widely available AI models can already discover many of the same vulnerabilities that Mythos uncovered during testing.
Anthropic has said it will never release Mythos Preview to the general public, though it has left open the possibility of releasing other Mythos variants if adequate safeguards are in place. OpenAI has not committed to either approach for its forthcoming model.
What's clear is that the era of unrestricted AI release for cutting-edge capabilities may be over, replaced by an uneasy holding pattern where powerful tools circulate among trusted hands while the rest of the industry races to catch up.