Anthropic’s Nicholas Carlini shifts from AI safety warnings to model-release advocacy

Why it matters

Anthropic researcher Nicholas Carlini has shifted from publicly demonstrating how advanced AI models can exploit software vulnerabilities to advocating for the release of Anthropic's newest system, Mythos, through a restricted preview program. Carlini previously published research showing how the model could identify and chain exploits in tools including Ghost and Linux—work that intensified concerns about offensive cybersecurity applications. He is now part of the internal team arguing the model can be safely shared with vetted users despite those demonstrated capabilities.

The Trump administration has raised concerns about whether Anthropic's frontier AI systems pose cybersecurity risks, a worry shared among roughly 700 cybersecurity experts who discussed the issue in March. The specific terms of Anthropic's access controls for Mythos and the scope of the restricted preview remain unclear.

For practitioners, this reflects an unresolved tension in AI governance: whether models capable of advanced vulnerability discovery should be withheld entirely, tightly restricted to trusted researchers, or released under controlled conditions. As U.S. officials scrutinize the cybersecurity implications of frontier AI systems, Anthropic's approach to Mythos will likely become a test case for how companies balance capability demonstration, safety assurance, and regulatory pressure. Attorneys advising AI companies or monitoring federal AI policy should track both the access framework Anthropic ultimately adopts and any formal guidance the administration issues on cybersecurity-adjacent AI capabilities.
