Google, Microsoft and xAI agreed to let a Commerce Department center evaluate new AI models before public release, expanding federal safety testing.
WASHINGTON — The US Department of Commerce will safety-test new artificial intelligence models from Google, Microsoft and xAI before they are released or deployed, under voluntary agreements announced Tuesday by its Center for AI Standards and Innovation (CAISI).
The deals give federal scientists access to frontier AI systems from three major developers at a time when Washington is trying to assess security risks from increasingly capable models without imposing broad mandatory rules on the sector.
CAISI said the work will include testing, collaborative research and development of best practices for commercial AI systems, with evaluations focused on capabilities and security. The center’s director, Chris Fall, said the agreements would expand government work “in the public interest at a critical moment.”
The agreements extend earlier arrangements reached under the Biden administration with OpenAI and Anthropic. The Commerce testing center, then known as the US Artificial Intelligence Safety Institute, has served as a federal hub for AI evaluations and voluntary safety standards.
CAISI said Tuesday it has completed more than 40 previous evaluations, including tests of “state-of-the-art models that remain unreleased.” It did not identify those models. The agency has said developers often provide versions of models with some guardrails stripped back so evaluators can probe for national security risks.
Microsoft said it would work with US government scientists to test AI systems “in ways that probe unexpected behaviors” and to develop shared data sets and workflows for model evaluation. The company also said it has signed a similar agreement with the United Kingdom’s AI Security Institute. Google declined to comment to Al Jazeera, and xAI did not immediately respond to a request for comment.
The announcement comes amid growing concern in Washington about how advanced AI systems could be misused in cyberattacks or military settings. Recent attention has focused in part on Anthropic’s Mythos model and questions about whether powerful systems could supercharge hackers before governments and companies fully understand their risks.
The testing pacts also follow a Pentagon agreement with seven technology companies — including Google, Microsoft, Amazon Web Services, Nvidia, OpenAI, Reflection and SpaceX — to use AI systems across classified computer networks. The Defense Department said that arrangement is intended to help “augment warfighter decision-making in complex operational environments.”
The move marks a notable expansion of federal access to company models under President Donald Trump, whose administration has otherwise emphasized reducing regulatory barriers to AI development. Trump last year signed executive orders forming the basis of an AI Action Plan that he said would “remove red tape and onerous regulation” and help the United States win through AI advancement and control.
The public announcement did not name the specific Google, Microsoft or xAI models that will be reviewed, nor did it specify how much of CAISI’s findings will be released. For now, the agreements put more of the most powerful commercial AI systems inside a federal testing process before they reach wider use.