
Anthropic, widely regarded as the conscience of the generative AI race, has released the third iteration of its Responsible Scaling Policy (RSP v3). The update, which fundamentally restructures how the company handles catastrophic AI risk, arrives at a moment of intense geopolitical and commercial friction. As the company faces a reported ultimatum from the U.S. Department of Defense over military use of its technology, the removal of its "flagship" safety pledge, a commitment to pause development if safety cannot be guaranteed, has drawn sharp scrutiny from industry observers.
Since its inception, Anthropic’s RSP has been defined by a mechanism of "conditional commitments." Under the previous RSP v2, the company pledged to halt the training or deployment of new models if they crossed specific "AI Safety Level" (ASL) thresholds without corresponding safeguards in place. This "tripwire" approach was designed to prioritize safety over competitive velocity.
With RSP v3, Anthropic has pivoted away from these hard stops. The company argues that unilateral pauses are ineffective in a market where competitors continue to race forward. Instead, the new policy emphasizes transparency and public goal-setting.
Key Components of RSP v3:
- Public "Frontier Safety Roadmaps" laying out how safeguards are expected to keep pace with model capabilities.
- Systematic public Risk Reports, published on a three-to-six-month cadence.
- "Pragmatic" unilateral safety goals in place of the previous conditional pause commitments.
Anthropic executives have framed this shift as a "pragmatic" response to reality. In a blog post accompanying the release, the company noted that "stopping the training of AI models wouldn't actually help anyone" if other developers with fewer scruples continue to advance. They cited the failure of a "race to the top"—where competitors would emulate Anthropic’s safety restraints—as a primary driver for the change.
The following table outlines the structural changes between the previous policy and the newly released version.
| Feature/Commitment | RSP v2 (Previous) | RSP v3 (Current) |
|---|---|---|
| Core Mechanism | Conditional Pausing (ASL Tripwires) | Transparency & Roadmaps |
| Safety Pledge | Stop training if safety is not guaranteed | Pragmatic unilateral goals |
| Documentation | Internal assessments & defined thresholds | Public Frontier Safety Roadmaps |
| Risk Reporting | Ad hoc, internal focus | Systematic public Risk Reports (every 3–6 months) |
| Industry Strategy | Lead by example (Race to the Top) | Shift to national competitiveness |
The timing of RSP v3 is impossible to divorce from the escalating standoff between Anthropic and the U.S. military. Reports confirm that Defense Secretary Pete Hegseth recently met with Anthropic CEO Dario Amodei, delivering a stark ultimatum: lift restrictions on military use of Claude models or face severe consequences.
The Pentagon is reportedly demanding that Anthropic allow its AI to be used for "any lawful purpose," effectively stripping the company of its ability to veto specific military applications. Anthropic has historically maintained strict "red lines" prohibiting uses of its technology such as weapons development and domestic surveillance.
The Defense Department has threatened to invoke the Defense Production Act (DPA), a Korean War-era law that allows the President to compel private companies to prioritize national defense contracts. Officials have also floated the possibility of designating Anthropic a "supply chain risk," which would effectively blacklist the company from all federal contracts, potentially costing it hundreds of millions of dollars in revenue and shutting it out of the lucrative government sector.
Critics argue that the loosening of the RSP’s "pause" commitments creates a convenient policy loophole. By removing the strict requirement to halt deployment based on internal safety thresholds, Anthropic may be positioning itself to accommodate the Pentagon's demands without technically violating its own safety constitution.
The revision of the RSP highlights a growing tension in the AI industry: the "capability overhang." This term refers to the gap between an AI model's raw power and the safety mechanisms available to control it. Anthropic’s previous policy was designed to prevent this overhang from growing too large. By removing the hard brake, the company is implicitly accepting a higher level of risk to remain competitive against rivals like OpenAI and xAI, who have already secured extensive defense contracts.
Why this matters for the AI ecosystem:
- It removes the industry's most prominent voluntary "hard brake" on frontier development, leaving transparency reports as the primary check on the capability overhang.
- It signals that unilateral safety commitments struggle to survive competitive and governmental pressure, shifting the burden toward external regulation.
- It further normalizes defense partnerships as a condition of frontier-lab viability, following rivals such as OpenAI and xAI.
Anthropic’s RSP v3 reflects a maturing, if cynical, assessment of the AI landscape in 2026. The idealism of 2023, when a single company could hope to steer the industry toward safety through moral leadership, has collided with the hard realities of great-power competition and military necessity. While the introduction of Risk Reports and Frontier Safety Roadmaps adds a new layer of transparency, the removal of the binding "safety pledge" marks the end of an era. With the Pentagon looming large, Anthropic is no longer trying to slow the train; it is merely promising to blow the whistle more loudly as it speeds up.