
In a significant escalation of the ongoing artificial intelligence arms race, Anthropic has publicly accused three prominent Chinese AI laboratories—DeepSeek, Moonshot AI, and MiniMax—of conducting a systematic, industrial-scale campaign to extract capabilities from its Claude models. The allegations, detailed in a new security report released Monday, outline how these organizations allegedly utilized thousands of fraudulent accounts to "distill" Claude’s advanced reasoning and coding abilities into their own proprietary models.
This revelation comes at a critical juncture for the global AI industry, coinciding with intensified debates in Washington regarding the efficacy of semiconductor export controls. As U.S. policymakers struggle to limit China's access to cutting-edge hardware, Anthropic’s findings suggest that intellectual property theft via model distillation has become a primary avenue for competitors to bypass hardware constraints and close the capability gap.
According to Anthropic’s investigation, the coordinated effort involved the generation of over 16 million exchanges with Claude models through a sophisticated network of approximately 24,000 fraudulent accounts. These accounts, allegedly managed through commercial proxy services to mask their origins, were used to query Claude systematically, recording its outputs to train smaller, domestic models—a process known in machine learning as "distillation."
While distillation is a legitimate technique used by developers to compress their own large models into more efficient versions, extracting data from a competitor's model without authorization violates terms of service and constitutes intellectual property theft. Anthropic's data indicates that the operation was not a casual experiment but a highly organized extraction of high-value cognitive behaviors.
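To make the mechanics concrete, the following is a minimal Python sketch of how distillation data collection works in general terms: a script sends prompts to a hosted "teacher" model and records the prompt/response pairs as fine-tuning data for a "student." The `query_teacher` stub, the pacing, and the file format are illustrative assumptions, not Anthropic's API or any lab's actual tooling.

```python
import json
import time

def query_teacher(prompt: str) -> str:
    # Stand-in for an HTTP call to a hosted "teacher" model's API;
    # stubbed out because the point here is the data flow, not any
    # particular vendor's interface.
    return f"<teacher response to: {prompt!r}>"

def collect_distillation_pairs(prompts, out_path="teacher_pairs.jsonl"):
    """Record (prompt, response) pairs in the JSONL format commonly
    used for supervised fine-tuning datasets."""
    with open(out_path, "a") as f:
        for prompt in prompts:
            response = query_teacher(prompt)
            f.write(json.dumps({"prompt": prompt, "response": response}) + "\n")
            # Pacing between requests; per the report, real operations
            # spread this load across thousands of accounts instead.
            time.sleep(0.1)

collect_distillation_pairs([
    "Write a Python function that merges two sorted lists.",
    "Plan the steps to refactor a legacy API client.",
])
```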
The scale of the attack varied significantly across the accused institutions, with MiniMax emerging as by far the most prolific actor. The following breakdown illustrates the scope of the alleged activities:
Table: Breakdown of Alleged Distillation Activities by Lab
| Lab Name | Estimated Exchanges | Primary Target Capabilities |
|---|---|---|
| MiniMax | ~13 million | Agentic coding, tool orchestration, and complex reasoning sequences |
| Moonshot AI | ~3.4 million | Agentic reasoning, data analysis, and computer vision tasks |
| DeepSeek | >150,000 | Foundational logic, alignment protocols, and policy-sensitive queries |
The methodology described by Anthropic reveals a sophisticated understanding of Large Language Model (LLM) training pipelines. The attackers did not merely ask random questions; they targeted specific "teacher" behaviors that are difficult and expensive to replicate from scratch.
MiniMax, identified as the largest perpetrator, reportedly redirected nearly half of its own traffic to Claude within 24 hours of a new model release, effectively using Anthropic’s infrastructure to jumpstart its own system's capabilities. By feeding user prompts into Claude and using the high-quality responses to train their own models, these labs could theoretically achieve near-parity with state-of-the-art U.S. models while expending a fraction of the compute resources.
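To illustrate why the student side is so much cheaper than training from scratch, here is a hedged sketch of the fine-tuning step: ordinary supervised training on recorded teacher pairs, using the open-source Hugging Face transformers library, a small placeholder base model (gpt2), and the hypothetical `teacher_pairs.jsonl` file from the earlier sketch. It does not represent any accused lab's actual pipeline.

```python
import json
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

class PairDataset(Dataset):
    """Turns recorded (prompt, response) pairs into causal-LM training examples."""
    def __init__(self, path, tokenizer, max_len=512):
        self.examples = [json.loads(line) for line in open(path)]
        self.tok = tokenizer
        self.max_len = max_len

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        ex = self.examples[i]
        text = ex["prompt"] + "\n" + ex["response"] + self.tok.eos_token
        enc = self.tok(text, truncation=True, max_length=self.max_len,
                       padding="max_length", return_tensors="pt")
        ids = enc["input_ids"].squeeze(0)
        mask = enc["attention_mask"].squeeze(0)
        labels = ids.clone()
        labels[mask == 0] = -100  # ignore padding positions in the loss
        return {"input_ids": ids, "attention_mask": mask, "labels": labels}

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # small placeholder base model
tokenizer.pad_token = tokenizer.eos_token           # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="student-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=PairDataset("teacher_pairs.jsonl", tokenizer),
)
trainer.train()  # the student learns to imitate the teacher's recorded outputs
```

Because this step fine-tunes on a curated set of high-quality outputs rather than learning everything from raw web data, it demands a small fraction of the compute that went into building the teacher.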
Key tactics identified in the report include:

- Operating a network of roughly 24,000 fraudulent accounts to evade per-account rate limits and bans.
- Routing queries through commercial proxy services to disguise their true origin.
- Redirecting live user traffic to Claude within hours of a new model release to capture state-of-the-art outputs at scale.
- Deliberately targeting expensive-to-replicate behaviors such as agentic coding, tool orchestration, and multi-step reasoning rather than asking random questions.
Beyond the commercial implications of intellectual property theft, Anthropic highlighted a grave safety concern: the removal of safety guardrails. U.S. frontier models like Claude are subjected to rigorous "Constitutional AI" training to prevent them from assisting in the creation of bioweapons, cyberattacks, or disinformation campaigns.
When a model is distilled illicitly, the "student" model often learns the capabilities of the "teacher" without inheriting its safety inhibitions. Anthropic warns that these "unshackled" clones pose a unique proliferation risk. If a distilled model retains Claude's coding proficiency but lacks its refusal mechanisms for malware generation, it becomes a potent weapon for bad actors.
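A toy example makes the mechanism clear. In a collection pipeline like the hypothetical one sketched earlier, a simple filter that discards the teacher's refusals yields a training set of pure capability demonstrations, so the student never sees refusal behavior to imitate. The marker phrases below are invented for illustration.

```python
# Hypothetical refusal phrases a collection script might screen for.
REFUSAL_MARKERS = ("i can't help with", "i cannot assist", "i won't provide")

def keep_pair(pair: dict) -> bool:
    """Drop pairs where the teacher refused; keep pure capability data."""
    response = pair["response"].lower()
    return not any(marker in response for marker in REFUSAL_MARKERS)

def filter_dataset(pairs):
    # The surviving dataset teaches the teacher's skills but none of its
    # refusal behavior -- the "unshackled" clone effect the report warns about.
    return [p for p in pairs if keep_pair(p)]
```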
"Illicitly distilled models lack necessary safeguards, creating significant national security risks," Anthropic stated in its research paper titled Detecting and Preventing Distillation Attacks. The company argues that allowing foreign entities to clone American AI capabilities undermines the very safety protocols the U.S. government has been urging the industry to adopt.
Coinciding with the accusations, Anthropic has released details of new defense mechanisms designed to identify and block distillation attempts in real time. The core of this defense is "behavioral fingerprinting," a technique that analyzes the statistical patterns of API usage.
Unlike legitimate users, who exhibit organic, varied interaction patterns, distillation scripts often leave subtle statistical signatures: unusually regular request timing, heavily templated prompt structures, and coordinated bursts of activity across clusters of accounts.
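As a rough illustration of the idea (not Anthropic's actual detectors), the sketch below scores accounts on two hypothetical signals: near-constant request intervals and heavily templated prompts. All field names and thresholds are invented for the example.

```python
import statistics
from collections import defaultdict

def fingerprint(logs, min_requests=50):
    """Flag accounts whose API usage looks scripted rather than organic.

    `logs` is an iterable of dicts with hypothetical fields:
    {"account": str, "ts": float_seconds, "prompt": str}.
    """
    by_account = defaultdict(list)
    for entry in logs:
        by_account[entry["account"]].append(entry)

    flagged = []
    for account, entries in by_account.items():
        if len(entries) < min_requests:
            continue  # too little data to fingerprint reliably
        entries.sort(key=lambda e: e["ts"])

        # Signal 1: timing regularity. Scripts tend to fire at near-constant
        # intervals, so the coefficient of variation of the gaps is low.
        gaps = [b["ts"] - a["ts"] for a, b in zip(entries, entries[1:])]
        mean_gap = statistics.mean(gaps) or 1e-9
        timing_cv = statistics.stdev(gaps) / mean_gap

        # Signal 2: prompt templating. Templated prompts reuse most of the
        # account's overall vocabulary; organic prompts are more varied.
        vocabs = [set(e["prompt"].lower().split()) for e in entries]
        union = set().union(*vocabs)
        if not union:
            continue
        overlap = statistics.mean(len(v) / len(union) for v in vocabs)

        if timing_cv < 0.2 or overlap > 0.6:  # hypothetical thresholds
            flagged.append(account)
    return flagged
```

Real systems would combine far more signals, but the principle is the same: scripted extraction is statistically distinguishable from organic use.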
Anthropic has announced it is sharing these technical indicators with other major U.S. AI labs (such as OpenAI and Google DeepMind), cloud providers, and government authorities to establish an industry-wide defense grid against model mining.
This incident throws a wrench into the complex machinery of U.S.-China tech relations. The timing is particularly sensitive, as the U.S. Department of Commerce is currently reviewing the effectiveness of export controls that ban the sale of advanced GPUs, like NVIDIA’s H100 and the newer Blackwell series, to Chinese firms.
Critics of the current export bans argue that they are insufficient if Chinese labs can simply sidestep hardware deficits by copying the intelligence of U.S. models. If a lab can train a competitive model with 10% of the compute by distilling Claude, the "compute barrier" intended to slow China's AI progress becomes significantly more porous.
Implications for Policy:

- Hardware export controls alone cannot prevent capability transfer if frontier model outputs remain freely accessible through commercial APIs.
- API-level safeguards and account verification may need to complement chip restrictions, treating model access itself as a controlled resource.
- The indicator-sharing arrangement Anthropic describes with rival labs, cloud providers, and government authorities suggests distillation defense is becoming a collective, industry-wide undertaking.
The accusations leveled by Anthropic mark a transition from theoretical risks to documented conflict in the AI sector. As models become more valuable, they are no longer just products but strategic national assets. The "Distillation Heist" serves as a stark reminder that in the digital age, capability can be stolen just as easily as it can be built. For the industry, the focus must now shift from simply building smarter models to building harder-to-steal ones, ensuring that the fruits of American innovation do not inadvertently fuel the very competitors they were meant to outpace.