Staying true to its branding as an enterprise-focused, security-first AI vendor, Anthropic has accused three Chinese vendors — DeepSeek, MiniMax and Moonshot AI — of extracting data from its Claude models to improve their own. Anthropic claims the Chinese vendors' actions pose a national security risk and could lead to dangerous capabilities in new models, because safety guardrails are removed in the process.
Anthropic on Feb. 23 alleged that the Chinese vendors generated more than 16 million exchanges with Claude across 24,000 fraudulent accounts, using a technique called distillation, in which a smaller "student" model is trained to mimic the outputs and behavior of a larger "teacher" model.
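To make the concept concrete, here is a minimal, hypothetical sketch of the core of knowledge distillation: the student is trained to minimize the divergence between its own output distribution and the teacher's "soft labels." This is a generic illustration of the technique, not a description of how any of the vendors named here actually operate; the function names and temperature value are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature "softens" the distribution, exposing more of the
    # teacher's relative preferences between answers.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Training the student to minimize this loss pushes its predictions
    toward the teacher's, which is how a student model comes to mimic
    a teacher model's behavior.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft labels
    q = softmax(student_logits, temperature)  # student's current guess
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2
```

In practice this loss would be computed over millions of prompt-response pairs harvested from the teacher via its API, which is why the scale of API access matters in Anthropic's allegations.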
According to Anthropic, DeepSeek targeted reasoning capabilities across different tasks, using Claude to generate chain-of-thought data at scale and to produce censorship-safe alternatives to politically sensitive questions. Moonshot AI, meanwhile, targeted agentic reasoning, tool use and agent development for computer use. MiniMax pursued agentic coding, tool use and orchestration through its operation, Anthropic said.
Anthropic’s accusation is the latest episode in the geopolitical tension and competition between Chinese and U.S. technology vendors. In January 2025, OpenAI first raised concerns that DeepSeek had used its models to train the DeepSeek-R1 series. The ChatGPT maker then sent a memo to the U.S. House Select Committee on China on Feb. 12, accusing DeepSeek of using distillation to bypass its security safeguards.
The Dangers of Distillation
While distillation is not new, and many companies test and probe competitors’ models, the scale of the alleged distillation of Anthropic’s models by Chinese vendors is a concern for enterprises, given the security implications. It is also a political flashpoint, because Chinese vendors are allegedly building on work done by American vendors to leapfrog to the next level.
“What’s alleged here is industrial-scale extraction — millions of API calls across thousands of coordinated accounts — which looks less like evaluation and more like systematic replication,” said Kashyap Kompella, CEO and founder of RPA2AI Research.
He added that Anthropic’s claim carries some irony, given that American AI vendors have themselves been accused of sourcing training data unethically by using copyrighted material without payment or permission. Still, the level of extraction Anthropic alleges is problematic.
One primary reason is that if a competitor can extract data at such a scale, it undercuts the rationale for investing in R&D, Kompella said.
“Frontier AI models cost billions in compute, talent and infrastructure,” Kompella said. “If a competitor can shortcut that investment by systematically extracting capabilities, the economics of innovation and VC investments can collapse.”
Moreover, as Anthropic noted, safety guardrails are usually stripped out during distillation. Distilled models therefore typically carry greater risk unless additional safety controls are added afterward to prevent malicious AI use, Kompella continued.
He added that the takeaway here is that vendors must include model protection as a key feature.
Extraction also carries the risk that enterprise data will be misused or handed over to the Chinese government, said Lian Jye Su, an analyst at Omdia, a division of Informa TechTarget.
“If these kinds of activities are left unmonitored, it creates a backdoor where the training data that attracts users may involve sensitive company data that can be accessed by the Chinese government and Chinese authorities or be resold to other parties that have malicious intent,” Su said.
Anthropic’s next step is to place greater emphasis on mechanisms that better protect its models, because distillation attempts are likely to persist, Su said.
“There is always going to be some sort of malicious activities, maybe not by the model vendor themselves, maybe by the developer community in China,” he said.
On the other hand, Anthropic’s claims show how the vendor continues to align itself with U.S. domestic interests. In recent months, the AI vendor has supported U.S. chip export restrictions. On its blog, Anthropic said the distillation attacks show that “restricted chip access limits both direct model training and the scale of illicit distillation.”
The Chinese AI Market
There is nothing wrong with Anthropic focusing on the U.S. and what benefits its home market, since Chinese vendors are similarly focused on the Chinese AI market. But it can’t be said that Chinese vendors are successful only because they copied American vendors, Su said.
“Yes, maybe some of the stuff is borrowed from Anthropic, but then I think they also have their own innovations that they bring to the table as well,” he said, adding that the vendors have each been successful in building their own developer communities.
For example, Moonshot’s Kimi K2 is a one trillion-parameter model. To train it, Moonshot introduced MuonClip, an optimizer noted for its token efficiency.
These types of innovations are independent of American vendors, Su said.
“They are not just pure copycats,” he said. “You can’t do this sort of stuff without strong AI. You do need a very strong team of AI engineers.”
Moreover, AI competition remains fierce in China, Su added. These vendors compete not only with U.S. vendors such as OpenAI and Anthropic but also with Alibaba, Baidu and other Chinese vendors. It is also likely that these vendors are not the only ones distilling Anthropic’s and OpenAI’s models; they may have been singled out because of their ability to innovate on top of what they learn, Su said.
However, the ability to innovate becomes less impressive when a vendor is labeled a copycat and a plagiarist.
“It creates reputational risk,” Kompella said. He added that while Chinese vendors have technical advances, Anthropic’s allegations complicate the brand. “The reputational challenge is that even strong, independent innovation becomes harder to separate from allegations of appropriation.”
For enterprises watching this play out, the main takeaway is to scrutinize the safety guardrails included in the models they use.
“Training data lineage is rapidly becoming a board-level issue,” Kompella said. “If there is uncertainty about how a model was trained — whether it relied on unauthorized access, breached terms or sidestepped controls — that becomes a procurement red flag.”