Inside the Gray Market for LLM Access

Middlemen package extra tokens, hijack IDs to resell, distill models

Published Jun 05, 2026

An ecosystem of API proxy servers enables AI developers in China to access top U.S. models at deeply discounted prices.

What’s new: A network of vendors that operate in legal gray areas provides low-cost access to models that otherwise are restricted or unavailable in mainland China, according to a report by the think tank ChinaTalk. For instance, the system reportedly enables developers in China to buy Anthropic Claude tokens for as little as 10 percent of the typical market price.

How it works: Major AI models built in the U.S., including OpenAI ChatGPT, Anthropic Claude, Google Gemini, and Midjourney, are not officially available in mainland China. Instead, developers there can rely on an informal network that adapts to shifting legal, market, and security conditions. Transactions may involve illegal activities such as credit card theft or unauthorized circumvention of China’s Great Firewall to connect to servers in countries such as Singapore. Other parts of the network may violate providers’ terms of service, exploit people who provide biometric data, or misrepresent products for sale.

The network includes a wide variety of parties: account farms that acquire AI model accounts at scale, verification platforms that supply phone numbers to pass sign-up checks, token resellers that deal in unused quotas, identity brokers that create fake credentials, model routers, payment processors, and others. In the middle sit proxy servers that receive API calls from developers and relay them to API providers via accounts that appear to be legitimate but may not be.
To keep prices low, some providers use tactics that may be technically legal but take advantage of gray areas, such as aggregating Anthropic’s free API credits, reselling unused account quotas, exploiting educational or corporate discounts, or splitting subscription plans among multiple users. They may also use illicit sources, such as accounts created with stolen or fraudulent credit cards.
When users select a higher-tier model, their requests may be routed to a cheaper, inferior model. The German research lab CISPA Helmholtz Center for Information Security found that proxy access to “Gemini-2.5” achieved benchmark performance of 37 percent on MedQA (a dataset of multiple-choice medical questions), markedly lower than the 83.82 percent achieved via Google’s API.
Proxy servers harvest users’ requests and sell the logs. Users’ API calls make good training data for new models, and the outputs of proprietary models can be used to train other models to mimic their responses.
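
The model-substitution finding above suggests a simple sanity check a developer could run: score the same multiple-choice benchmark through the official API and through a proxy, then test whether the proxy's accuracy is significantly lower. The following is a minimal sketch, not the method CISPA used. The function names, the pooled two-proportion z-test, the threshold of 3.0, and the sample size of 500 questions are all illustrative assumptions; only the two accuracy figures come from the report.

```python
import math


def accuracy(predictions, answers):
    """Fraction of multiple-choice predictions that match the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def two_proportion_z(acc_a, n_a, acc_b, n_b):
    """z statistic for acc_a - acc_b using a pooled standard error."""
    pooled = (acc_a * n_a + acc_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # both accuracies are identical at 0 or 1
    return (acc_a - acc_b) / se


def looks_substituted(official_acc, proxy_acc, n_official, n_proxy,
                      z_threshold=3.0):
    """Flag a proxy whose accuracy is significantly below the official API's.

    z_threshold is a hypothetical cutoff chosen for illustration.
    """
    z = two_proportion_z(official_acc, n_official, proxy_acc, n_proxy)
    return z > z_threshold


# Accuracies reported in the article; 500 questions per run is an assumption.
flagged = looks_substituted(0.8382, 0.37, n_official=500, n_proxy=500)
print(flagged)
```

A gap as large as the one reported would be flagged under these assumptions, while a one-point difference on the same sample size would not. A check like this only indicates that the served model behaves differently from the advertised one. It does not identify which model is actually answering.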

Behind the news: This gray market has been implicated in allegations that Chinese developers of open source models routinely train them to mimic proprietary models built by U.S. companies. For instance, in February, Anthropic accused three Chinese AI labs (DeepSeek, Moonshot, and MiniMax) of systematically extracting Claude’s outputs to improve their own models in an effort Anthropic called “industrial-scale” distillation. Anthropic acknowledged that distillation is a well-established training method, but it said it had detected over 16 million exchanges from 24,000 fraudulent accounts. It argued that “illicitly distilled models lack necessary safeguards, creating significant national security risks.” Reactions to Anthropic’s accusation were mixed:

Some critics argued it was hypocritical for Anthropic to object to companies using its models’ outputs for training, when AI developers commonly train models on copyrighted material, presuming that this activity is fair use. Others framed the accusations as an attempt by Anthropic to maintain its competitive advantage by encouraging tighter U.S. regulation of Chinese AI firms.
In April, the White House responded with a memo that acknowledged industrial-scale distillation as an adversarial threat. It affirmed the Trump administration’s commitment to working with the private sector to build defenses against industrial-scale distillation and hold foreign actors accountable for such campaigns.

Why it matters: The ChinaTalk report is based largely on interviews and circumstantial evidence, and some of its claims have not been verified independently. But it calls into question the structure of the international AI market. Apparently, limits put in place to manage access to AI have created incentives for a parallel market that may undermine the economics and governance of AI systems. Developers who use proxy servers may not gain access to models they’ve paid for, and their prompts, code, and agent traces may be logged and used for purposes beyond their control. AI companies may not be paid fairly for services rendered, and they may have little visibility into who uses their technology. Models built by distilling the outputs of low-cost API calls may evade guardrails that were designed to keep the parent models from aiding criminal activity.

We’re thinking: We place a high value on openness. The benefits of AI should be available to all to the greatest extent possible. Knowledge distillation is a valuable technique that should be available to developers everywhere, and restrictions on models can fail to stop determined actors while harming legitimate developers and researchers. At the same time, using fraudulent and otherwise dishonest means to gain access to proprietary AI models is not acceptable. Businesses in China, or anywhere, that aim to offer access to closed U.S. models should come to terms with the companies that provide them in a legitimate way.

Subscribe to The Batch