MacroScope

One of the main causes of concentration in large language model training is access to advanced GPUs. Recently, however, Chinese companies, most visibly DeepSeek, have drawn attention with claims that they can train capable models at lower cost. So how real is China’s low-cost AI claim?

First, one misreading needs to be cleared up. The claim that DeepSeek trained its V3 model for “$5.6 million” refers only to GPU usage costs during the final training run. R&D expenses, failed experiments, data preparation, personnel, data center investments, energy, network infrastructure, and previous architectural iterations are not included in that figure. What we see here, therefore, is not the full lifecycle cost, but a narrowly defined cost item.
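For scale: DeepSeek’s V3 technical report derives that number from a simple GPU-rental calculation, roughly 2.788 million H800 GPU-hours at an assumed rental price of $2 per GPU-hour, i.e. 2,788,000 × $2 ≈ $5.58 million. Everything that does not run on that rental meter sits outside the figure.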

But this is not merely an accounting trick. Chinese companies have achieved real efficiency gains. Techniques such as Mixture of Experts (MoE) architectures, 8-bit floating point (FP8) training, Multi-head Latent Attention (MLA), and communication optimization reduce the compute and hardware burden of model training. These solutions, developed under hardware constraints, make it possible to do the same work with fewer active parameters, lower memory pressure, and lower training costs. That is why, in China’s case, two things are true at the same time: the cost reporting is incomplete, but the engineering advances are real.
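To make the MoE point concrete, here is a minimal sketch in PyTorch (the dimensions and expert count are illustrative, not DeepSeek’s actual configuration): a router scores the experts for each token and only the top-k expert networks run, so most of the layer’s parameters stay inactive on any given token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal Mixture-of-Experts layer: a router sends each token to
    only k of n experts, so most parameters stay inactive per token."""
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep only top-k experts
        weights = F.softmax(weights, dim=-1)        # renormalize their gates
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(16, 512)   # 16 tokens
y = TopKMoE()(x)           # only 2 of 8 expert MLPs run per token
print(y.shape)             # torch.Size([16, 512])
```

With 8 experts and k = 2, only about a quarter of the expert parameters are active for any given token, which is the mechanism behind “fewer active parameters” above.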

U.S. export controls also seem to have played a role in these developments. Teams that had to work with chips such as the H800, which has more limited interconnect capacity than the NVIDIA H100, appear to have shifted away from brute-force scaling and toward optimization. That pressure has produced technical innovation in some areas.

China also has a large hardware stockpile, a vast domestic market, and strong state support. The low-cost appearance of Chinese models is therefore not driven by technical efficiency alone. Engineering salaries are lower than in the United States. Data center construction is cheaper. Energy and compute subsidies are more widespread. Large platform companies can also cross-subsidize AI services. For these reasons, low API prices do not always reflect a durable cost advantage; they may instead signal a deliberate market-capture strategy.

In summary, one path to loosening the GenAI oligopoly may combine architectural innovations that reduce model training costs, lower labor costs, state support, and aggressive pricing. The sharing of trained model files (open weights) also allows new teams to move forward more cheaply by adapting existing models rather than building from scratch.
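As a hedged sketch of that last point, the snippet below uses the Hugging Face transformers and peft libraries to attach LoRA adapters to an open-weights checkpoint; the repository name and hyperparameters are placeholders, not any specific team’s recipe. Only the small adapter matrices are trained, which is why adapting an existing model costs a fraction of pretraining one.

```python
# Minimal LoRA fine-tuning setup with Hugging Face transformers + peft.
# "some-org/open-weights-7b" is a hypothetical repo id; swap in a real one.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "some-org/open-weights-7b"  # hypothetical open-weights checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the base weights and trains small low-rank adapters
# injected into the attention projections.
config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # adapter scaling
    target_modules=["q_proj", "v_proj"],  # common attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all params
```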

China is trying to push costs down through technical, commercial, and political means. Although this approach has not yet dismantled the GenAI oligopoly, it is changing the basis of competition in a market long organized around access to chips.
