The release of DeepSeek, particularly the open-source Reasoning-LLM R1, has caused turbulence in the stock markets. But is DeepSeek truly as cost-effective as claimed? How does its training approach differ, and what impact does this have on the AI model market?
Venturing into the unknown: With its open-source availability and purportedly significantly lower operating costs, DeepSeek's Reasoning-LLM R1 openly challenges established AI model providers, most notably ChatGPT operator OpenAI.
(Image: Screenshot / DeepSeek)
The announcement from Chinese AI startup DeepSeek caused an earthquake in the stock market: On January 20, 2025, the company announced the availability of a Large Language Model (LLM) that is said to directly compete with the performance of OpenAI's top model o1—and at a fraction of the price. Furthermore, the so-called Reasoning-LLM DeepSeek-R1 is supposed to be available under an MIT license as open source, thus being freely available for download and modification according to individual needs.
In fact, R1 is not the only LLM that DeepSeek provides under an open-source license. Already in December 2024, relatively unnoticed by the public, DeepSeek released DeepSeek-V3, a "Mixture-of-Experts" (MoE) LLM with a total of 671 billion parameters, of which 37 billion are activated per token. The real stir in January was triggered by the release of a free chatbot app comparable to ChatGPT, which quickly soared to the top of the charts in mobile app stores. The touted excellent performance of the DeepSeek models, which reportedly can compete with the best proprietary LLMs from OpenAI and Anthropic, led to a stock market crash on January 27: shares of NVIDIA, AMD, and other leading AI companies lost an estimated $600 billion in combined market value.
AI model providers from the USA, such as the Meta group with its (also openly licensable) Llama-v2 model or Microsoft as the main investor in ChatGPT developer OpenAI, were anything but calm after the news: they raised allegations of IP theft and responded with further multi-billion-dollar investments in their own models. The open-source community, in contrast, reacted with genuine enthusiasm. On the AI community platform Huggingface, more than 700 models based on DeepSeek-R1 or DeepSeek-V3 have appeared after just one month of availability. The Reasoning-LLM R1 with its 685 billion parameters has been downloaded directly from there more than 670,000 times (as of January 31, 2025), and the MoE model DeepSeek-V3 even more than 800,000 times.
What makes DeepSeek different?
DeepSeek itself claims that training the DeepSeek-V3 model cost just $5.6 million, a fraction of what is known from AI models by OpenAI or Meta. This is all the more astonishing considering that the training supposedly took place exclusively on NVIDIA's H800 GPUs. NVIDIA launched the H800 series in March 2023 as a less performant variant of its AI accelerator GPUs. The components were designed with specifications below the export restrictions in effect at the time, allowing them to be sold to China. The H800 series offers a boost frequency of 1755 MHz and a memory frequency of 1593 MHz. After the US Department of Commerce tightened export restrictions in October 2023, this GPU series was also banned from being sold to China. The export restrictions were explicitly tightened to make it more difficult for China to train advanced LLMs. Apparently, the startup DeepSeek, officially founded that same year, managed to stockpile a sufficient supply of the necessary GPUs or acquired accelerators through other means.
The claimed $5.6 million is, however, likely a significant understatement. Market researchers at SemiAnalysis have calculated that the acquisition of 50,000 Nvidia GPUs for training alone must have cost at least $1.6 billion. The ongoing operation of a dedicated data center for model development is estimated at approximately another $944 million.
Yet even aside from the actual costs, the development of the DeepSeek models is noteworthy. After all, according to what is known, DeepSeek seems to have achieved impressive results on established AI benchmarks using less powerful hardware. On Chatbot Arena, an AI evaluation service from the University of California, Berkeley, both R1 and V3 rank among the top ten available AI models. In the ranking, the DeepSeek models outperform, among others, Claude from Anthropic and Grok from xAI. DeepSeek-R1 even manages to surpass the latest build of OpenAI's o1.
But how was this allegedly more cost-efficient development achieved on previous-generation AI acceleration hardware? A so-called "DualPipe Parallelism Algorithm" was reportedly used. It was developed to circumvent the limitations of the Nvidia H800 and employs low-level programming to meticulously control how training tasks are scheduled and batched. The "Mixture-of-Experts" (MoE) architecture of the 671-billion-parameter V3 model is also said to be a targeted development to compensate for the hurdles posed by weaker training hardware. Instead of relying on a single monolithic neural network, this LLM uses a mix of multiple smaller networks, known as "experts". For each input token, only a subset of these experts is activated, so that roughly 37 of the 671 billion parameters are actually used at a time. Since each expert is smaller and more specialized, training the model requires less memory. Additionally, the finished model is leaner, reducing computational costs once it is deployed.
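To make the principle more tangible, here is a minimal, simplified sketch of top-k expert routing in PyTorch. It illustrates the general MoE idea only and is not DeepSeek's actual implementation; the layer sizes, number of experts, and routing details are illustrative assumptions.

```python
# Simplified Mixture-of-Experts layer with top-k routing (illustrative only).
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)  # router that scores experts per token
        self.top_k = top_k

    def forward(self, x):                          # x: (tokens, d_model)
        scores = self.gate(x).softmax(dim=-1)      # routing probabilities per token
        weights, idx = scores.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                # each token only visits its selected experts
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = TinyMoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

The key point is visible in the routing loop: each token only passes through the few experts its router selected, so most of the layer's parameters remain idle for that token.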
While OpenAI does not disclose the parameters of its state-of-the-art models, it is speculated that its in-house models comprise over a trillion parameters. Nevertheless, DeepSeek-V3 achieved results on established benchmarks that match or even surpass OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet. Naturally, this immediately sparked speculation about whether the LLM actually delivers comparable performance, or is merely optimized to excel at the benchmarks while performing weaker in practical applications. Nonetheless, the results also caused a stir within expert circles.
Even more impressive is the reported performance of DeepSeek's so-called reasoning model, DeepSeek-R1. This model was explicitly developed to compete with OpenAI's Chain-of-Thought flagship model, ChatGPT-o1—a fact that DeepSeek prominently highlights on its website. Since OpenAI keeps its LLMs proprietary, an exact comparison is challenging. Incidentally, R1 is not the first open reasoning model to appear in recent months. Nevertheless, it can certainly be said to be more powerful than earlier such models, such as QwQ from Alibaba.
As with DeepSeek-V3, the Chinese startup claims to have achieved its results with an unconventional approach. Most LLMs are trained using a method that involves supervised fine-tuning (SFT): human annotators review and label the model's responses to prompts, and these evaluations are fed back into training to improve the model's answers. This has so far proven to be the most effective method for high-performance models. However, reviewing and labeling responses by hand is time-consuming and costly, and as models become more complex, the associated SFT effort increases significantly.
To arrive at a correspondingly powerful LLM more quickly and cheaply, DeepSeek initially attempted to forgo SFT. Instead, they relied on Reinforcement Learning (RL) to train DeepSeek-R1-Zero. A rule-based reward system, described in the model's white paper, was supposed to help DeepSeek-R1-Zero learn to think. However, this approach led to problems, such as language mixing (the use of multiple languages in a single response), which made reading the answers difficult.
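To illustrate the idea, the following is a minimal sketch of what such a rule-based reward could look like, loosely inspired by the accuracy and format checks described in the R1 white paper. The tag names, weights, and matching logic are assumptions for illustration, not DeepSeek's actual code.

```python
import re

def rule_based_reward(response: str, reference_answer: str) -> float:
    """Illustrative rule-based reward: no learned reward model involved."""
    reward = 0.0

    # Format check: the model is expected to wrap its reasoning in tags and
    # give the final answer afterwards (tag names are an assumption here).
    if re.search(r"<think>.*?</think>", response, flags=re.DOTALL):
        reward += 0.5

    # Accuracy check: compare the text after the reasoning block with a known
    # reference answer, e.g. for math or coding tasks with verifiable results.
    final_part = re.split(r"</think>", response, maxsplit=1)[-1]
    if reference_answer.strip().lower() in final_part.strip().lower():
        reward += 1.0

    return reward

# Example usage
sample = "<think>3 + 4 = 7</think> The answer is 7."
print(rule_based_reward(sample, "7"))  # 1.5
```

Because such rules are cheap to evaluate automatically, no human labeling is needed during this phase of training, which is exactly where the claimed cost savings would come from.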
Therefore, according to DeepSeek's description, a hybrid approach was adopted: Training begins with a "cold start", initially using a small SFT dataset of only a few thousand examples. Once the model has matured, the training is completed with RL. Nevertheless, many AI experts remain skeptical of this approach: In reinforcement learning, minor deviations or errors that occur early in the training process can have far-reaching consequences. Since they are not corrected by manual fine-tuning, these deviations are "reinforced" with each further learning cycle. During inference with the finished LLM, it may then become apparent that the model has significant performance gaps in certain areas.
"The opportunity for Europe to catch up"
What do these developments mean for the AI market? After Meta released its LLM Llama-v2 under an open-source license in July 2023, the market for AI applications and the development of AI models for research purposes gained tremendous momentum. Cloud providers like AWS, Google Cloud, and Microsoft Azure integrated Llama-v2 into their platforms and offer services built on top of the model. This allows for better scalability and flexibility for companies working on these platforms. However, experts do not consider Llama-v2 suitable for handling complex queries. The "strawberry" test in the summer of 2024 became a source of ridicule online: the most widely used AI models at the time, including ChatGPT 4.0 and Llama-v2, could not recognize that the word "strawberry" contains three r's rather than two; their pattern recognition failed on this type of query.
Currently, the proprietary ChatGPT-o1 model is considered the benchmark on the AI market. However, even if DeepSeek-R1 doesn't reach the quality of OpenAI, a better AI model than Llama-v2, which is also freely usable and adaptable for commercial and research purposes under an MIT license, is likely to cause a shake-up in the AI applications market. Potentially even a larger impact than the appearance of Llama-v2 itself at that time.
This also presents opportunities for European companies: "If you developed your application with OpenAI, you can easily migrate to the others ... the switch took only minutes," said Hemanth Mandapati, head of the German startup Novo AI, to the Reuters news agency on the sidelines of the GoWest conference for venture capitalists in Gothenburg, Sweden. The emergence of DeepSeek offers companies access to highly advanced AI technology at a fraction of the currently established costs: OpenAI charges $2.50 per 1 million input tokens, the data units processed by the AI model. In contrast, DeepSeek currently charges $0.014 for the same number of tokens.
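The quick migration Mandapati describes is plausible because DeepSeek, like several other providers, exposes an OpenAI-compatible chat API, so switching often comes down to changing the base URL and model name. The following sketch assumes such a compatible endpoint; the URL and model name correspond to DeepSeek's public documentation but should be treated as assumptions here.

```python
# Minimal sketch: pointing an existing OpenAI-based application at an
# OpenAI-compatible endpoint (assumed here to be DeepSeek's API).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder
    base_url="https://api.deepseek.com",   # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                 # assumed model name for DeepSeek-V3
    messages=[{"role": "user", "content": "Summarize the Mixture-of-Experts idea in one sentence."}],
)
print(response.choices[0].message.content)
```

Apart from these two parameters and the API key, the rest of the application code can typically stay unchanged, which is why such a switch can indeed take minutes rather than weeks.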
Furthermore, European companies already working on their own models could optimize these models with minimal expenses and offer them on the market at a fraction of the previous prices. "There was an offer from DeepSeek that was five times lower than the current prices," said Mandapati. "I save a lot of money, and users see no difference."
Other European entrepreneurs echoed this sentiment to Reuters. "It's a significant step towards democratizing AI and leveling the playing field with Big Tech," said Seena Rejal, Chief Commercial Officer of the British company NetMind.AI. His company has also decided to adopt DeepSeek models early on. However, other companies, especially larger corporations like Nokia or SAP, remain reserved for the time being. "Cost is just one factor," said Alexandru Voica, Head of Corporate at the British company Synthesia. "Other factors include: Do you have all the security certifications, the frameworks, the software ecosystem that allows companies to build and integrate with your platform?" Especially when it comes to data security and IP protection, some companies are hesitant to rely on a Chinese provider in this area.
The price competition in the industry has undoubtedly already begun: On Friday, January 31, 2025, Microsoft announced that Copilot users can now use the ChatGPT-o1 model for free. Until now, the price for this service was $20 per month. It is expected that other providers will soon follow suit.
Meanwhile, in the hobbyist scene, the rush for DeepSeek models is enormous. This is also because DeepSeek offers a range of smaller, streamlined variants of DeepSeek-R1 with a reduced parameter set, which can be used on home computers – and even SBCs: A video showing the use of a 14 billion parameter DeepSeek-R1 model on a Raspberry Pi 5 garnered over 1.7 million views within 48 hours.
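For experiments on ordinary hardware, the distilled variants can be loaded like any other model from the Hugging Face hub. The following is a minimal sketch assuming one of the smallest distilled checkpoints and a plain Transformers setup; the model ID is an assumption, and on genuinely constrained devices such as a Raspberry Pi, a quantized build run via llama.cpp or similar tooling is the more realistic route.

```python
# Minimal sketch: running a small distilled R1 variant locally with Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "How many r's are in the word 'strawberry'?"
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```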
AI community works on "genuine" open-source DeepSeek clones
However, the fact that DeepSeek-R1 and DeepSeek-V3 are open source for customization and application does not mean the company reveals its hand in the development of its LLMs. Although DeepSeek is "open," some details remain concealed: DeepSeek discloses neither the datasets nor the training code used to train its models, even though disclosing such information is widely regarded as good practice in open-source communities. The withholding of the base datasets has led both Meta and OpenAI to suspect that the Chinese startup might have used existing AI models and thereby saved a considerable amount of research effort and money during the base training of its LLMs.
To be fair, most "open" LLMs only provide the model weights necessary for running or fine-tuning the model. The full training dataset and the code used for training remain hidden. Meta has also been repeatedly criticized on social media for this practice, and the "true" status of an open-source model has been questioned because of this—despite its free usability.
The models from DeepSeek are similarly opaque. However, the open-source AI community is already working on lifting the veil. On HuggingFace, the largest online repository for open-source AI models, the Open-R1 project was announced on January 28: an attempt to develop a fully "true" open-source version of DeepSeek-R1. "The release of DeepSeek-R1 is an incredible boon for the community, but not everything has been disclosed—although the model weights are open, the datasets and the code for training the model are not," writes the Huggingface team in a related blog post. "The goal of Open-R1 is to create these last missing pieces so that the entire research and industrial community can build similar or better models with these recipes and datasets." Since the work is done entirely in the open, everyone in the AI community can contribute, and everyone can benefit from a powerful, entirely open project. Moreover, the project could publicly confirm or refute whether the efficiency DeepSeek claims for its reinforcement learning approach holds up. To achieve this, the team plans to proceed in three steps (the first of which is sketched in code after the list):
Replicating the R1-Distill models by distilling a high-quality reasoning dataset from DeepSeek-R1.
Replicating the pure RL pipeline used by DeepSeek to create R1-Zero. This involves assembling new, extensive datasets for mathematics, logical reasoning, and code.
Demonstrating that it is possible to transition from the base model to efficient reinforcement learning through a multi-stage training from an SFT "cold start".
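To make the first step more concrete, here is a minimal sketch of what distilling a reasoning dataset from a teacher model could look like: prompts are sent to the teacher, and its full answers (including the reasoning) are stored as supervised training examples for a smaller student. The endpoint, model name, and prompts are illustrative assumptions, not the Open-R1 project's actual pipeline.

```python
# Minimal sketch: collecting reasoning traces from a teacher model (assumed to be
# DeepSeek-R1 behind an OpenAI-compatible API) as an SFT dataset for a student model.
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

prompts = [
    "Prove that the sum of two even numbers is even.",
    "Write a Python function that checks whether a string is a palindrome.",
]

with open("distilled_reasoning.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        response = client.chat.completions.create(
            model="deepseek-reasoner",  # assumed name of the R1 endpoint
            messages=[{"role": "user", "content": prompt}],
        )
        answer = response.choices[0].message.content
        # Each line becomes one supervised training example for the student model.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```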
Even though questions remain, there is no doubt that DeepSeek has made waves simply by entering the market. Some indications suggest that the Chinese developers are primarily interested in capturing significant market shares as quickly as possible with their low prices. In any case, DeepSeek-V3 and DeepSeek-R1 are already a disruption. Whether this heralds a technological or "just" a price revolution will be revealed in the coming months. (sg)