Xiong Jie and Sérgio Amadeu: Why is DeepSeek open source? It may be closely tied to leadership in artificial intelligence
THE DISPUTE FOR LEADERSHIP IN ARTIFICIAL INTELLIGENCE, CHINA, AND OPEN SOURCE
Why is technological leadership important, and how should technological leadership in AI be defined? Artificial Intelligence (AI) is a transversal technology, and its advances have profound impacts on the economy, society, and national security. Technological leadership, first and foremost, provides a series of competitive advantages, as inventions and innovations grant their developers gains and benefits that others do not possess. Secondly, technological leadership is a critical geopolitical factor, as it allows a country to influence global standards, norms, and regulations. Thirdly, technological leadership can drive an innovation ecosystem that consolidates long-term development. Fourthly, leadership can enhance security in an international context of threats, including military ones. Fifthly, leadership enables technology to be directed toward social, environmental, and political objectives.
From a technopolitical perspective, where technoscience is not neutral and has implications for power relations and social organization (Winner, 2020), leadership in AI is not merely about developing the most advanced technology but also about creating a sociotechnical environment that realizes broader social values and objectives, ensuring that innovation follows certain purposes. The trajectory of AI development may prioritize increasing productivity for the economic system or may aim to find socially just and environmentally sustainable solutions. It may seek to concentrate power and reinforce international asymmetries or contribute to the distribution of knowledge and equitable development. It may stifle the inventiveness of populations and cultures or ensure technodiversity. It may be tied to the concentration or distribution of power.
Currently, AI leadership resides in the United States, under the direction of the so-called Big Techs. These companies control resources that are indispensable for the development of existing AI, particularly AI dominated by the deep learning approach, which is based on the use of statistics and probability to classify data and extract patterns from large volumes of it. To perform these operations, AI developers rely on enormous computational power. Training an advanced AI model like OpenAI's GPT costs millions of dollars and requires many hours of processing on specialized hardware, that is, chips designed specifically for these tasks. Chips optimized for running trained models are called "AI inference chips" or "inference accelerators"; they achieve better results in less time. Google's Tensor Processing Units (TPUs), for example, are optimized for both inference and training. Neural Processing Units (NPUs), or neural network accelerators, common in mobile devices and edge computing, are also used, and Graphics Processing Units (GPUs) are employed for both training and inference. Currently, these chips are essential for applications such as image recognition, natural language processing, and other real-time AI tasks.
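To make concrete why such accelerators matter, here is a minimal sketch using the open-source PyTorch library (not code from any of the companies mentioned): deep learning workloads boil down to very large matrix operations, which the code dispatches to a GPU when one is available. The matrix sizes and iteration count are arbitrary, illustrative values.

```python
# Illustrative only: deep learning workloads reduce to large matrix operations,
# which is why specialized accelerators (GPUs, TPUs, NPUs) matter so much.
import time

import torch

# Pick an accelerator if one is present; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Two arbitrary large matrices standing in for model weights and activations.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

start = time.time()
for _ in range(10):
    c = a @ b  # the kind of operation repeated billions of times during training
if device == "cuda":
    torch.cuda.synchronize()  # wait for the GPU to finish before reading the clock
print(f"10 matrix multiplications on {device}: {time.time() - start:.3f} s")
```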
The U.S. government has, for some time, adopted a policy of restricting access to cutting-edge chips, primarily aimed at delaying AI development in China and other countries considered adversaries. The goal is to maintain U.S. leadership in AI. With Donald Trump's inauguration in January 2025, the policy of technological blockade was intensified. Additionally, the U.S. president announced a $500 billion investment in the Stargate project. Trump's plan is to develop physical and virtual AI infrastructures in the United States, in collaboration with companies like Oracle, OpenAI, and SoftBank, to "fuel the next generation of AI". Companies such as Nvidia, Arm, and Microsoft are partners in the project, which is beginning to be implemented in Texas and will, over the next four years, include "colossal data centers" across various regions of the United States.
American tech elites, represented by figures like Elon Musk, believe that artificial intelligence is approaching the "singularity"—the emergence of Artificial General Intelligence (AGI). They argue that AGI will completely surpass and replace human labor in all intellectual domains, and that if the United States is the first to achieve AGI, its technological hegemony will become unassailable. However, neither ChatGPT nor DeepSeek has shown any signs of approaching AGI. They are useful tools for processing natural language and demonstrate limited reasoning abilities within specific domains, but there is no evidence that they—or any known AI research—are nearing AGI.
THE OPEN SOURCE TURNAROUND
In May 2024, a small Chinese company called DeepSeek launched its Large Language Model (LLM), DeepSeek V2, inspired by Llama, a model originally released under a restricted research license prohibiting commercial use. What stood out in this open-source model was its unprecedented cost-effectiveness: DeepSeek had reduced the cost of inference to just 1 yuan per million tokens, roughly one-seventh of Llama 3 70B and far less than GPT-4. Tokens are the basic units of text that language models use to process and understand human language. Depending on the context and language, tokens can be thought of as "chunks" of words, syllables, or even individual characters. AI models convert text into tokens, which are represented numerically; these numbers are then processed by the model to generate responses or perform tasks. The number of tokens in a text therefore directly affects cost and processing time: the more tokens, the more complex and time-consuming the inference.
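As a rough illustration of how token counts translate into inference cost, the sketch below splits text on whitespace instead of using a real tokenizer (production models use learned byte-pair-encoding tokenizers, so actual counts differ) and takes the 1-yuan-per-million-tokens figure cited above purely as an example price.

```python
# Rough illustration of tokens and per-token pricing. Real models use learned
# BPE tokenizers, so actual token counts will differ from this naive split.

def rough_token_count(text: str) -> int:
    """Very crude approximation: count whitespace-separated chunks."""
    return len(text.split())

def inference_cost(num_tokens: int, yuan_per_million_tokens: float) -> float:
    """Cost in yuan for processing num_tokens at a given per-million-token rate."""
    return num_tokens / 1_000_000 * yuan_per_million_tokens

prompt = "Tokens are the basic units a language model processes. " * 1000
tokens = rough_token_count(prompt)

# Example price taken from the article (DeepSeek V2: about 1 yuan per million tokens).
print(f"~{tokens} tokens")
print(f"At ~1 yuan/M tokens: {inference_cost(tokens, 1.0):.4f} yuan")
```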
DeepSeek, like all Chinese companies, was and is subject to the U.S. government's blockade on cutting-edge chips. This led DeepSeek's leader and his team to focus more on research and optimization. Liang Wenfeng, in an interview in July 2024, stated, "Our starting point is not to seize the opportunity to make a fortune but to advance to the forefront of technology to promote the development of the entire ecosystem." The Chinese company's attempt to lead AI development is evident. To achieve this, DeepSeek did not limit itself to organizing data and running on available clouds. The team worked hard to find solutions in the face of the scarcity of cutting-edge chips. This required altering architectures and experimenting with new procedures, as well as extensive applied mathematics.
The young leader of DeepSeek, Liang Wenfeng, stated, "What we lack in terms of innovation is definitely not capital but confidence and knowledge on how to organize a high density of talent to achieve effective innovation." He continued, "Innovation is not entirely business-driven; it also requires curiosity and creativity. We are stuck in the inertia of the past, but this is also temporary." Liang Wenfeng's idea is to copy less and study more: to bet on open models not merely to use them, but to improve them and to find paths that require fewer computational resources.
Open source is fundamental to DeepSeek's strategy, though it may not be for other Chinese companies such as Tencent, Baidu, and Alibaba. Open source, however, allows knowledge to be distributed globally, generating possibilities for new discoveries at a faster and more inclusive pace. Liang Wenfeng stated:
"Actually, nothing is lost with open source and the publication of papers. For the technical team, being followed is a great sense of accomplishment. In fact, open source is more of a cultural behavior than a commercial one. Giving is actually an extra honor. A company that does this will also have cultural appeal."
Open source is not a technology. It is a development process based on knowledge sharing. Generally, it encourages the organization of communities willing to collaboratively solve problems and to maintain solutions by keeping them updated. Language models like Mistral 7B (Mistral AI) and Falcon (Technology Innovation Institute) are open and licensed under Apache 2.0. The reinforcement learning library Stable-Baselines3 is also open, under an MIT license. There are numerous other open models in the field of AI. So why did DeepSeek's model become so relevant? Because it disrupted the global race for AI leadership. How? By drastically reducing the computational costs of a large language model.
Open source is fundamental for distributing knowledge but does not solve the problem of the computational infrastructure needed to train and run models. DeepSeek presented an open model with high performance and lower processing requirements.
DeepSeek-R1 has already demonstrated reasoning capabilities stronger than those of OpenAI's o1, while its costs (both training and usage) are significantly lower. By open-sourcing its model, DeepSeek has helped democratize large language models, enabling smaller companies, countries with less developed technological and digital infrastructure, and even individuals to train their own "sovereign AI" based on DeepSeek, without relying on Big Tech products or handing over their data to these companies. Indonesia and India have already begun building their own AI infrastructure using DeepSeek as a foundation. Before this, only the United States and China had access to large language models at this level.
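To give a sense of what running a "sovereign" open model locally can look like, here is a hedged sketch using the Hugging Face transformers library; the model identifier, precision, and generation settings are illustrative assumptions, and a machine with enough GPU memory is presumed.

```python
# Sketch: running an openly released DeepSeek checkpoint locally, so prompts and
# data never leave the machine. Model ID and settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # half-precision weights to fit in GPU memory
    device_map="auto",            # place layers on available GPUs/CPU automatically
)

prompt = "Explain, step by step, why open-source models matter for digital sovereignty."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```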
DEEPSEEK R1'S BET ON REINFORCEMENT LEARNING
"DeepSeek-R1 - Zero chose an unprecedented path, the path of 'pure' reinforcement learning, which completely abandoned the predefined Chain of Thought (CoT) model and supervised fine-tuning (SFT), relying solely on simple reward and punishment signals to optimize the model's behavior."
In an analysis of the findings from DeepSeek's R1 model, Tencent's team suggested that it might be necessary to rethink the role of supervised learning in AI development: perhaps researchers have been too focused on making AI mimic how humans think, rather than betting on the native problem-solving capabilities of reinforcement learning systems. In reinforcement learning, rewards and punishments are expressed mathematically in the model. The agent (which can be an algorithm or a system) makes decisions based on a policy that seeks to maximize cumulative rewards over time; rewards are numerical values that the agent receives for performing actions in a given state of the environment.
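The toy sketch below is not DeepSeek's training code; it is only a minimal illustration of the reward-and-punishment loop just described, in which a tabular agent learns to prefer whichever answer earns positive reward. The actions, rewards, learning rate, and exploration rate are all arbitrary choices.

```python
# Toy illustration of reinforcement learning from reward signals alone: the agent
# never sees a "correct reasoning" example, only +1 / -1 feedback on its answers.
import random

ACTIONS = ["answer_a", "answer_b", "answer_c"]
CORRECT = "answer_b"                       # hidden ground truth in this toy task
values = {a: 0.0 for a in ACTIONS}         # estimated value of each action
LEARNING_RATE = 0.1
EPSILON = 0.2                              # exploration probability

def reward(action: str) -> float:
    """Simple rule-based reward: +1 if the answer is right, -1 otherwise."""
    return 1.0 if action == CORRECT else -1.0

for step in range(500):
    # Explore occasionally, otherwise exploit the current best estimate.
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(values, key=values.get)
    r = reward(action)
    # Move the value estimate toward the observed reward.
    values[action] += LEARNING_RATE * (r - values[action])

print(values)   # the correct answer ends up with the highest estimated value
```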
Machine learning is a field of artificial intelligence that allows computers to identify patterns and make decisions based on data without being explicitly programmed to do so. Machine learning relies on algorithms that extract patterns from large amounts of data and adjust their parameters to improve predictive capabilities over time. These algorithms can be divided into three main categories: supervised learning (when the model learns from labeled data), unsupervised learning (when the model identifies patterns without predefined labels), and reinforcement learning (when the model learns through trial and error, receiving rewards or penalties based on its actions). Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers to process data in a hierarchical and sophisticated manner.
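For contrast with the reinforcement-learning loop above, this minimal sketch shows the supervised setting: parameters are fitted directly to labeled examples, here an invented linear relationship with a little noise, rather than discovered through trial, error, and reward.

```python
# Minimal supervised learning: fit parameters to labeled (input, output) pairs.
# The data follow a made-up rule y = 3x + 2 plus a little noise.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-5, 5, size=200)
y = 3 * x + 2 + rng.normal(scale=0.5, size=200)   # labels provided up front

# Least-squares fit of a line: "learning" here is just minimizing prediction error.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"learned: y = {slope:.2f} * x + {intercept:.2f}")
```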
Due to these innovations, the training cost of DeepSeek R1 was drastically reduced, to only 1/10 to 1/20 of ChatGPT's: for a task on which OpenAI's model would spend $20, DeepSeek spent roughly $1. In January 2025, DeepSeek's model cost only 16 yuan per million tokens, while ChatGPT cost up to 438 yuan, a difference of roughly 27 times. This means that organizations can use DeepSeek's model at a lower cost while achieving greater efficiency.
COMPUTATIONAL POWER AND THE GEOPOLITICS OF AI
The plummeting stock prices of Nvidia and other Big Techs were heralded by many as the end of U.S. leadership in AI. This does not seem accurate. The sharp decline in the value of the powerful GPU manufacturer was driven by a sudden sell-off of a massive volume of its shares following the news that DeepSeek had managed to develop a large language model at 10% of OpenAI's costs. This could change the course of AI: the growing dependence on high-end processing chips might be shifting. Based on this reasoning, and on fear, speculators took the opportunity to sell their positions in Nvidia and other companies.
The dependence on cutting-edge chips did not end with the innovations coming from China. Chips manufactured at process nodes below 2 nanometers represent a crucial advance for artificial intelligence: they ensure greater processing capacity with lower energy consumption. As AI models become more complex, reaching billions or trillions of parameters, computational efficiency remains a critical factor. Smaller process nodes allow for greater transistor density, improving calculation speed and energy efficiency while reducing operating costs and cooling requirements. This evolution is fundamental for the large-scale deployment of AI, from data centers to mobile devices, including military applications.
It is also worth noting that such advanced chips expand embedded applications in devices and favor their use in IoT, healthcare, robotics, and autonomous vehicles. Another promise is that, with more advanced and smaller chips, AI models can be run locally, reducing dependence on the cloud and ensuring faster and more secure responses. In the geopolitical context, the race for smaller chips intensifies technological disputes between powers like the U.S. and China, as control over this technology defines competitiveness in the digital economy and cybersecurity.
The United States maintains its leadership in the development and manufacturing of chips and semiconductors through a combination of technological dominance, strategic investments, and control of supply chains. American companies like NVIDIA, Intel, AMD, and Qualcomm lead the design of advanced chips. The U.S. government reinforces its position with subsidies and incentives, such as the CHIPS and Science Act, which allocates billions of dollars to strengthen domestic semiconductor production and reduce dependence on Asia.
In addition to technological superiority, the U.S. uses sanctions and export controls to limit access to critical technologies by strategic rivals like China. The Department of Commerce imposes severe restrictions on the export of advanced semiconductor manufacturing equipment, such as ASML's machines and chip design software from Cadence and Synopsys. These restrictions make it difficult for China to develop its own advanced chips and reinforce the U.S. position in the sector. Simultaneously, Washington invests in strategic alliances, such as the "Chip 4 Alliance" (with Japan, South Korea, and Chinese Taiwan), ensuring that its allies follow U.S. guidelines to restrict technology transfer to countries considered competitors. This consolidated strategy allows the U.S. to maintain its hegemony in the semiconductor industry, essential for the digital economy and national security.
While the United States is making every effort to restrict China's access to advanced chips (below 7nm) and to the capability to produce them, China is steadily developing the ability to manufacture these high-end chips independently. Semiconductor Manufacturing International Corporation (SMIC) has already demonstrated the capability to produce 7nm chips and is believed to be capable of producing 5nm chips. Companies like Shanghai Micro Electronics Equipment (SMEE) are actively developing extreme ultraviolet (EUV) lithography technology to replace the lithography machines monopolized by ASML, which are restricted from being sold to China.
On the other hand, in the field of mature-process chips used in the automotive and industrial sectors, where the technology is not the most cutting-edge but demand is significantly higher, China's chip industry has already established a large-scale and complete industrial chain. In 2024, China's total chip exports exceeded 1 trillion RMB (approximately 139 billion USD). It is foreseeable that once Chinese companies achieve technological breakthroughs in advanced processes, their existing supply-chain advantages will significantly reduce the prices of high-end chips. Moreover, process technology is constrained by physical limits and cannot be improved indefinitely. It is only a matter of time before China catches up with the United States.
CONCLUSION
"Nvidia's leadership is not just the result of one company's efforts but the combined efforts of the entire Western technology community and industry. They can see the next generation of technological trends and have a roadmap. AI development in China also requires this ecosystem. Many domestic chips cannot develop due to the lack of supporting technical communities and only second-hand information, so China needs someone at the forefront of technology." (Liang Wenfeng, 2024)
The founder of DeepSeek, Liang Wenfeng, stated, "The problem we face has never been money but the ban on cutting-edge chips." Even if the trend toward data concentration and ever-growing computational power, which requires increasingly sophisticated chips, shifts and loses momentum, international capitalism does not seem likely to alter its fundamental asymmetries. Undoubtedly, China's technoscientific development allows countries technologically dependent on the U.S. to structure strategies that benefit their own development. Having sovereign, controllable, world-class large language models was once out of reach for countries outside the United States and China, especially those in the Global South. Now DeepSeek has democratized this technology, opening up new possibilities for Global South countries in this field. At the same time, it has also presented new tasks and challenges for the governments of these nations.
What the DeepSeek phenomenon points to is the importance of open source for strengthening international collaborative chains that can reduce inequalities and large knowledge asymmetries. However, open source does not solve the problem of building the sovereign infrastructures essential for local and national development. It therefore falls to states seeking to improve their techno-economic position to reduce the power of Big Techs, to control the fundamental inputs of AI, especially data from their own populations, and to invest in solutions that reduce the environmental impact and labor precarization that automated systems have generated in capitalist countries. It is also necessary to bet on quality education for young people, to encourage technodiversity, and to convert the cultural vitality of peoples into technological expressions.