Artificial Intelligence (AI) has taken a new turn with the introduction of DeepSeek R1, a model that is not just a new entrant but a potential game-changer in the AI industry. In a recent development that has sent shockwaves through the tech world, DeepSeek R1 has managed to match the performance of AI giants like OpenAI’s o1 at a mere 3-5% of the cost. This efficiency is not just a market disruptor but a technical marvel that challenges the very foundations of how AI models have been developed and deployed.
DeepSeek R1’s benchmark performances are nothing short of impressive. “On the AIME mathematics test, it scored 79.8% compared to OpenAI’s 79.2%,” Siegler highlighted, underscoring its capability. The model also achieved a 97.3% accuracy on the MATH-500 benchmark, surpassing OpenAI’s 96.4%. These achievements come with a dramatic reduction in operational costs, with DeepSeek R1 running at “55 cents per million token inputs and $2.19 per million token outputs,” in stark contrast to OpenAI’s higher rates. This cost-performance ratio is a wake-up call for the industry, suggesting a shift towards more economically viable AI solutions.
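Taken at face value, those rates make the cost gap easy to quantify. A back-of-the-envelope sketch in Python, using DeepSeek's published rates of $0.55 per million input tokens and $2.19 per million output tokens; the o1 rates of $15 and $60 per million tokens are an assumption based on OpenAI's public list prices, not a figure from the article:

```python
# Back-of-the-envelope API cost comparison for a workload of
# 1M input tokens and 1M output tokens, given per-million-token rates.
def workload_cost(input_rate: float, output_rate: float,
                  input_millions: float = 1.0,
                  output_millions: float = 1.0) -> float:
    """Total cost in dollars for the given token volumes."""
    return input_rate * input_millions + output_rate * output_millions

deepseek = workload_cost(0.55, 2.19)   # DeepSeek R1 published rates
openai_o1 = workload_cost(15.0, 60.0)  # assumed o1 list prices

print(f"DeepSeek R1: ${deepseek:.2f}")   # $2.74
print(f"OpenAI o1:   ${openai_o1:.2f}")  # $75.00
print(f"Ratio: {deepseek / openai_o1:.1%}")
```

For this 50/50 input/output mix the ratio comes out to roughly 3.7%, which lands squarely inside the "3-5% of the cost" figure cited above.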
The market has responded with what can only be described as shock. Siegler pointed out, “In pre-market trading, Nvidia was down 10 to 11%,” with other tech behemoths like Microsoft and Google also witnessing significant drops. This market reaction signals a potential reevaluation of investment in AI infrastructure, particularly in hardware like Nvidia’s GPUs, which have been at the heart of AI’s scaling narrative.
From a technical standpoint, DeepSeek R1’s architecture is a testament to innovation under constraint. “It’s based on a mixture-of-experts architecture,” Siegler explained, allowing the model to activate only necessary parameters for each query, thus optimizing for both speed and efficiency. This approach contrasts with the monolithic models that activate all parameters regardless of the task at hand, leading to higher computational and energy costs.
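The routing idea behind a mixture-of-experts layer can be sketched in a few lines. This is a toy illustration, not DeepSeek's actual layer: real MoE implementations add load-balancing losses, shared experts, and batched token dispatch, and the names here (`moe_forward`, `gate_w`) are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, top_k=2):
    """Route input x through only the top_k highest-scoring experts."""
    scores = x @ gate_w                # one gating logit per expert
    top = np.argsort(scores)[-top_k:]  # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the selected experts
    # Only top_k expert matrices are multiplied; the rest stay idle,
    # which is where the compute savings over a dense model come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

d, n_experts = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
y = moe_forward(rng.standard_normal(d), experts, gate_w)
print(y.shape)  # (8,)
```

With `top_k=2` of 4 experts, each query touches only half the expert parameters; at the scale of a large language model that ratio is far more aggressive, which is what makes per-query cost diverge from total parameter count.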
The model’s development involved a process of distillation from larger models to create compact yet potent versions. “They took, for example, a Llama model with 70 billion parameters and distilled it down,” said Siegler, outlining how DeepSeek managed to maintain high performance with fewer resources.
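The article doesn't detail DeepSeek's distillation recipe, but a common formulation trains the small model to match the large model's softened output distribution rather than hard labels. A minimal sketch of that loss, with illustrative logits:

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T softens the distribution."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    Minimizing this pushes the small model to imitate the large one's
    full output distribution, not just its top-1 answer.
    """
    p = softmax(teacher_logits, T)  # teacher targets, softened
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = [4.0, 1.0, 0.2]
print(distillation_loss([3.5, 1.2, 0.1], teacher))  # small: student agrees
print(distillation_loss([0.1, 3.5, 1.2], teacher))  # large: student is off
```

The temperature matters: softening both distributions exposes the teacher's relative preferences among wrong answers, which is much of the signal a 70B-parameter model can pass down to a smaller one.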
DeepSeek R1 diverges from the prevalent self-supervised learning methods by employing pure reinforcement learning (RL). “The models tend to figure out what’s the right answer on their own,” noted Siegler, indicating that this self-guided learning approach not only reduces the need for vast labeled datasets but also fosters unique reasoning capabilities within the model. This RL focus has allowed DeepSeek to fine-tune models through trial and error, improving their reasoning without the need for extensive human annotation, which is both cost and time-intensive.
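The trial-and-error loop Siegler describes can be caricatured with a verifiable reward and a toy policy. This is a deliberate simplification (the real method is a policy-gradient algorithm over a language model, not a three-way categorical distribution), but it shows why no human annotation is needed once answers can be checked automatically:

```python
import random

random.seed(0)

def reward(answer: str, correct: str) -> float:
    """Binary, automatically checkable reward -- no human labels needed."""
    return 1.0 if answer.strip() == correct else 0.0

# Toy "policy": a categorical distribution over candidate answers
# to a single math question whose correct answer is "12".
candidates = ["10", "12", "14"]
probs = {c: 1.0 / len(candidates) for c in candidates}

def sample() -> str:
    r, acc = random.random(), 0.0
    for c, p in probs.items():
        acc += p
        if r < acc:
            return c
    return candidates[-1]

# Trial and error: sample an answer, score it with the verifiable
# reward, and shift probability mass toward answers that scored well.
lr = 0.2
for _ in range(500):
    a = sample()
    probs[a] += lr * reward(a, "12")
    total = sum(probs.values())
    probs = {c: p / total for c, p in probs.items()}

print(max(probs, key=probs.get))  # 12
```

Only the checker (`reward`) needs ground truth, and for math or code that check can be mechanical, which is exactly the property that lets RL substitute for expensive labeled datasets.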
The scaling hypothesis, which posits that performance increases with more compute, data, and time, is now under scrutiny. “DeepSeek has shown you can actually do all this without that,” Siegler remarked, suggesting that the era of simply scaling up might be nearing an end. This could potentially reduce the dependency on massive hardware investments, redirecting focus towards smarter, more efficient AI development strategies.
The immediate market fallout has been significant, with Nvidia’s stock plummeting. “It’s going to be pretty hard for this day at least,” Siegler observed, reflecting on the market’s knee-jerk reaction. However, some see this as a long-term opportunity for companies like Nvidia, where increased efficiency might spur demand for more specialized, less resource-heavy AI hardware.
The business implications are profound. Companies like Microsoft and Google, which have been integrating AI into their ecosystems, now face a dilemma. “If the underlying economics just totally changed overnight, what does that do to their models?” Siegler questioned. This might push these companies towards reimagining their AI offerings, possibly leading to price adjustments or new service models to align with the new cost structures.
There’s a dichotomy in how this development is perceived. On one hand, there’s optimism that efficiency will lead to broader adoption and innovation. On the other, there’s caution about the implications for companies that have invested heavily in scaling. “Do we continue to spend billions for marginal gains, or do we leverage this efficiency to push towards practical AI applications?” Siegler pondered.
In response, tech leaders are attempting to calm the markets with a narrative that greater efficiency will lead to higher usage, with Satya Nadella citing the Jevons paradox. “It feels like there’s a group text going on,” Siegler said, hinting at a coordinated message to reassure investors.
The ultimate test for DeepSeek R1 and similar models will be their application in real-world scenarios. “We need to see AI applications like we need to see an economy that takes use of …