The standout feature of this Sonnet release is its ability to interact with your computer: it can take and read screenshots, move the mouse, click buttons on webpages, and type text.
Anthropic, the AI research and development company, has announced the latest update to its flagship AI model, Claude. The update introduces a highly anticipated capability: the ability to interact with and control computers directly. The updated Claude 3.5 Sonnet model can now perform a wide range of computer-based tasks, such as taking and reading screenshots, moving the mouse, clicking buttons on webpages, and typing text.
This capability is being rolled out in a “public beta” phase, which Anthropic admits is “experimental and at times cumbersome and error-prone,” according to the company's announcement. In a blog post detailing the rationale behind this new feature, Anthropic explained: “A vast amount of modern work happens via computers. Enabling AIs to interact directly with computer software in the same way people do will unlock a huge range of applications that simply aren’t possible for the current generation of AI assistants.”
While the concept of computers controlling themselves isn’t exactly new, the way Sonnet operates sets it apart. Unlike traditional automated computer control, which typically involves writing code, Sonnet requires no programming knowledge. Users can open apps or webpages and simply instruct the AI, which then analyzes the screen and figures out which elements to interact with.
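The interaction pattern described above is an agent loop: the model inspects a screenshot, requests an action (click, type, take another screenshot), the client executes it, and the observation is fed back. The sketch below illustrates that loop with a stubbed dispatcher; the `Action` type, action names, and `dispatch` function are illustrative assumptions, not Anthropic's actual API.

```python
# Minimal sketch of a "computer use" agent loop. The model would normally
# produce the actions; here a scripted sequence stands in for model output.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str   # "screenshot", "click", or "type" (illustrative names)
    args: dict  # parameters for the action, e.g. coordinates or text

def dispatch(action: Action) -> str:
    """Execute one model-requested action and return an observation string."""
    if action.kind == "screenshot":
        return "<base64 PNG of the current screen>"
    if action.kind == "click":
        x, y = action.args["x"], action.args["y"]
        return f"clicked at ({x}, {y})"
    if action.kind == "type":
        return f"typed {action.args['text']!r}"
    return f"unsupported action: {action.kind}"

def run_agent(actions):
    """Drive the loop: execute each action and collect the observations
    that would be sent back to the model on the next turn."""
    return [dispatch(a) for a in actions]

# A scripted run standing in for real model decisions:
steps = [
    Action("screenshot", {}),
    Action("click", {"x": 120, "y": 340}),
    Action("type", {"text": "hello"}),
]
print(run_agent(steps))
```

In the real system the observations (notably screenshots) are returned to the model, which then chooses the next action; the scripted list above collapses that feedback loop for clarity.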
Anthropic acknowledges the risks inherent in this technology, admitting that “for safety reasons we did not allow the model to access the internet during training,” though the beta version now permits internet access. The company also recently updated its “Responsible Scaling Policy,” which defines the risks associated with each stage of development and release. According to this policy, Sonnet has been rated at “AI Safety Level 2,” which indicates “early signs of dangerous capabilities.” However, Anthropic believes it is safe enough to release to the public at this stage.
Defending its decision to release the tool before fully understanding all the potential misuse scenarios, Anthropic said, “We can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks.” Essentially, the company would prefer to test these waters now while the AI’s capabilities are still relatively limited.
Of course, the risks associated with AI tools like Claude aren’t just theoretical. OpenAI recently disclosed 20 instances where state-backed actors had used ChatGPT for nefarious purposes, such as planning cyberattacks, probing vulnerable infrastructure, and designing influence campaigns. With the U.S. presidential election looming just two weeks away, Anthropic is keenly aware of the potential for misuse. “Given the upcoming US elections, we’re on high alert for attempted misuses that could be perceived as undermining public trust in electoral processes,” the company wrote.
Anthropic says: “The updated Claude 3.5 Sonnet shows wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks. On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models—including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It also improves performance on TAU-bench, an agentic tool use task, from 62.6% to 69.2% in the retail domain, and from 36.0% to 46.0% in the more challenging airline domain. The new Claude 3.5 Sonnet offers these advancements at the same price and speed as its predecessor.”
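The quoted figures are absolute score changes; the relative improvement each implies can be computed directly from the numbers in Anthropic's statement:

```python
# Relative improvement implied by the benchmark scores Anthropic quotes.
def relative_gain(before: float, after: float) -> float:
    """Return the relative improvement, in percent, from `before` to `after`."""
    return (after - before) / before * 100

print(f"SWE-bench Verified: {relative_gain(33.4, 49.0):.1f}% relative gain")  # ~46.7%
print(f"TAU-bench retail:   {relative_gain(62.6, 69.2):.1f}% relative gain")  # ~10.5%
print(f"TAU-bench airline:  {relative_gain(36.0, 46.0):.1f}% relative gain")  # ~27.8%
```

The SWE-bench jump is the largest in relative terms, consistent with the company's emphasis on agentic coding gains.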
Anthropic has put safeguards in place to prevent Sonnet’s new capabilities from being exploited for election-related meddling. They’ve implemented systems to monitor when Claude is asked to engage in such activities, such as generating social media content or interacting with government websites. The company is also taking steps to ensure that screenshots captured during tool usage will not be used for future model training. However, even Anthropic’s engineers have been caught off guard by some of the tool’s behaviors. In one instance, Claude unexpectedly stopped a screen recording, losing all the footage. In a lighthearted moment, the AI even began browsing photos of Yellowstone National Park during a coding demo, which Anthropic shared on X with a mix of amusement and surprise.
Anthropic emphasizes the importance of safety in rolling out this new capability. Claude has been rated at AI Safety Level 2, meaning it doesn’t require heightened security measures for current risks but still raises concerns about potential misuse, like prompt injection attacks. The company has implemented systems to monitor election-related activities and prevent abuses like content generation or social media manipulation.
Although Claude’s computer use is still slow and prone to errors, Anthropic is optimistic about its future. The company plans to refine the model to make it faster, more reliable, and easier to implement. Throughout the beta phase, developers are encouraged to provide feedback to help improve both the model’s effectiveness and its safety protocols.