The standout feature of this Sonnet release is its ability to interact with your computer: it can take and read screenshots, move the mouse, click buttons on webpages, and type text.
Anthropic, the AI research and development company, has announced the latest update to its flagship AI model, Claude. The update introduces a highly anticipated new capability: the ability to interact with and control computers directly. The updated model, Claude 3.5 Sonnet, now allows users to perform a wide range of computer-based tasks, such as taking and reading screenshots, moving the mouse, clicking buttons on webpages, and typing text.
This capability is being rolled out in a “public beta” phase, which Anthropic admits is “experimental and at times cumbersome and error-prone,” according to the company's announcement. In a blog post detailing the rationale behind this new feature, Anthropic explained: “A vast amount of modern work happens via computers. Enabling AIs to interact directly with computer software in the same way people do will unlock a huge range of applications that simply aren’t possible for the current generation of AI assistants.”
While the concept of computers controlling themselves isn’t exactly new, the way Sonnet operates sets it apart. Unlike traditional automated computer control, which typically involves writing code, Sonnet requires no programming knowledge. Users can open apps or webpages and simply instruct the AI, which then analyzes the screen and figures out which elements to interact with.
Anthropic acknowledges the risks inherent in this technology, admitting that “for safety reasons we did not allow the model to access the internet during training,” though the beta version now permits internet access. The company also recently updated its “Responsible Scaling Policy,” which defines the risks associated with each stage of development and release. According to this policy, Sonnet has been rated at “AI Safety Level 2,” which indicates “early signs of dangerous capabilities.” However, Anthropic believes it is safe enough to release to the public at this stage.
Defending its decision to release the tool before fully understanding all the potential misuse scenarios, Anthropic said, “We can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks.” Essentially, the company would prefer to test these waters now while the AI’s capabilities are still relatively limited.
Of course, the risks associated with AI tools like Claude aren’t just theoretical. OpenAI recently disclosed 20 instances where state-backed actors had used ChatGPT for nefarious purposes, such as planning cyberattacks, probing vulnerable infrastructure, and designing influence campaigns. With the U.S. presidential election looming just two weeks away, Anthropic is keenly aware of the potential for misuse. “Given the upcoming US elections, we’re on high alert for attempted misuses that could be perceived as undermining public trust in electoral processes,” the company wrote.
Anthropic says: “The updated Claude 3.5 Sonnet shows wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks. On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models—including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It also improves performance on TAU-bench, an agentic tool use task, from 62.6% to 69.2% in the retail domain, and from 36.0% to 46.0% in the more challenging airline domain. The new Claude 3.5 Sonnet offers these advancements at the same price and speed as its predecessor.”
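To put those quoted scores in perspective, the gains can be expressed both as absolute percentage points and as relative improvement over the prior score; the numbers below are taken directly from the announcement quoted above, and the calculation itself is just illustrative arithmetic.

```python
# Benchmark gains from the announcement, as (before, after) percentages.
gains = {
    "SWE-bench Verified":  (33.4, 49.0),
    "TAU-bench (retail)":  (62.6, 69.2),
    "TAU-bench (airline)": (36.0, 46.0),
}


def improvement(before, after):
    """Return the gain in percentage points and as a relative % change."""
    absolute = after - before                    # percentage points
    relative = (after - before) / before * 100   # % change vs. the old score
    return round(absolute, 1), round(relative, 1)


for name, (before, after) in gains.items():
    pts, rel = improvement(before, after)
    print(f"{name}: +{pts} pts ({rel}% relative)")
```

The SWE-bench Verified jump, for instance, is 15.6 points absolute but roughly a 47% improvement relative to the previous model's score, which is why Anthropic highlights the agentic coding gains in particular.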
Anthropic has put safeguards in place to prevent Sonnet’s new capabilities from being exploited for election-related meddling. They’ve implemented systems to monitor when Claude is asked to engage in such activities, such as generating social media content or interacting with government websites. The company is also taking steps to ensure that screenshots captured during tool usage will not be used for future model training. However, even Anthropic’s engineers have been caught off guard by some of the tool’s behaviors. In one instance, Claude unexpectedly stopped a screen recording, losing all the footage. In a lighthearted moment, the AI even began browsing photos of Yellowstone National Park during a coding demo, which Anthropic shared on X with a mix of amusement and surprise.
Anthropic emphasizes the importance of safety in rolling out this new capability. Claude has been rated at AI Safety Level 2, meaning it doesn’t require heightened security measures for current risks but still raises concerns about potential misuse, like prompt injection attacks. The company has implemented systems to monitor election-related activities and prevent abuses like content generation or social media manipulation.
Although Claude’s computer use is still slow and prone to errors, Anthropic is optimistic about its future. The company plans to refine the model to make it faster, more reliable, and easier to implement. Throughout the beta phase, developers are encouraged to provide feedback to help improve both the model’s effectiveness and its safety protocols.