Focus on CHK and US stocks

Report: Google will develop AI that can control computers

wallstreetcn · Oct 27 14:47

According to The Information, the project, codenamed 'Project Jarvis', aims to take over a user's browser to help consumers complete everyday tasks such as gathering research, purchasing products, or booking flights. People familiar with the matter said that Google's next-generation flagship Gemini large language model, which will power Jarvis, is also due for release in December.

On October 26, The Information reported that Google will develop AI that can control computers, and that it plans to preview the new AI product as early as December.

According to the report, the product, also described as a 'computer-using agent', is designed to take over a user's browser to help consumers complete everyday tasks such as gathering research, purchasing products, or booking flights. Three people familiar with the matter, cited by The Information, said the project is codenamed 'Project Jarvis' and resembles a product Anthropic announced this week.

They also said that Google's next-generation flagship Gemini large language model, which will power Jarvis, will be released in December.

Racing to catch up with OpenAI, with an agent tailored to Chrome

However, the release timetable for Jarvis suggests that, despite its body of fundamental research in AI, Google is clearly still playing catch-up with its competitors. Google is still developing AI with so-called 'reasoning' capabilities, whereas OpenAI launched such a capability back in September.

Analysts believe that Google's Gemini chatbot lags far behind OpenAI's ChatGPT, which has led enterprises to turn to OpenAI's large language models (LLMs) and made it hard for Google's Gemini models to catch up. To make its AI development more efficient, Google last week merged the team responsible for the Gemini chatbot into its main AI unit, DeepMind.

Notably, AI developers now regard 'agents' (AI systems that can complete complex tasks without human supervision) as the industry's next stage. Companies such as Salesforce, Microsoft, and Workday have bought LLMs from OpenAI and other providers and are racing to build AI agents with the technology.

Anthropic and Google are trying to take the AI agent concept a step further with software that interacts directly with a personal computer or browser. OpenAI has also spent much of this year developing similar software.

According to the people familiar with the matter, Google's AI agent works much like the one Anthropic launched: both respond to a user's command by taking frequent screenshots of what is on the user's screen and interpreting each screenshot before taking an action such as clicking a button or entering text.
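The screenshot, interpret, act cycle described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `FakeBrowser`, `toy_policy`, and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary rather than an image, and the policy is a hard-coded rule rather than a language model. It does not reflect how Jarvis or Anthropic's product is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # hypothetical element label
    text: str = ""     # text to enter for "type" actions


class FakeBrowser:
    """In-memory stand-in for a browser (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.fields = {"search": ""}
        self.submitted = False

    def screenshot(self) -> dict:
        # A real agent would capture pixels; here we return a copy of the state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def toy_policy(command: str, shot: dict) -> Action:
    """Stand-in for the model: interpret the snapshot, then pick one action."""
    if shot["submitted"]:
        return Action("done")
    if shot["fields"]["search"] != command:
        return Action("type", target="search", text=command)
    return Action("click", target="submit")


def run_agent(
    command: str,
    browser: FakeBrowser,
    policy: Callable[[str, dict], Action],
    max_steps: int = 10,
) -> list[Action]:
    """Repeat: capture the screen, interpret it, act, until the policy says done."""
    history: list[Action] = []
    for _ in range(max_steps):
        shot = browser.screenshot()      # 1. capture the current screen
        action = policy(command, shot)   # 2. interpret it and choose an action
        if action.kind == "done":
            break
        browser.apply(action)            # 3. click or type
        history.append(action)
    return history
```

Calling `run_agent("flights to Tokyo", FakeBrowser(), toy_policy)` walks through one "type" step and one "click" step before the policy reports that it is done. Because the loop re-captures the screen before every action, each step waits on a fresh interpretation, which is consistent with the report that the agent pauses before each action.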

However, the two companies' agents differ in a key respect:

Anthropic says its product can operate the different applications installed on a computer, whereas Jarvis can currently operate only a browser and has been 'tailored' to Google's Chrome.

The people also said that, at least for now, Jarvis is aimed at users who want to automate everyday web-based tasks. At Google's developer conference this spring, CEO Sundar Pichai hinted that future versions of Gemini could carry out multiple actions autonomously, such as helping a user return a pair of shoes.

Slow response times, and security may be called into question

The people cautioned that the plans for Jarvis are tentative and could change. According to the report, Google may first release the product to a small group of early testers to help find and fix its shortcomings. The agent currently runs relatively slowly because the model needs to think for a few seconds before taking each action.

Furthermore, Google would need access to customers' private information, such as login passwords and credit card details, in order to visit different websites and complete tasks or make purchases at the customer's request.

Analysts point out that Google needs to convince people that its AI agent can safely handle the personal data it requires to carry out tasks.

In addition, LLMs share some common weaknesses, such as the tendency to produce incorrect answers. Google's earlier use of LLM-powered conversational answers in its search engine produced many obvious errors.

Editor: ping

The translation is provided by third-party software.

The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.
