Head-to-head! Google launches Gemini, its “most versatile” AI model, which surpasses GPT-4 on many benchmarks

All Weather TMT ·  Dec 7, 2023 07:36

Source: All Weather Technology

Google has taken an important step toward catching up with OpenAI in applied artificial intelligence (AI), launching a highly versatile AI model that can run on mobile phones, in the cloud, and in data centers, and that confronts GPT-4 head-on.

On Wednesday, December 6, EST, Google officially unveiled its next-generation large language model (LLM), Gemini, calling it Google's “largest and most versatile AI model” to date. It has advanced reasoning ability and “thinks more carefully” when answering difficult questions. What distinguishes it from rival LLMs, Google emphasizes, is its flexibility: Gemini comes in versions of different sizes and can power a wide range of generative AI applications.

Among them, the lightest version, Gemini Nano, can run offline directly on a smartphone; the more powerful Gemini Pro can handle a wide range of tasks and powers many Google AI services, including Bard, Google's rival to ChatGPT, as well as Gmail, Maps, Docs, and YouTube; the most powerful version, Gemini Ultra, is the strongest LLM Google has built so far and is designed mainly for data centers and enterprise applications.

Eli Collins, vice president of product at Google's AI research lab DeepMind, said Gemini's versatility means it “can run on everything from mobile devices to large data centers.” He said Google has long wanted to build a new generation of AI models that act more like helpful collaborators than clever software, and Gemini brings Google a step closer to that vision.

Currently, Gemini is available only in English; Google says versions in other languages will launch soon. Google CEO Sundar Pichai said Gemini represents a new era of AI. Eventually, Gemini will be integrated into Google's search engine, advertising products, the Chrome browser, and more.

As for the rollout schedule, starting this Wednesday, Android developers can sign up to use Gemini Nano to build Gemini-powered apps for smartphones and computers. According to Google, Gemini is immediately enabled on its flagship phone, the Pixel 8 Pro, powering new generative AI features such as summarizing the key points of recorded conversations.

Starting this Wednesday, Gemini Pro powers Bard, giving it more advanced reasoning, planning, and understanding. The upgraded Bard is available in English in 170 countries and regions, but not yet in the UK or other European regions, where Google says it is still working with local regulators.

Starting next Wednesday, December 13, Google will provide the Gemini Pro version to cloud customers through Google Cloud on its Vertex AI and AI Studio platforms.
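For developers, access through AI Studio typically means an API key plus a client SDK. The minimal sketch below is not from the article; the package name (google-generativeai), the model identifier "gemini-pro", and the placeholder API key are assumptions for illustration only.

```python
# Minimal sketch (assumptions noted above): calling Gemini Pro with an
# AI Studio API key via the google-generativeai Python SDK.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")        # hypothetical key issued via AI Studio
model = genai.GenerativeModel("gemini-pro")    # assumed text-only Gemini Pro model id
response = model.generate_content("Explain what makes Gemini 'natively multimodal'.")
print(response.text)                           # the model's text reply
```

Vertex AI offers a comparable managed path for enterprise cloud customers; the exact endpoints and model names there may differ from this sketch.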

Gemini Ultra will first be made available to developers and enterprise customers, with application details to be announced next week. Google plans to make Gemini Ultra broadly available to the public early next year.

Google also plans to release Bard Advanced, an upgraded version of Bard powered by Gemini Ultra, early next year. Before launching it to the public, Google will run a testing program to refine Bard Advanced.

Google has introduced three versions in the Gemini family.

This time, Google has made no secret of its ambition to rival GPT-4. Before releasing Gemini, Google ran a series of evaluations using standard industry benchmarks. According to Google, Gemini Pro outperformed OpenAI's GPT-3.5 on six of eight benchmarks, and on benchmarks covering general language understanding, reasoning, mathematics, and coding, Gemini surpassed OpenAI's latest model, GPT-4, on seven of the eight.

Meanwhile, Google evaluated AlphaCode 2, its latest generative AI system that can understand, explain, and generate code, and found that it outperforms an estimated 85% of participants in competitive programming contests.

Demis Hassabis, CEO of Google DeepMind, said Google compared Gemini and GPT-4 across 32 well-established benchmarks, ranging from broad tests such as multitask language understanding to measures of a single skill such as generating Python code. On 30 of the 32 benchmarks, Gemini came out “far ahead.”

A comparison chart pits Gemini Pro and Gemini Ultra against other LLMs such as GPT-4 and GPT-3.5 on multiple-choice questions, math problems, Python coding tasks, reading comprehension, and more.

According to Google, Gemini is a “natively multimodal” AI model. This means it was pre-trained from the start to handle prompts that combine text and images, and it supports both in its services. For example, a parent helping a child with homework could upload a photo of a math problem along with a picture of the child's attempted solution on a worksheet; Gemini can read the answer, work out why it is right or wrong, and explain the concepts that need further clarification.
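As a rough illustration of that workflow, the sketch below combines an image and a text instruction in a single prompt. It is not from the article; the multimodal model identifier "gemini-pro-vision", the file name "worksheet.png", and the API key are hypothetical.

```python
# Minimal sketch (assumptions noted above): a multimodal prompt that combines
# a worksheet photo with a text instruction, via the google-generativeai SDK.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")                 # hypothetical API key
model = genai.GenerativeModel("gemini-pro-vision")      # assumed image+text model id
worksheet = Image.open("worksheet.png")                 # photo of the attempted solution

response = model.generate_content(
    [worksheet, "Check this solution step by step and explain any mistakes."]
)
print(response.text)
```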

According to Google, Gemini's new capabilities will be incorporated next year into the Search Generative Experience, Google Search's generative AI feature.

Google admits that Gemini may still produce false or fabricated AI-generated information. Collins said this remains an unresolved research problem, but added that Gemini has undergone the most comprehensive safety evaluation of any Google AI model to date. To assess Gemini's safety, Google ran adversarial tests, feeding the model prompts that imitate malicious users so researchers could check it for hate speech and political bias. The tests include “Real Toxicity Prompts,” a benchmark of more than 100,000 prompts drawn from the Internet.

Google emphasized that Gemini is both highly efficient and fast. It was trained on a new version of Google's in-house cloud chip, the Tensor Processing Unit (TPU). The new TPU v5p is more powerful, training existing models 2.8 times faster than the previous generation, and is designed for training and running large models in data centers.

Amin Vahdat, Google's vice president of machine learning, said this approach gave Google “a new understanding of future standard AI infrastructure.” Google still uses a third-party AI chip to run the Gemini model.

Risk warning and disclaimer: The market is risky, and investment requires caution. This article does not constitute personal investment advice, nor does it take into account the specific investment goals, financial situation, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article are appropriate for their particular situation, and invest accordingly at their own risk.

Editor/Jeffrey

The translation is provided by third-party software.


The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.