Head-to-head! Google launches Gemini, its “most versatile” AI model, which surpasses GPT-4 on many benchmarks

All Weather TMT ·  Dec 7, 2023 07:36

Source: All Weather Technology

Google has taken an important step toward catching up with OpenAI in applied artificial intelligence (AI), launching a highly versatile AI model that can run on mobile phones, in the cloud, and in data centers, and that confronts GPT-4 head-on.

On Wednesday, December 6, EST, Google officially unveiled its next-generation large language model (LLM), Gemini, calling it Google's “largest and most versatile AI model” to date. It has advanced reasoning ability and “thinks more carefully” when answering difficult questions. What distinguishes it from rival LLMs, Google emphasizes, is its flexibility: Gemini comes in versions of different sizes and can power a wide range of generative AI applications.

Among them, the lightest version, Gemini Nano, can run offline directly on a smartphone; the more powerful Gemini Pro can handle a wide range of tasks and powers many Google AI services, including Bard, Google's rival to ChatGPT, as well as Gmail, Maps, Docs, and YouTube; the most powerful version, Gemini Ultra, is the strongest LLM Google has built so far and is designed mainly for data centers and enterprise applications.

Eli Collins, vice president of product at Google's AI research lab DeepMind, said Gemini's versatility means it “can run on everything from mobile devices to large data centers.” He said Google has long wanted to build a new generation of AI models that act more like helpful collaborators than clever software, and Gemini brings Google a step closer to that vision.

Currently, Gemini is available only in English; Google says versions in other languages will launch soon. Google CEO Sundar Pichai said Gemini represents a new era of AI. Eventually, Gemini will be integrated into Google's search engine, advertising products, the Chrome browser, and more.

As for the rollout schedule, starting this Wednesday, Android developers can sign up to use Gemini Nano to build Gemini-powered apps for smartphones and computers. According to Google, Gemini is immediately enabled on its flagship phone, the Pixel 8 Pro, powering new generative AI features such as summarizing the key points of recorded conversations.

Starting this Wednesday, Gemini Pro powers Bard, giving it more advanced reasoning, planning, and understanding. The upgraded Bard is available in English in 170 countries and regions, but not yet in the UK or other European regions, where Google says it is still working with local regulators.

Starting next Wednesday, December 13, Google will provide the Gemini Pro version to cloud customers through Google Cloud on its Vertex AI and AI Studio platforms.
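For developers, access through AI Studio typically means an API key plus a client SDK. The minimal sketch below is not from the article; the package name (google-generativeai), the model identifier "gemini-pro", and the placeholder API key are assumptions for illustration only.

```python
# Minimal sketch (assumptions noted above): calling Gemini Pro with an
# AI Studio API key via the google-generativeai Python SDK.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")        # hypothetical key issued via AI Studio
model = genai.GenerativeModel("gemini-pro")    # assumed text-only Gemini Pro model id
response = model.generate_content("Explain what makes Gemini 'natively multimodal'.")
print(response.text)                           # the model's text reply
```

Vertex AI offers a comparable managed path for enterprise cloud customers; the exact endpoints and model names there may differ from this sketch.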

Gemini Ultra will first be made available to developers and enterprise customers, with application details to be announced next week. Google plans to make Gemini Ultra broadly available to the public early next year.

Google also plans to release Bard Advanced, an upgraded version of Bard powered by Gemini Ultra, early next year. Before launching it to the public, Google will run a testing program to refine Bard Advanced.

Google has introduced three versions in the Gemini family.

This time, Google has made no secret of its ambition to rival GPT-4. Before releasing Gemini, Google ran a series of evaluations using standard industry benchmarks. According to Google, Gemini Pro outperformed OpenAI's GPT-3.5 on six of eight benchmarks, and on benchmarks covering general language understanding, reasoning, mathematics, and coding, Gemini surpassed OpenAI's latest model, GPT-4, on seven of the eight.

Meanwhile, Google evaluated AlphaCode 2, its latest generative AI system that can understand, explain, and generate code, and found that it outperforms an estimated 85% of participants in competitive programming contests.

Demis Hassabis, CEO of Google DeepMind, said Google compared Gemini and GPT-4 across 32 well-established benchmarks, ranging from broad tests such as multitask language understanding to measures of a single skill such as generating Python code. On 30 of the 32 benchmarks, Gemini came out “far ahead.”

A comparison chart pits Gemini Pro and Gemini Ultra against other LLMs such as GPT-4 and GPT-3.5 on multiple-choice questions, math problems, Python coding tasks, reading comprehension, and more.

According to Google, Gemini is a “natively multimodal” AI model. This means it was pre-trained from the start to handle prompts that combine text and images, and it supports both in its services. For example, a parent helping a child with homework could upload a photo of a math problem along with a picture of the child's attempted solution on a worksheet; Gemini can read the answer, work out why it is right or wrong, and explain the concepts that need further clarification.
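As a rough illustration of that workflow, the sketch below combines an image and a text instruction in a single prompt. It is not from the article; the multimodal model identifier "gemini-pro-vision", the file name "worksheet.png", and the API key are hypothetical.

```python
# Minimal sketch (assumptions noted above): a multimodal prompt that combines
# a worksheet photo with a text instruction, via the google-generativeai SDK.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")                 # hypothetical API key
model = genai.GenerativeModel("gemini-pro-vision")      # assumed image+text model id
worksheet = Image.open("worksheet.png")                 # photo of the attempted solution

response = model.generate_content(
    [worksheet, "Check this solution step by step and explain any mistakes."]
)
print(response.text)
```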

According to Google, Gemini's new capabilities will be incorporated next year into the Search Generative Experience, Google Search's generative AI feature.

Google admits that Gemini may still produce false or fabricated AI-generated information. Collins said this remains an unresolved research problem, but added that Gemini has undergone the most comprehensive safety evaluation of any Google AI model to date. To assess Gemini's safety, Google ran adversarial tests, feeding the model prompts that imitate malicious users so researchers could check it for hate speech and political bias. The tests include “Real Toxicity Prompts,” a benchmark of more than 100,000 prompts drawn from the Internet.

Google emphasized that Gemini is both highly efficient and fast. It was trained on a new version of Google's in-house cloud chip, the Tensor Processing Unit (TPU). The new TPU v5p is more powerful, training existing models 2.8 times faster than the previous generation, and is designed for training and running large models in data centers.

Amin Vahdat, Google's vice president of machine learning, said this approach gave Google “a new understanding of future standard AI infrastructure.” Google still uses a third-party AI chip to run the Gemini model.

Risk warning and disclaimer: The market is risky, and investment requires caution. This article does not constitute personal investment advice, nor does it take into account the specific investment goals, financial situation, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article are appropriate for their particular situation, and invest accordingly at their own risk.

Editor/Jeffrey

The translation is provided by third-party software.


The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.