
An "unnoticed" Cloud Next? Google has spelled out its "AI moat" for the first time: infrastructure!

wallstreetcn ·  Apr 12 20:29

Source: Wall Street News

Facing a fierce offensive from its many rivals, Alphabet-A (GOOGL.US)/Alphabet-C (GOOG.US), which has fallen slightly behind in the artificial intelligence arms race, is catching up.

Although the recently concluded Google Cloud Next 2024 conference drew relatively little headline attention, Google used it to showcase a series of innovations and significant advances in artificial intelligence, above all its clear lead in AI infrastructure.

Google CEO Sundar Pichai put it bluntly:

We've long known that artificial intelligence will transform every industry and company, including our own. That's why we've been building AI infrastructure for over a decade, including our TPUs, which have now evolved to the fifth generation.

Google's strong infrastructure significantly improves the effectiveness of large-model applications

Google Cloud CEO Thomas Kurian emphasized that Google's strong infrastructure lets it offer customers the ability to train and deploy the most advanced language models, keeping it at the forefront of the AI platform shift.

In the keynote address at this year's Google Cloud Annual Conference, Google highlighted several of its major advantages in terms of artificial intelligence infrastructure.

On the one hand, integrating Google Search can greatly improve the response quality of large language models and significantly reduce hallucinations. On the other hand, Google also lets customers easily use data from their enterprise databases and applications as the model's knowledge base, grounding generative AI in actual enterprise data and putting the technology to practical use.
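To make the pattern concrete, here is a minimal, self-contained sketch of that grounding idea: retrieve relevant enterprise records first, then confine the model's answer to the retrieved evidence. The knowledge base, search_knowledge_base, and build_grounded_prompt are illustrative inventions for this sketch, not actual Google Cloud APIs.

```python
# Minimal sketch of the grounding pattern described above: retrieve
# relevant enterprise records, then constrain the model's answer to that
# evidence. All names here are hypothetical stand-ins, not Google APIs.

KNOWLEDGE_BASE = [
    {"id": "doc-1", "text": "Product X-100 ships with a 2-year warranty."},
    {"id": "doc-2", "text": "Product X-200 was discontinued in 2023."},
]

def search_knowledge_base(query: str) -> list[dict]:
    """Toy keyword retrieval; a real system would use a search index."""
    terms = query.lower().split()
    return [d for d in KNOWLEDGE_BASE
            if any(t in d["text"].lower() for t in terms)]

def build_grounded_prompt(question: str) -> str:
    """Attach retrieved passages so the model answers from evidence."""
    context = "\n".join(f"[{d['id']}] {d['text']}"
                        for d in search_knowledge_base(question))
    return ("Answer using ONLY the passages below; say 'unknown' if they "
            "do not contain the answer.\n"
            f"Passages:\n{context}\n\nQuestion: {question}")

print(build_grounded_prompt("What warranty does the X-100 have?"))
```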

For example, in one demo, with Gemini integrated with the BigQuery data warehouse and the Looker business-intelligence platform, workers received alerts that specific products were about to sell out; using generative AI, they could then view sales trends, find similar models, and draw up an action plan for the shrinking inventory.

In this case, backed by Google's deep infrastructure services, the large model not only provides information but also acts as an easier-to-use natural-language interface to the data, drastically lowering the time and expertise the task requires.
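As a rough illustration, the low-inventory check behind such an alert could look like the sketch below. The google-cloud-bigquery client calls are standard, but the project, dataset, table, and column names are hypothetical assumptions, not the demo's actual schema.

```python
# Sketch of the kind of BigQuery check that could power an "about to
# sell out" alert. Client usage is standard google-cloud-bigquery; the
# dataset, table, and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

LOW_STOCK_SQL = """
SELECT product_id, product_name, units_on_hand, avg_daily_sales
FROM `my-project.retail.inventory`          -- hypothetical table
WHERE units_on_hand < avg_daily_sales * 7   -- < one week of stock left
ORDER BY units_on_hand / avg_daily_sales
"""

for row in client.query(LOW_STOCK_SQL).result():
    # Each row is a candidate for a natural-language alert the model
    # can expand into trends, similar SKUs, and an action plan.
    print(f"Low stock: {row.product_name} ({row.units_on_hand} left)")
```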

Analysts commented that Google's fundamental advantage used to rest on the open internet and large-scale data-processing capability; in the current AI era, that advantage shows up as an outright lead in infrastructure and computing scale. With this Cloud Next conference, Google demonstrated it clearly, and can be said to have found a key "foundation" for itself in the fierce AI competition.

Gemini 1.5 Pro steals the show! Ultra-long context understanding upends the AI application landscape

What's even more striking is that Google unveiled its new-generation Gemini 1.5 Pro large language model. The model is a qualitative leap over its predecessor, not only delivering much higher performance but also breaking through in long-context understanding. According to Kurian:

Gemini 1.5 Pro's performance has improved dramatically, with a breakthrough in long-context understanding. It runs consistently with a 1-million-token context, opening up new possibilities for enterprises to create, discover, and build with artificial intelligence.

There is no doubt that Google holds the industry's largest pool of TPU compute. Combined with years of continuous optimization and mass production of the TPU architecture, it can parallelize and accelerate training and inference at the chip, cluster, and even whole-data-center level, unleashing unprecedented computing power.

In the traditional Transformer architecture, the model's memory requirements grow quadratically with context length, which puts a ceiling on context understanding. Gemini 1.5 Pro, by contrast, reportedly uses mechanisms such as ring attention so that memory requirements scale only linearly with context length, allowing it to process extremely long texts of up to 10 million tokens.
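Google has not published Gemini 1.5 Pro's exact mechanism, but the scaling argument itself is easy to check with back-of-the-envelope arithmetic. The sketch below contrasts the quadratic memory of a fully materialized attention matrix with a blockwise, ring-style scheme that keeps only a fixed-size stripe resident; the block size and bytes-per-score values are assumptions.

```python
# Back-of-the-envelope numbers behind the ceiling described above:
# vanilla attention materializes an L x L score matrix, so memory grows
# quadratically with context length L; blockwise/ring-style schemes keep
# only fixed-size blocks resident, so live memory grows linearly.

def full_attention_bytes(tokens: int, bytes_per_score: int = 2) -> int:
    """Memory for one L x L attention matrix (one head, one layer)."""
    return tokens * tokens * bytes_per_score

def blockwise_attention_bytes(tokens: int, block: int = 4096,
                              bytes_per_score: int = 2) -> int:
    """Only a block x L stripe of scores is live at any one time."""
    return block * tokens * bytes_per_score

for L in (128_000, 1_000_000, 10_000_000):
    full = full_attention_bytes(L) / 1e12         # terabytes
    blocked = blockwise_attention_bytes(L) / 1e9  # gigabytes
    print(f"L={L:>10,}: full ~{full:,.1f} TB, blockwise ~{blocked:,.1f} GB")
```

At 1 million tokens the full matrix alone would need roughly 2 TB under these assumptions, while the blockwise stripe stays under 10 GB, which is why the linear-memory approach matters at this scale.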

The key to this breakthrough is Google's ability to innovate at the infrastructure level. With its huge TPU computing power resources, Google seems to have fully unleashed the power of infrastructure on Gemini 1.5 Pro.

In a series of demonstrations at the conference, Gemini 1.5 Pro showed unmatched context understanding and generation, applied to enterprise-level scenarios such as compliance review, marketing content creation, and software development.

Take compliance document review: an engineer uploads the report to be analyzed together with the company's compliance manual to Gemini for Google Workspace. From the two 150-page texts, the AI accurately flags compliance issues in the proposal, with no need to compare the documents word for word by hand.
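A minimal sketch of the long-context pattern in this demo, assuming plain-text files and a crude 4-characters-per-token heuristic: both full documents go into a single prompt rather than a retrieval pipeline. The function and file handling are illustrative, not the actual Workspace integration.

```python
# Sketch of the long-context pattern: both documents go into one prompt
# and the model cross-references them. File paths and the token
# heuristic are illustrative assumptions.
from pathlib import Path

def build_review_prompt(proposal_path: str, policy_path: str) -> str:
    proposal = Path(proposal_path).read_text(encoding="utf-8")
    policy = Path(policy_path).read_text(encoding="utf-8")
    prompt = (
        "You are a compliance reviewer. Compare the PROPOSAL against the "
        "POLICY and list every clause the proposal violates, with quotes.\n\n"
        f"=== POLICY ===\n{policy}\n\n=== PROPOSAL ===\n{proposal}"
    )
    # Rough sanity check: ~4 characters per token; two 150-page documents
    # fit comfortably inside a 1M-token window.
    approx_tokens = len(prompt) // 4
    assert approx_tokens < 1_000_000, "prompt exceeds the context window"
    return prompt
```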

In a marketing scenario for outdoor products, engineers generated creative images with the Imagen visual model, then used Gemini to pull in the business logic and operating data from the company's entire code base (100,000 lines of code) as context, and finally produced complete storyboard content.

For software development, Gemini Code Assist lets new engineers get familiar with an entire code base within minutes of joining a team and implement new features automatically, while ensuring the generated code meets company standards.

Google says:

Gemini's code-transformation feature is fully codebase-aware, letting enterprises easily reason over an entire code base. By contrast, other models cannot handle more than 12,000 to 15,000 lines of code.
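Rough arithmetic makes the quoted comparison concrete. Assuming an average of about 10 tokens per line of code (a loose assumption that varies by language), the limits map to context windows as follows:

```python
# Rough arithmetic behind the quoted comparison: at ~10 tokens per line
# of code (an assumption), a 12,000-15,000 line limit corresponds to a
# ~120K-150K token window, while a 1M-token window covers on the order
# of 100,000 lines.

TOKENS_PER_LINE = 10  # assumed average

for lines in (12_000, 15_000, 100_000):
    print(f"{lines:>7,} lines ~ {lines * TOKENS_PER_LINE:>9,} tokens")
print(f"1,000,000-token window ~ {1_000_000 // TOKENS_PER_LINE:,} lines")
```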

Undoubtedly, this extraordinary ability to understand and generate depends entirely on Google's groundbreaking innovation at the infrastructure level.

As Pichai said:

When a model grasps the full context of a problem as it works, it can unlock powerful capabilities; that is only possible with longer context, and it is Google's infrastructure that ultimately makes it possible.

Editor/jayden


