Source: Wall Street News
$Alphabet-A(GOOGL.US)$The culmination — Gemini is coming, GPT-4's strongest killer is about to come out.
On September 14, the media quoted three people directly familiar with the matter as saying that Google has provided an early version of Gemini to a small group of companies, which means that Google is considering including it in consumer services. At the same time, Google will also sell to businesses through the company's cloud computing services, which also means that Gemini is getting closer and closer to being released.
According to people familiar with the matter, Google will also release Gemini versions of different sizes, so developers can buy simplified versions to handle simpler tasks, and versions that are small enough to run on personal terminals.
In order to compete with OpenAI and speed up the Gemini development cycle, Google CEO Chichai took a critical step in April of this year, merging teams with completely different cultures and codes — Google Brain and DeepMind. Demis Hassabis, the former founder of DeepMind, became CEO.
Haasabis is clearly very confident about the new team that came together. He said the new team brought together two forces critical to recent advances in artificial intelligence. Google founder Sergei Brin was also blown back to the battlefield by AI and personally participated in Gemini training.
Over the months that followed, the mystery of Gemini was lifted little by little and unraveled little by little. This is everything that is currently known about Gemini.
Gemini's multi-modal abilities
The next leap forward in language models may be to perform more tasks on computers. The previous article mentioned that Gemini's greatest strength is its multi-modal ability, which not only understands and generates text and code, but also understands and generates images. In contrast, ChatGPT is just a plain text model that can only understand and generate text.
Additionally, an important step in developing a language model with ChatGPT-like abilities is to use human feedback reinforcement learning to improve its performance. DeepMind's deep experience with reinforcement learning can give Gemini new capabilities.
At the Google Developer I/O Conference in May, Google mentioned that from the beginning, Gemini's goal was to efficiently integrate tools and APIs across multiple modes. At the time, Google's teaser was: “Although it's still early days, we've already seen multi-modal capabilities in Gemini that we've never seen before in previous models, which is very impressive.”
Gemini and AlphaGo merge
Google DeepMind CEO Hassabis broke the news that the new Gemini model will be combined with AlphaGo and the big language model.
Gemini will combine the language functions of large models such as AlphaGo and GPT-4, and the ability to solve problems and plan systematically will be greatly enhanced.
Some AI experts believe that indirect learning of language models through text is a major limitation of their development. And AlphaGo's advantage can solve this. In 2016, the AI system designed by DeepMind, AlphaGo defeated World Go champion Lee Se-seok by a score of 4 to 1, making it the first robot in history to beat the Go World Championship.
AlphaGo is based on reinforcement learning technology pioneered by DeepMind. By making AlphaGo try over and over and receive feedback on performance, AlphaGo learns to handle difficult questions about choosing what kind of action to take. At the same time, AlphaGo used Monte Carlo tree search technology to explore and memorize possible behaviors on the board.
It will come in a variety of sizes and features
Google notes that Gemini is undergoing training, and once fine-tuned, it will be usable “in all sizes and features,” just like the Palm 2. Google says it can be deployed in different products to benefit everyone.
In addition to applications in enterprise services, Gemini also has huge potential for medical use cases. Google has been testing an artificial intelligence tool called Med-Palm 2 that could be enhanced with Gemini features. The model can be used as a medical chatbot or robotic technology to assist with surgeries and medical procedures.
Additionally, Google's insights into building DeepMind's Gato (a “universal” system) and the recently introduced RT-2 (a robot Transformer model) can also be integrated into Gemini. The collaboration between Google Brain and DeepMind poses a major challenge to OpenAI and other competitors in the field of artificial intelligence.
Gemini integrates into various Google apps
In an interview in September, Cichai revealed information about Gemini's integration into Google products. He said conversational AI like Bard “isn't the end state,” but rather a midpoint on the way to more advanced chatbots.
Pichai said the final version of the Gemini and Bard fusion will be an “amazing universal personal assistant,” integrating every aspect of people's daily lives, such as travel, work, and play.
He reiterated that Gemini will combine the strengths of text and images, saying the current AI chatbot will “seem insignificant” within a few years.
Compared to existing models, Gemini will improve the code generation capabilities of software developers. Google wants to use it to surpass$Microsoft(MSFT.US)$GitHub Copilot code helper.
TOB sales are the focus, and Google Cloud is making every effort to catch up with Microsoft Cloud
Google hopes to use Gemini to attract more users for its products, especially the cloud computing business.
Google plans to provide the Gemini model to enterprises through its Google Cloud Vertex AI service, and will release versions with different parameters.Disguise has boosted Google's cloud service business.
In May of this year, Google announced that it would provide Google Cloud customers with a Palm 2 LLM set through Vertex AI. Recently, Google also offered customers a one-month free trial of the Google Big Model through the coding platform startup Replit.