"Investor Network" Jordan.
In recent years, a global wave of large models has swept through the technology industry, driving a boom in startup investment and pushing traditional software vendors to adopt AI faster. Yet despite high expectations for this technological revolution, no large model product has emerged as a true "super app." The global technology spotlight remains focused primarily on OpenAI, which has built a new ecosystem and provided a flexible framework for commercial exploration. This has sparked a debate: does the surge of large models represent a technological leap, or is it just a short-lived bubble?
Baidu founder Robin Li shared his views at the Baidu World Conference on November 12, 2024. In his keynote speech, titled "The Application Is Here," he introduced two new AI technologies: iRAG, a retrieval-augmented text-to-image technology, and a no-code tool called Miaoda. Li pointed out that the biggest change in large model technology over the past two years has been the shift from theory to practice, making the technology more usable and trustworthy. He emphasized that the "hallucination" problem of large models has been eliminated and that the application of the technology is accelerating.
As one of the earliest technology companies to release large models, Baidu introduced its own large model product, Wenxin Yiyan, shortly after OpenAI launched ChatGPT, and has since made significant progress in practical applications. Recently, a report titled "2024 Global AI Ecosystem Panorama," published by Sullivan, placed Baidu alongside Google and OpenAI in the AI-Native Giants quadrant.
Surpassing 1.5 billion daily calls and extending the breakthrough from text to images.
As of early November 2024, daily calls to Baidu's Wenxin large model had exceeded 1.5 billion, up from the 50 million disclosed a year earlier, a 30-fold increase. Robin Li described this as "explosive growth" and said it reflects how quickly China's large model technology and its applications have matured over the past two years.
Over the past two years, large model technology has been evolving at an unprecedented pace. Despite these advances, the application of large models still faces some core challenges. The "hallucination" problem in particular has become the biggest bottleneck to widespread adoption. A hallucination is an inaccurate or completely incorrect response that a large model produces when generating content. Unless this problem is resolved, large models will struggle to become truly reliable intelligent assistants.
To address this challenge, Baidu has turned to Retrieval-Augmented Generation (RAG). By combining Baidu's massive internet data with its search engine, large models can generate more accurate content grounded in existing knowledge sources, thereby reducing hallucinations. Robin Li emphasized in his speech: "Eliminating hallucinations has become the key to the practical application of large models."
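For readers unfamiliar with the idea, retrieval-augmented generation can be illustrated with a toy sketch. This is not Baidu's implementation: the three-sentence corpus, the bag-of-words cosine ranking, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration. The sketch only shows the general pattern of fetching relevant text first and then asking the model to answer from that text.

```python
import math
from collections import Counter

# Hypothetical in-memory corpus standing in for a real search index.
DOCUMENTS = [
    "The Great Wall of China is a series of fortifications in northern China.",
    "Albert Einstein developed the theory of relativity.",
    "Retrieval-augmented generation grounds model output in retrieved documents.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble a prompt that asks the model to answer from retrieved text only."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("Who developed the theory of relativity?", DOCUMENTS))
```

In a production system the retrieval step would query a search engine or vector index rather than a Python list, and the assembled prompt would be sent to a large model; both are left out here.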
Text-based RAG has already made significant breakthroughs, and Baidu's Wenxin large model can effectively eliminate hallucinations in text generation. In image and multimodal generation, however, hallucination problems persist. Generated images may show obvious errors or unnatural details, especially when they depict specific scenes or people.
To address this issue, Baidu has extended retrieval augmentation to images and introduced iRAG (Image-based RAG). The new technology combines the vast collection of high-quality images in Baidu Search with powerful generative models to produce more realistic, natural-looking images. Robin Li showcased several images generated by the Wenxin large model, including one of a Volkswagen car leaping over the Great Wall. Unlike the output of conventional text-to-image systems, the image shows no deformities or errors in either the Great Wall background or the details of the car.
Li also presented another image generated with Wenxin iRAG, "Einstein traveling around the world," in which Einstein blends so seamlessly with the landmarks behind him that the image looks almost like a real photograph. Li stated: "The application of Wenxin iRAG technology not only solves hallucination problems in image generation, but also greatly enhances the realism and practicality of generated images."
The commercial potential of this technology is also immense. Li cited an example: in the past, creating a set of brand promotional posters could cost tens of thousands of dollars, whereas with iRAG the production cost is almost zero. In fields such as film and television production, comic creation, and advertising poster design, iRAG can significantly reduce production costs and speed up content creation.
Unlocking business value and opening the door to AI for more people.
Eliminating the "hallucinations" of large models is crucial for the widespread application of AI technology, and this solution is just the beginning. Robin Li pointed out that as large model capabilities keep improving, AI applications are on the verge of entering a "starry" era. He said Baidu is well prepared to extend large models into more fields, carry AI technology deep into every industry, and unlock its commercial value.
At the conference, Robin Li introduced Miaoda, Baidu's latest no-code development tool and a significant step in this direction. With Miaoda, users describe their requirements in plain natural language, without writing any code, and the tool automatically generates the application code to complete software development quickly.
Miaoda's main selling points are its low barrier to entry and its flexibility. Robin Li explained: "Users only need to describe their ideas, whether building an event registration system or developing a business management tool, everything can be easily accomplished." In the demonstration, Li used a meeting invitation as an example to show how multiple intelligent agents can collaborate on development in Miaoda. Driven by natural language commands, the agents work together on tasks such as system design, content creation, and program development, and can even identify and repair bugs in the system automatically.
As the technology continues to advance, Robin Li predicts that Miaoda will help more enterprises and individuals develop applications quickly and at low cost, and will even give everyone the capabilities of a programmer. This will help more innovative ideas come to fruition, ushering in an era where "one can make money just by having ideas." On November 12, Miaoda officially began accepting applications for enterprise testing, and more than 300 companies have already applied.
In addition, Robin Li showcased the Wenxin AI platform's top 100 intelligent agents and top 100 industry applications, stating that "Baidu is not launching a 'super app,' but aiming to help more individuals and companies create millions of 'super useful' applications." This signals that Baidu is not only driving technological innovation but also promoting the adoption of AI technology across many fields, unlocking broader commercial value.