
With the rise of ChatGPT, can generative AI reshape the underlying logic of tool software?

中金點睛 ·  Mar 3, 2023 15:55

Authors: Yu Zhonghai, Wang Zhihao, Wei Chuanfei, Han Rui, Hu Anqi, Tan Zhe Xian

Source: 中金点睛 (CICC Dianjing)

ChatGPT brings the industry a step closer to AGI (artificial general intelligence) and makes it feasible for general-purpose AI to empower application software; its combination with tool software leaves wide room for imagination. For application software vendors, the short-term cost of connecting to a large AI model is low and the long-term potential is large, so we observe that the vast majority of vendors are actively connecting to large models. At present, the capabilities of large AI models represented by ChatGPT lie mainly in human-computer interaction and content generation, and their combination with application software mainly covers AI + tool software, AI + search engines, AI + service applications, and AI + vertical industry applications. We believe generative AI is a natural fit for tool software, with broad downstream application scenarios and room for imagination.

Short-term dimension: integrating generative AI improves production efficiency and becomes the new focus of competition in tool software. At present, generative AI helps users improve production efficiency mainly by being embedded in existing tool software. Many vendors are already exploring and practicing in text (such as Notion AI), images (such as Stable Diffusion and Midjourney), video (such as Make-A-Video), 3D model creation, audio, and other fields. We believe that, in terms of product value, AI-integrated features may become an incremental payment point for tool software; in terms of competition, the rise of AI-native newcomers and the speed at which traditional vendors follow up with AI integration will change the existing landscape. However, as generative AI applications become widespread, AI integration may become "standard" for tool software, and the depth of application in AI-integrated scenarios will become the new focus of competition.

Long-term dimension: generative AI may reshape business logic and turn tool software from a production tool into a source of productivity. Ideally, we believe a true AGI will in the future be able to create without relying on commands and guidance from human users, and AI-empowered tool software may complete the shift from providing production tools to providing productivity. At that point, the providers of underlying AI capabilities and the tool software vendors will jointly take part in the distribution of production value. To better understand how AI reshapes business logic over the long term, we compare AGI with cloud computing in terms of industrial structure, business logic, competitive landscape, and value sharing. We believe that just as "going to the cloud" has become a "compulsory course" for application software today, "AI+" may likewise become standard for application software in the future and bring a new round of value release.

Risk

Technological progress falls short of expectations; the pace of commercialization falls short of expectations; industry competition intensifies.

Main text

AGI large models are drawing nearer, and generative AI deeply empowers tool software

ChatGPT is a step closer to AGI, making it possible for general-purpose AI to empower applications

ChatGPT has set off a global AI boom, and the road to AGI may be getting shorter. ChatGPT (Chat Generative Pre-trained Transformer) is an AI chatbot developed by OpenAI. It is based on the GPT-3.5 model and can handle relatively complex language tasks, including human-machine dialogue, automatic text generation, automatic summarization, and coding. It was launched in November 2022, and within two months its user count reached 100 million, setting off another round of AI enthusiasm around the world. ChatGPT's popularity has made the industry realize that AI has moved a step closer to AGI (artificial general intelligence), which in turn has prompted worldwide discussion and speculation about how AGI will reshape industries.

Application software vendors of all kinds around the world are actively embracing the new AI ecosystem represented by OpenAI. After the launch of ChatGPT, Microsoft plans to invest an additional US$10 billion in OpenAI and to explore integrated application scenarios in its search and office software. Given ChatGPT's large application potential, many application vendors worldwide have also begun connecting to OpenAI's technical interfaces, hoping that AI will produce new chemistry with their existing products. The domestic market has followed quickly. Baidu announced that Wenxin Yiyan, its product benchmarked against ChatGPT, would complete internal testing in March and open to the public. Hundreds of domestic companies have so far announced that they will connect to Wenxin Yiyan, including enterprise service software vendors such as Hande Information, Kingdee, and Yuxin. We also expect more large models to appear at home and abroad, attracting more application software vendors to enrich and expand the AI ecosystem.

For application software vendors, the short-term cost of connecting to a large AI model is low and the long-term potential is large. Large models such as ChatGPT are in the early stage of release, and exploration of their business models has only just begun. At this stage the focus is on building an ecosystem rather than on monetization, so OpenAI, Baidu, and other large-model vendors are all keeping their interfaces open to application software vendors in the short term. For application software vendors, this means the cost of connecting to AI is low for now, while the positive changes AI could bring to their product forms and business logic over the long run leave considerable room for imagination. We therefore observe that the vast majority of application software vendors are actively connecting to large-model capabilities, and the number of related applications is growing rapidly.

At present, the capabilities of large AI models represented by ChatGPT lie mainly in human-computer interaction and content generation, and their combination with application software mainly covers the following directions:

► AI + tool software: assists text, image, and video production. Integrating AI with authoring tools mainly draws on the generative capability of large language models (LLMs) such as ChatGPT, which can carry out text generation, image generation, video generation, and other assisted creative tasks based on the user's instructions and guidance. Typical examples at present include Notion AI and Office (which Microsoft plans to connect to ChatGPT) in the text category; Stable Diffusion (Stability AI), Midjourney, DALL-E (OpenAI), Imagen (Google), and Designs.ai in the image category; and Make-A-Video (Meta) and Lumen5 in the video category.

► AI + search engine: uses natural language processing to turn traditional click-through search into interactive question answering and generate personalized results. The combination of AI and search relies mainly on NLP-based conversational ability to help users solve problems in question-and-answer form and to generate personalized plans, suggestions, and analysis on the spot. A typical example is Microsoft's new Bing search engine, which added interactive chat and writing assistance after connecting to ChatGPT.

► AI + service applications: uses human-computer interaction to improve the self-service experience. The combination of AI and service applications mainly takes the form of self-service question-answering chatbots that draw on the conversational ability of LLMs. Typical examples are intelligent Q&A and help bots in service scenarios such as e-commerce, games, and maps.

► AI + vertical industry applications: integration with existing vertical industry applications. Typical examples include Yuxin, Hande Information, Kingdee, and Hang Seng Electronics, which integrate the interaction and generation capabilities of large models to achieve more efficient information retrieval and analysis and to form intelligent solutions. We believe that, in essence, these can also be classed under the three forms above. Going forward, vendors need to explore how to fit vertical scenarios better and train more deeply on industry corpora to achieve better results.

Chart: Main directions for integrating large AI models with application software

Source: company websites, China International Capital Corporation Research Department

Application software vendors' AI investment will focus more on exploring AI application scenarios and integrating AI with existing applications. Looking at the division of labor across the AI industry chain, we think large-model vendors will in future take on most of the development and optimization of underlying algorithms, while application software vendors will concentrate on exploring and cultivating application scenarios and on deeper integration with existing AI models. As for whether general-purpose AI can be industrialized, we think both advanced underlying models and well-matched upper-layer applications are indispensable, and AI vendors and application software vendors will develop a closer and more rational division of labor.

Among the application directions and scenarios above, we pay most attention to combining generative AI with tool software. In the applications of large models represented by ChatGPT, generation is the more outstanding capability. It is a natural fit for existing tool software (text creation tools, image creation tools, 3D model creation tools, and so on), and the downstream application scenarios are broad. In this report we therefore focus on how generative AI empowers tool software and on the possibility that it reshapes the underlying business logic and industry ecosystem of tool software over the long term.

What possibilities does generative AI create for tool software? In the short term, generative AI is mainly embedded in existing tool software as an innovative auxiliary feature that helps users improve productivity, and vendors can charge extra for it as a value-added service. In the long run, if generative AI can create proactively without depending on user guidance, it is expected to change from a production tool into a productive force and genuinely replace part of the work of "creators". Our attitude toward generative AI is therefore conservative in the short term, while not underestimating it in the long term.

Chart: Applications worldwide that have connected, or plan to connect, to language models such as OpenAI's and Wenxin Yiyan

Source: Tonghuashun Finance (同花顺财经), IT Home (IT之家), Xinmin Evening News, Sina Finance, China International Capital Corporation Research Department

Short-term dimension: integrating generative AI improves production efficiency and becomes the new focus of competition in tool software

At present, generative AI helps users improve production efficiency mainly by being embedded in existing tool software. Once generative AI is integrated, tool software can assist creation within the framework, instructions, and guidance set by the user, helping users cut down on repetitive, mechanical, rule-driven work and even taking on some creative work. For example, it can gather and summarize an existing corpus to draft text according to given guidelines, generate images and videos from text descriptions, and assist with parameter optimization in 3D model creation. Many vendors are already exploring and practicing across modalities such as text, 2D images, 3D models, audio, and video.

► Generative AI and text creation: overseas, vendors such as Notion have built in an AI writing assistant that automatically generates text for different scenarios from the user's description, and Microsoft also plans to bring ChatGPT capabilities into Office; domestically, products such as Kingsoft Office's WPS offer document proofreading, full-text translation, and assisted writing. Beyond consumer-facing applications, some vendors have developed AI-assisted text creation products specifically for enterprises. A typical example is 4Paradigm's Shishuo (式说), which combines large generative language models such as GPT with an enterprise's internal vertical-domain knowledge and supports private deployment, meeting enterprise requirements for industry knowledge, data security, and content credibility.

► Generative AI and image creation: many overseas companies have launched text-to-image products, including OpenAI's DALL-E 2, Stability AI's Stable Diffusion, and Midjourney. The workflow is broadly similar: the user enters keywords, the tool generates several AI images, and the user can then modify them and add details. Vendors differ in image style: DALL-E 2 leans realistic, Midjourney leans toward science fiction, and Stable Diffusion has no fixed style, so users can iterate with detailed instructions. Domestic vendors have followed, for example Tiangong Qiaohui from Kunlun Wanwei and the AI painting product from Wanxing Technology.

► Generative AI and audio creation: overseas, Google released AudioLM in October 2022, which generates audio in a similar style from an input audio clip, and in January 2023 launched MusicLM, which can generate music directly from text and images. Microsoft also released VALL-E in January 2023, which can imitate a person's speech and reproduce the speaker's emotion and tone. Other examples include Stability AI's Dance Diffusion and OpenAI's Jukebox. Some domestic vendors have followed suit, including iFLYTEK's dubbing product, Baidu's speech synthesis, and Tencent Zhiying.

► Generative AI and video creation: overseas, Meta's Make-A-Video generates video from text descriptions; Google's Imagen Video and Phenaki support video creation with different image-quality and length requirements respectively, and in early February 2023 Google released Dreamix, a new video editing method that can edit existing videos and generate videos from supplied images and descriptions; Runway has also launched the AI video generation model GEN-1. Domestic vendors are experimenting as well: Wanxing Technology's Wanxing Broadcast generates digital promotional videos from keywords; VidPress, incubated by Baidu, automatically produces video with dubbing, subtitles, and visuals from imported material; Danghong Technology offers AI products for image-quality enhancement; and SenseTime's Zhiying assists intelligent script creation.

► Generative AI and 3D model creation: 3D CAD products such as Creo, Autodesk Fusion 360, Solid Edge, and SolidWorks have widely integrated built-in AI capabilities, used mainly to assist parameter optimization and sketch generation. In EDA, overseas vendors such as Synopsys and Cadence have explored AI-assisted chip design, training models on existing design data to achieve higher design efficiency.

Integrating AI into tool software can improve user experience and production efficiency and make products more competitive. Whether the aim is to give users a sense of novelty or to raise their productivity, connecting to AI is a good way for tool software to become more attractive and competitive. Because the short-term trial-and-error cost of connecting to a large model is low, we expect most tool software vendors to be open to adopting these capabilities, and the industry ecosystem should grow rapidly.

Objectively, however, today's generative AI still has many shortcomings and serves mainly as an auxiliary production tool. Generative AI as represented by ChatGPT lacks training on industry-specific corpora, relies on a corpus that lags behind current events, and cannot guarantee correct logical reasoning. In the short term it therefore acts only as an auxiliary production tool and cannot produce and create on its own initiative. Users also need to watch for possible copyright disputes, sensitive information, and bias and discrimination when using it. We believe the integration of generative AI with application software is still in its infancy and has broad room for improvement.

What short-term impact will AI-empowered tool software have on the industry ecosystem and business landscape?

In terms of product value, AI-integrated features may become an incremental payment point for tool software. In the short term, tool software vendors can position AI-integrated features as differentiating functions and value-added services, charge users an increment for them, and thereby raise the ceiling on product payments. For example, Microsoft's Teams Premium, priced at US$10 per month, includes GPT-3.5-based features such as automatically generated meeting notes; Copilot, Microsoft's application for assisted code generation and editing, also carries an additional charge; Notion's AI enhancements are currently free during alpha testing, but the company says the official release will very likely be paid.

Chart: AI enhancements may become an incremental payment point for tool software, further raising the product revenue ceiling

Source: company websites, China International Capital Corporation Research Department

In terms of competition, the rise of AI-native newcomers and the speed at which traditional vendors follow up with AI integration will change the existing landscape. We regard AGI as a new technological revolution that may disrupt the traditional industry structure. By analogy with the cloud computing era, emerging SaaS vendors such as Salesforce seized the new "cloud" trend and rose rapidly to overtake established software vendors such as SAP, while the results of the cloud transformation at traditional vendors such as Oracle and Microsoft directly shaped how their market influence evolved. A number of AIGC-related unicorns are already growing quickly. In the coming era of AI-integrated applications, we think the emergence of AI-native vendors and the effectiveness of traditional vendors' AI transformation may change the existing competitive landscape.

Chart: AIGC-related unicorns are growing rapidly and may change the existing landscape

However, as generative AI applications become widespread, AI integration may become "standard" for tool software. Tool software vendors do not need to invest in developing large AI models and only need to focus on implementing and adapting AI-integrated applications, so the upfront cost is low. We therefore expect that if early movers achieve commercial success through AI integration, other industry participants will follow quickly, and AI integration may become "standard". In that case, we believe tool software vendors may no longer be able to charge separately for AI enhancements, and the difference between vendors will shift from "whether they have AI enhancements" to "whether they use AI well".

In the future, the depth of application in AI-integrated scenarios will become the new focus of competition among tool software vendors. Once AI-integrated applications are "standard", competition will center on finding the application scenarios best suited to AI and getting the most out of generative AI. Given that all vendors can connect to general-purpose large-model capabilities, we believe those that integrate AI more effectively with existing application scenarios and extract more value from it are likely to win the new round of competition, and entrenched competitive landscapes in some fields may be shaken or even overturned.

Long-term dimension: generative AI may reshape business logic and turn tool software from a production tool into a source of productivity

Ideally, AGI can upgrade production tools into productivity and reshape the underlying business logic of tool software. In the long run, integrating AGI (artificial general intelligence) into tool software offers even more room for imagination, and many in the industry liken AGI to a new "industrial revolution" or "technological singularity". Ideally, we believe a true AGI will be able to create without relying on commands and guidance from human users. Tool software that incorporates AGI's independent creative ability will then no longer be merely a "production tool" that helps human users work more efficiently, but will become an independent, incremental source of "productivity".

Once it becomes productivity, AI-empowered tool software should take part directly in the distribution of production value, which would be shared between the provider of the underlying AI capability and the tool software vendor. We believe that if AI-empowered tool software completes the shift from providing production tools to providing productivity, its business logic will no longer be to charge indirectly for the tool, but to take a direct share of the value produced. For example, for a book written entirely by AI-enabled text creation software, both the provider of the underlying general AI capability and the provider of the text authoring tool would be entitled to a share of the book's sales.

Chart: Generative AI upgrades production tools into productivity, bringing a qualitative change in business logic

Source: Business Digest, China International Capital Corporation Research Department
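
The value-sharing idea described above can be made concrete with a small arithmetic sketch. The following Python snippet is only an illustration under stated assumptions: the function name `split_production_value`, the party names, the sales figure, and the share percentages are all hypothetical and do not come from the report, which does not specify any split.

```python
from decimal import Decimal


def split_production_value(sales, shares):
    """Split sales revenue among parties according to fractional shares.

    `shares` maps a party name to its fraction of the value; the fractions
    must sum to exactly 1.
    """
    if sum(shares.values()) != Decimal("1"):
        raise ValueError("shares must sum to 1")
    return {party: sales * share for party, share in shares.items()}


# Hypothetical figures for illustration only; not taken from the report.
book_sales = Decimal("100000")
shares = {
    "rights_holder": Decimal("0.70"),
    "ai_model_provider": Decimal("0.20"),
    "tool_software_vendor": Decimal("0.10"),
}

payout = split_production_value(book_sales, shares)
for party, amount in payout.items():
    print(f"{party}: {amount}")
```

Under a usage-fee model the tool vendor's income would be fixed regardless of how well the book sells; in this sketch it scales with sales, which is the change in business logic the text describes.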

In the short term, downstream vendors that hold scarce AI application scenarios matter more; in the long run, bargaining power will shift to platform vendors that control the underlying general AI capabilities. In the early stage of AGI exploration, suitable downstream application scenarios are scarce, and the underlying general AI platform vendors want to connect to as many application vendors as possible to gain more opportunities to train large models in vertical scenarios. In the long run, however, because training large models is demanding in both technology and cost, and as AGI applications deepen, we think bargaining power may ultimately shift to a small number of platform vendors with underlying general AI capabilities, which are likely to take a larger share in value distribution. Whatever the final split, we believe the business logic of tool software vendors will have changed qualitatively in the process: they may take part directly in sharing production value.

Chart: Ideally, AGI changes the logic of value distribution for tool software

How can we better understand AI's reshaping of business logic from a long-term perspective? We compare it with the SaaS model that cloud computing brought about. We believe AI and cloud computing are both epoch-making technological changes. Cloud computing created SaaS, a new business model, and changed the competitive landscape of traditional enterprise service software. We therefore compare AGI with cloud computing in terms of industrial structure, business logic, competitive landscape, and value sharing, and discuss its possible business impact.

► From the perspective of industrial structure, the computing power, model, and AI-integrated application layers of AI correspond to IaaS, PaaS, and SaaS in cloud computing, respectively. We believe that, similar to the three-tier structure of cloud computing, training AI models requires strong underlying hardware support, so the computing power layer corresponds to the IaaS layer in cloud computing. Large AI models resemble basic software in that they serve general-purpose needs, and their interfaces are beginning to be billed by usage, so MaaS (Model-as-a-Service) corresponds to the PaaS layer. The application software at the top calls AI models to provide enterprises and consumers with vertical-scenario functions that integrate AI capabilities, which corresponds to SaaS software that delivers services on top of cloud infrastructure and platform capabilities.

Chart: The computing power, model, and AI-integrated application layers of AI correspond to IaaS, PaaS, and SaaS in cloud computing, respectively

► From the perspective of business logic, cloud computing shifted the industry from selling products to subscribing to services, and AGI is expected to shift it from paying for the use of production tools to productivity taking a direct share in value distribution. Cloud computing moved customers from a one-off buy-out of basic hardware and software to continuous payment for services provided by cloud vendors. For suppliers, subscriptions mean better cash flow and more sustainable revenue, as well as higher total customer payments. As discussed earlier, if AI-empowered tool software completes the shift from providing production tools to providing productivity, its business logic will move from charging for the use of tools to sharing directly in production value, which likewise means more sustainable revenue and a higher income ceiling for suppliers.

► From the perspective of the competitive landscape, the entry of new vendors and the degree to which traditional vendors adapt to new technology change the existing landscape. Taking database software as an example, changes in the market over the past decade were driven mainly by the entry of cloud vendors and cloud-native independent database vendors, and by how effectively traditional database companies transformed for the cloud. By analogy, we believe the entry of AI-native tool software vendors, together with the speed and ability of existing vendors to integrate AI, may also reshape market competition.

► From the perspective of value sharing, the underlying infrastructure vendors provide general capabilities, while the upper-layer application vendors focus on vertical scenarios. In the cloud computing industry chain, IaaS and PaaS vendors provide general software and hardware infrastructure capabilities, while SaaS vendors focus on vertical functional applications. By analogy, the underlying AI platform vendors provide general large-model capabilities, while the tool software vendors above them look for scenarios where AI empowerment can be applied and monetized. As for the computing cost AI requires, we believe AI vendors will bear the training cost, while subsequent inference costs will be shared by AI vendors and application software vendors (just as cloud customers lease cloud computing resources, the future AI industry will lease models and computing power).
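
The layer correspondence described above can be written down as a small lookup table. The following is a minimal Python sketch; the names `Layer`, `STACK`, and `cloud_analogue` are hypothetical and introduced here only for illustration, and the role descriptions paraphrase the text rather than define any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    ai_layer: str        # layer in the AI industry stack
    cloud_analogue: str  # corresponding cloud computing layer
    role: str            # what the layer contributes, per the text


# Ordered from the bottom of the stack to the top.
STACK = [
    Layer("computing power", "IaaS",
          "hardware support for training AI models"),
    Layer("large model (MaaS)", "PaaS",
          "general-purpose capability exposed through a usage-billed interface"),
    Layer("AI-integrated application", "SaaS",
          "vertical-scenario functions for enterprises and consumers"),
]


def cloud_analogue(ai_layer):
    """Return the cloud layer that corresponds to the given AI layer."""
    for layer in STACK:
        if layer.ai_layer == ai_layer:
            return layer.cloud_analogue
    raise KeyError(ai_layer)


for layer in STACK:
    print(f"{layer.ai_layer} -> {layer.cloud_analogue}: {layer.role}")
```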

Chart: In the long run, AI is expected to reshape the business logic of tool software as cloud computing did

"Going to the cloud" has become a "compulsory course" for application software, and we think "AI+" may likewise become standard for application software in the future. Supporting cloud deployment is now essentially a required capability for software vendors. Most software companies founded after 2010 chose a cloud-native technology route, while traditional software companies have actively moved to the cloud and shifted their business models to subscriptions. Looking at how application software is integrating AI, we think "AI+" is likely to become standard for the new generation of application software, and application software vendors will work out a new, mature business model as they explore and adjust alongside AI vendors.

After reshaping the business model, cloud computing drove a revaluation of applications, and AGI may bring a new round of value release in the future. By changing how software is developed, deployed, delivered, and charged for, cloud computing upgraded business models and business logic, which in turn led the capital market to revalue tool software and the application software industry as a whole. We believe that in the long run, tool software empowered by generative AI may bring a new round of value release. In the short term, however, current large models still have many defects, downstream applications and incremental payment scenarios are still being explored, and copyright, legal, and regulatory questions need further discussion. Our conjecture above therefore carries considerable uncertainty and needs continued tracking and observation.

To sum up, AI-integrated tool software has broad potential, but actual deployment still faces many challenges. We emphasize that it should not be overestimated in the short term or underestimated in the long term. Final implementation still depends on underlying computing power and on the evolution and iteration of large-model algorithms, and there are legal and ethical issues still to be discussed and resolved. We believe the future of AGI applications is bright but the road is tortuous, and we suggest that investors keep tracking the latest industry trends and pay attention to possible application scenarios for AI-integrated tools.

Chart: AIGC continues to make breakthroughs in key technologies, and AI-integrated tools have broad potential; we emphasize not overestimating them in the short term and not underestimating them in the long term.

Source: OpenAI official website, "Denoising Diffusion Probabilistic M"

Industry practice and application trends of generative AI-enabled tool software

Generative AI and text creation: ChatGPT is expected to accelerate the deployment of AI text creation

Generative AI can write, rewrite, correct, and translate in text creation scenarios. AI text creation tools can be trained on the extensive text data available on the Internet. Transformer-based large models are already relatively mature in natural language scenarios, so we think text creation is likely to be one of the fastest application scenarios for generative AI to reach deployment. We observe that Notion and Microsoft have begun integrating AI language models into note-taking and office software, and 4Paradigm has launched an AIGC tool for enterprise customers. Kingsoft Office, the leader in office software, is also expected to achieve AI empowerment and improve the efficiency of text creation in the medium to long term. We believe generative AI can deliver four major capabilities in text creation scenarios:

► writing: trained on a massive corpus, a Transformer neural network can understand language and generate text, so it can produce logically coherent, fact-rich passages from a user's simple instructions.

► rewriting: compared with ordinary language models, large language models have some reasoning ability and can form chains of thought to solve abstract problems, so they can rewrite text according to user requirements.

► correction: by learning and summarizing rules from massive text data, generative AI can correct spelling, grammar, punctuation, and other errors in a given text, bringing the revised text closer to common usage.

► translation: generative AI can use recurrent and convolutional neural networks to break down complex passages and translate them in context, greatly improving the completeness, accuracy, and readability of the translation.

Chart: Four capabilities of generative AI in text creation scenarios

Case 1: Notion AI optimizes text creation

Notion AI can generate rich text content from simple instructions. Notion AI is the artificial intelligence tool built into Notion's products; it integrates machine learning and NLP technology to help users create text more efficiently and with a better experience. Backed by a large language model, the product needs only a list of basic requirements from the user to generate rich text content automatically, covering scenarios such as meeting agendas, sales emails, and press releases. Notion AI also offers summarization, error correction, translation, continuation, brainstorming, and other functions. It will also become the interface to the Notion knowledge base: users enter a search request and Notion AI automatically presents the relevant information. We expect Notion AI's automatic text generation, summarization, and editing functions to greatly streamline the user's creative process and experience and to help Notion's product strength take a leap forward.

Case 2: Microsoft's plan to integrate AI into Office

With AI enabled, Microsoft Office is expected to improve its product experience. Microsoft invested US$1 billion in OpenAI in 2019 and has since deepened the partnership. Recently, Microsoft has been planning to integrate OpenAI's next-generation language model into Word, PowerPoint, Outlook, and other Office applications, so that users can obtain automatically generated text by entering simple instructions. The new version of Office is expected to offer automatic summarization, content suggestions, and text generation, providing an experience similar to the Bing-ChatGPT sidebar in which users interact with a chatbot.

A large user base and large volumes of training data are expected to help Office's AI capabilities iterate quickly. Office has a clear advantage in user scale (1.5 billion installations of the PC version worldwide in 2021). We believe that integrating OpenAI's technology with Office software, on the one hand, gives AI a high-quality deployment scenario; on the other hand, Office's huge user base is expected to supply AI with a steady stream of massive training data, creating a flywheel effect that continuously improves the AI text creation experience.

Case 3: Moli Table (模力表格) embeds AI text processing in spreadsheet scenarios

Moli Table uses a large AI model to perform "batch calculation" on the text content of a table. It was jointly developed by Mianbi Intelligence (ModelBest) and the large-model open source community OpenBMB (whose main members come from Tsinghua University). It embeds the text processing ability of the AI model into functions, so the model can be called by entering a function in the table. Currently supported functions include IE (information extraction), QA (question answering), MT (translation), SA (sentiment analysis), and TG (title generation), and these can be combined with basic Excel functions. We think that AI text processing inside a table enables batch text calculation and can greatly improve office efficiency.

Chart: Moli Table brings AI text processing to spreadsheet scenarios
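The idea of calling a model through table functions can be sketched as a small dispatch table. This is only an illustration under stated assumptions: `fake_model`, `call_cell_function`, and `apply_to_column` are hypothetical names, and the stub stands in for whatever model call the real product makes, which the source does not describe.

```python
from typing import Callable, Dict, List

# Task codes named in the text and the model tasks they stand for.
TASKS: Dict[str, str] = {
    "IE": "information extraction",
    "QA": "question answering",
    "MT": "translation",
    "SA": "sentiment analysis",
    "TG": "title generation",
}


def fake_model(task: str, text: str) -> str:
    """Hypothetical stand-in for a large-model call; it only echoes its input."""
    return f"[{task}] {text}"


def call_cell_function(name: str, cell_text: str,
                       model: Callable[[str, str], str] = fake_model) -> str:
    """Resolve a table function name to a model task and run it on one cell."""
    key = name.upper()
    if key not in TASKS:
        raise ValueError(f"unsupported function: {name}")
    return model(TASKS[key], cell_text)


def apply_to_column(name: str, column: List[str]) -> List[str]:
    """'Batch calculation': apply the same model-backed function to every cell."""
    return [call_cell_function(name, cell) for cell in column]


if __name__ == "__main__":
    for result in apply_to_column("SA", ["Great service", "Slow delivery"]):
        print(result)
```

Treating each task as an ordinary function is what lets it compose with normal spreadsheet formulas: the model call becomes one more cell-level operation that can be filled down a column.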

Source: OpenBMB open source community WeChat official account, CICC Research

Case 4: 4Paradigm addresses AIGC needs in enterprise scenarios

4Paradigm launched "Shishuo", an enterprise-level GPT product that helps enterprises use internal knowledge to solve problems. By combining GPT-like language models with vertical domain knowledge, 4Paradigm aims to overcome the limitations of large generative language models in internal enterprise scenarios and to meet enterprise AIGC needs. "Shishuo" focuses on three product features: 1) data security: private deployment addresses enterprise customers' concerns about data security; 2) content credibility: answers are based on the enterprise's internal database and are marked with the original source of the information, which makes them more credible and reliable; 3) controllable cost: computing cost is relatively controllable and the demand for data labeling is small. We believe AIGC tools such as "Shishuo" that serve B-end customers can help enterprises reuse knowledge and improve the efficiency of production and management.

Chart: 4Paradigm "Shishuo" product interface
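The "content credibility" feature, answering from an internal database and marking the source, can be illustrated with a toy lookup. This is a minimal sketch, not the product's method: the `Passage` type, the word-overlap scoring, and the file names are all hypothetical, and a real system would pair a proper retriever with a language model.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Passage:
    source: str  # hypothetical identifier of the internal document
    text: str


def tokenize(s: str) -> Set[str]:
    """Lowercase word set with basic punctuation stripped."""
    words = (w.strip(".,?!").lower() for w in s.split())
    return {w for w in words if w}


def answer_with_source(question: str, kb: List[Passage]) -> Optional[dict]:
    """Return the best-matching passage together with its source, or None."""
    q = tokenize(question)
    best, best_score = None, 0
    for passage in kb:
        score = len(q & tokenize(passage.text))  # naive word-overlap score
        if score > best_score:
            best, best_score = passage, score
    if best is None:
        return None  # refuse rather than answer without a source
    return {"answer": best.text, "source": best.source}


if __name__ == "__main__":
    kb = [
        Passage("hr-handbook.txt", "Annual leave is 10 days per year."),
        Passage("it-policy.txt", "Passwords must be rotated every 90 days."),
    ]
    print(answer_with_source("How many days of annual leave?", kb))
```

Returning `None` when nothing matches reflects the design goal described above: an answer is only given when it can be traced back to an internal document.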

Case 5: Emotibot (Zhujian Intelligence) uses AIGC to enable writing, dialogue, knowledge search, and other scenarios

Emotibot has launched ChatGPT-like products that enable enterprise-level AIGC applications. Founded in 2015, the company provides AI solutions for finance, enterprises, health care, manufacturing, smart devices, and government affairs. In September 2022, the company launched AI SaaS products covering customer service, sales, internal services, and other scenarios, providing cloud AI tools for small and medium-sized enterprises. The company has also kept investing in AIGC: it previously launched several intelligent creative writing products such as Magic Writer, and it recently launched an enterprise-level Gemini GPT product series, including the enterprise dialogue robot KKBot and the interactive cognitive search engine ChatSearch, which use AI to support sales and customer service, human-computer interaction, knowledge exploration, and more.

Case 6: Yinxiang Biji (Evernote China) assists text creation with its self-developed lightweight large model

Yinxiang Biji launched the "Yinxiang AI" generative text tool based on its self-developed "Elephant GPT" model. Since 2019, Yinxiang Biji, a domestic note-taking application vendor, has been exploring AI applications for text processing in notes and has launched AI tools such as intelligent recommendation, smart tags, intelligent summaries, and a knowledge star map. It has also kept investing in large-model research and development: in 2023 it launched "Elephant GPT", a large language model it developed independently, drawing on large language models with GPT-3.5-like architectures such as OPT and BLOOM. On this basis it launched the "Yinxiang AI" generative text module embedded in its own note-taking products, the first case of a domestic vendor delivering AI text creation with a self-developed model. Yinxiang Biji plans to optimize the model with reinforcement learning from human feedback (RLHF) and to combine it with private corpora to enable writing in a personal style.

Case 7: MiniMax opens up a new C-end deployment scenario

Unlike ChatGPT, which focuses on professional knowledge Q&A, MiniMax's Glow centers on chat and social features. Founded at the end of 2021, the company has developed general-purpose large models for three modalities: text-to-vision, text-to-voice, and text-to-text. In November 2022, MiniMax launched its first AI dialogue robot platform, Glow, in which users can talk to existing agents or create agents from a brief description and refine them in later conversations. The agents' dialogue generation, portrait generation, and voice generation call on MiniMax's three modality models. ChatGPT leans toward question answering, search, and text generation; the agents created in Glow, by contrast, have distinct backgrounds and personalities, and their conversations with users lean toward companionship, emotional interaction, and role-play of storylines. We believe MiniMax's chatbots interact well with users and have strong user stickiness, thereby opening up a new C-end deployment scenario.

Case 8: Potential AI application scenarios for Kingsoft Office

Kingsoft Office has a solid footing in AI. As the leader in domestic office software, Kingsoft Office has a broad technology and business presence in AI fields such as computer vision, natural language processing, and speech processing. The company began building its AI center in 2017 and has developed nearly 100 AI capabilities for office scenarios. In natural language processing, Kingsoft Office has developed an assisted writing function: users provide an outline, the AI automatically generates text from a corpus-based algorithm, and users can use the generated text as a draft, which greatly improves writing efficiency. Kingsoft Office has also implemented AI proofreading, translation, error correction, and other functions, and regards them as important incremental features of the WPS Office suite.

We expect Kingsoft Office to follow AI industry trends and to enter at the right time. We expect the company to focus on the AI application side. Its existing product, WPS, has accumulated a large user base with diverse and highly complex usage scenarios. We believe that if Kingsoft Office digs deeply into these scenarios, it can provide AI text creation services for email, office work, marketing, government affairs, literature, and other segments, improving the user experience and deepening its product moat. Going forward, we expect the company to connect to a large model at an appropriate time, after fully weighing the capabilities of domestic AI model vendors, so as to make the most of large models' potential in office software.

Generative AI and audio generation: cross-modal applications enter the audio industry

Overseas case 1: Several Google teams have published audio generation research

Google released several audio generation models in 2023, each with its own characteristics. There were earlier attempts at AI music creation, such as the visual music creation model Riffusion, Google's AudioLM, and OpenAI's Jukebox. The current research builds on diffusion models and labeled audio data, extracting data features and pairing text with audio to generate audio from text.

► MusicLM: a model that generates high-fidelity music from text descriptions, for example "calm violin melodies with distorted guitar improvisation". MusicLM casts conditional music generation as a hierarchical sequence-to-sequence modeling task and can generate several minutes of music at 24 kHz, outperforming previous models in both adherence to the text description and audio quality. MusicLM can also transform an existing melody according to a text description, and can generate matching music from a painting and its text description.

► Noise2Music: applies a series of diffusion models in succession to generate 24 kHz audio clips. To build the training set, two deep models pseudo-label a large audio dataset: a large language model generates descriptive text about music, and a pre-trained joint music-text embedding model assigns the matching text to each audio clip through zero-shot classification. Noise2Music can understand fairly complex prompt semantics and generate different styles, such as "an alto sings a slow jazz ballad in a live performance", or imitate different instruments such as piano, saxophone, and African drums.

► SingSong: this model automatically generates accompaniment for a human voice, building on source separation of vocals and on audio generation. Users only need to input their singing to get a matching instrumental accompaniment. To evaluate the model, the researchers had a group of listeners compare pairs of 10-second accompaniments for the same vocals, and SingSong received significantly better feedback than the baseline models.
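The zero-shot pseudo-labeling step described for Noise2Music, assigning a candidate caption to each audio clip through a joint music-text embedding, can be sketched as a nearest-caption search by cosine similarity. The vectors below are made-up toy embeddings and the function names are hypothetical; a real pipeline would obtain embeddings from a pre-trained joint model.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pseudo_label(audio_emb: List[float],
                 caption_embs: Dict[str, List[float]]) -> str:
    """Pick the caption whose embedding is closest to the audio embedding."""
    return max(caption_embs, key=lambda c: cosine(audio_emb, caption_embs[c]))


if __name__ == "__main__":
    # Toy embeddings, invented for illustration only.
    captions = {
        "slow jazz ballad": [1.0, 0.0, 0.2],
        "fast techno": [0.0, 1.0, 0.1],
    }
    clip = [0.9, 0.1, 0.3]
    print(pseudo_label(clip, captions))
```

Because no classifier is trained for the caption set, new candidate captions can be added simply by embedding them, which is what makes the labeling "zero-shot".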

Overseas case 2: UK academic institutions propose AudioLDM to improve quality and reduce computing cost

The AudioLDM model addresses the limited quality and high computational cost seen in "text-to-audio" research. The University of Surrey and Imperial College London jointly released and open-sourced AudioLDM, a framework based on latent diffusion models and contrastive learning. The model improves the quality of audio generated from text; its diffusion model can be trained on audio data alone and still match or exceed training on audio-text pairs; training consumes relatively little computing resource; and sound styles can be transferred or imitated without extra training.

Domestic case 1: iFLYTEK launches a new training framework to improve speech prosody

iFLYTEK launched the SMART-TTS framework and has deployed it on the iFLYTEK Open Platform, iFLYTEK Audio, and Xuexi Qiangguo. Rather than directly learning the mapping between text and audio features, SMART-TTS breaks the speech synthesis learning process into modules and strengthens each module through pre-training. The framework offers 11 emotions, such as "happy, apologetic, sad", each adjustable across 20 intensity levels, and also supports control of pauses, stress, and speaking rate, so that a digital human's voice can convey the feelings of a real speaker. In addition, iFLYTEK's speech synthesis supports 37 languages, 11 dialects, and 2 ethnic minority languages, as well as natural mixed Chinese-English synthesis.

Domestic case 2: Unisound (Yunzhisheng), a domestic AI voice generation "unicorn"

Besides text-to-music generation, speech synthesis is another important direction in audio generation. The domestic "unicorn" Unisound provides speech synthesis products and services, including text-to-speech synthesis, voice library customization, and voice cloning. Speech synthesis converts text into natural, fluent speech, offers a range of timbres and emotions, and allows volume, speed, and pitch to be adjusted. Voice library customization mainly targets enterprise customers, using deep learning to build a customized voice library and generate an exclusive IP voice. Voice cloning can quickly produce a voice model with similar timbre and speaking style from a small number of recordings of the user. These functions suit intelligent customer service, smart hardware, news broadcasting, self-media dubbing, and other audio scenarios.

Generative AI and image creation: cross-modal generation opens up rich possibilities

In 2022, the arrival and open-sourcing of CLIP and diffusion models further advanced the deployment of DALL·E 2 and Stable Diffusion, and cross-modal generation such as text-to-image became the main line of AIGC deployment. With a large-model foundation, large volumes of paired image-text data in open source datasets, computing power from leading vendors, and a lower barrier to entry, OpenAI released DALL·E 2, an upgraded "text-to-image" model, which brought AI painting (cross-modal text-to-image generation) into practical use and set off a wave of AI painting. In August 2022, Stability AI open-sourced the Stable Diffusion model, which markedly lowered the barrier to cross-modal AIGC applications in AI painting and opened an era of "industrialized production" in which anyone can create. On this basis, the overseas application layer has produced fine-tuned models and plug-ins such as Midjourney, ChilloutMix, and ControlNet, which keep improving the quality of generated images and are gradually advancing the commercialization of AI image creation.

Overseas case 1: DALL·E and DALL·E 2, pioneers of text-to-image generation

DALL·E was first launched by OpenAI, which began commercializing the technology through Azure OpenAI services in 2021; the upgraded DALL·E 2 was released in April 2022. Using CLIP, the image-text matching model OpenAI released in 2021, DALL·E 2 can connect text with visual images. Using GLIDE, a diffusion-based image generation model, DALL·E 2 can generate realistic images from text, with 4 times the resolution, higher accuracy, and broader commercial use. It has three functions: 1) generating an image from a text prompt, 2) generating a new image from a given image, and 3) editing image elements with text.

DALL·E 2 currently uses a pay-per-use business model: after joining the Open Beta program, a user gets 50 free credits in the first month, each credit corresponding to one generation, and then 15 free credits in each following month. Additional credits currently cost $15 for 115 credits. Compared with DALL·E, DALL·E 2 generates more realistic and accurate images, expresses scenes more completely, and can edit existing images from natural language descriptions. Compared with other models in this field, DALL·E 2 offers higher controllability, excellent handling of spatial relationships, and strong image realism. Its technology is mature, and it was the first to bring AI painting from imagination into reality. In July 2022, DALL·E 2 opened an invitation-based public beta, which was an important driver of the surge of interest in AIGC in 2022.
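The pricing quoted above (50 free credits in the first month, 15 in each later month, and 115 credits for $15) can be turned into a quick cost estimate. This is a back-of-the-envelope sketch: the function names are hypothetical, and it assumes for simplicity that unused free credits accumulate and that paid credits are bought in whole packs, neither of which the text states.

```python
PACK_PRICE_USD = 15.0    # price of one paid pack, from the text
PACK_CREDITS = 115       # credits per paid pack, from the text
FREE_FIRST_MONTH = 50    # free credits in the first month
FREE_LATER_MONTH = 15    # free credits in each following month


def cost_per_credit() -> float:
    """Average price of one paid credit in USD."""
    return PACK_PRICE_USD / PACK_CREDITS


def free_credits(months: int) -> int:
    """Total free credits after `months` months (assumes they accumulate)."""
    if months <= 0:
        return 0
    return FREE_FIRST_MONTH + FREE_LATER_MONTH * (months - 1)


def paid_cost(credits_needed: int, months: int) -> float:
    """USD spent on whole packs to cover usage beyond the free allowance."""
    shortfall = max(0, credits_needed - free_credits(months))
    packs = -(-shortfall // PACK_CREDITS)  # ceiling division
    return packs * PACK_PRICE_USD


if __name__ == "__main__":
    print(f"cost per paid credit: ${cost_per_credit():.4f}")
    print(f"free credits over 3 months: {free_credits(3)}")
    print(f"cost of 200 generations in month 1: ${paid_cost(200, 1):.2f}")
```

At these prices a paid generation costs roughly 13 cents, which gives a sense of the scale of per-use pricing for the product.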

Overseas case 2: Stability AI open-sources Stable Diffusion, bringing AI painting into the mainstream

Stability AI was founded in 2020. In 2022 it launched and open-sourced Stable Diffusion, reached a post-investment valuation of more than $1 billion, and became a unicorn at the seed financing stage. Stable Diffusion is based mainly on the latent diffusion model (Latent Diffusion Model): it generates images by iteratively "denoising" the input and then decoding the output, and it reduces the spatial dimensions being processed to relieve the memory and inference pain points. This lets users quickly generate high-resolution, high-definition images on consumer-grade graphics cards, and it has established an open source ecosystem that greatly lowers the barrier for users. The open source ecosystem has thus begun to solve AIGC's data, model, and computing power problems, directly lowering the barrier to use and spreading into many vertical fields.
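The effect of working in a reduced latent space can be seen with simple arithmetic. The sketch below assumes the figures commonly cited for Stable Diffusion v1 (a downsampling factor of 8 per spatial side and 4 latent channels); these numbers are not given in the text and are used here only for illustration.

```python
from typing import Tuple


def tensor_elements(height: int, width: int, channels: int) -> int:
    """Number of values in a height x width x channels tensor."""
    return height * width * channels


def latent_shape(height: int, width: int,
                 downsample: int = 8, latent_channels: int = 4) -> Tuple[int, int, int]:
    """Shape of the latent that the diffusion model denoises (assumed factors)."""
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (height // downsample, width // downsample, latent_channels)


def compression_ratio(height: int, width: int,
                      downsample: int = 8, latent_channels: int = 4) -> float:
    """How many times fewer values are denoised in latent space than in RGB pixels."""
    lat_h, lat_w, lat_c = latent_shape(height, width, downsample, latent_channels)
    return tensor_elements(height, width, 3) / tensor_elements(lat_h, lat_w, lat_c)


if __name__ == "__main__":
    print(latent_shape(512, 512))
    print(compression_ratio(512, 512))
```

Under these assumptions, each denoising step on a 512 x 512 image operates on 48 times fewer values than it would in pixel space, which is the intuition behind running the model on consumer-grade graphics cards.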

Overseas case 3: Midjourney, a phenomenal AI image application with a proven business model

Midjourney has built a closed-source "text-to-image" model based on CLIP and diffusion, and has reached 10 million users and more than $100 million in revenue. The product lives in the Discord community: users invite the Midjourney bot to a channel and type a prompt beginning with "/imagine" to generate the image they want. Midjourney has more than 10 million community members and collects feedback from users' choices among generated results, building a large and unique dataset that forms a competitive barrier. Midjourney needs only short prompts and produces high-quality images with a sci-fi flavor, and it is popular with designers, Web3 and NFT practitioners, and individual users. It already makes a profit through a paid SaaS business model.

Compared with the overseas frontier, domestic AI image creation is at a relatively early stage, but it has made some progress, and a number of innovative products and technologies have emerged. Products such as Baidu's Wenxin Yige and Wondershare's Wanxing Aihua not only demonstrate domestic capability in AI painting but also introduce innovations such as "AI sketch drawing", which broadens the ways users can interact during creation and improves their efficiency and experience.

Domestic case 1: Baidu builds on the Wenxin large model, with AI painting capability that benchmarks against overseas products

Wenxin Yige is Baidu's first AI painting product, built on PaddlePaddle and the Wenxin large model. The product generates images from text in more than ten styles, such as traditional Chinese style, oil painting, watercolor, gouache, animation, and realism, giving professional content creators a creative platform and giving entry-level and general users a way to realize their ideas. To meet the three challenges of deployment (understanding creative needs, generating original images, and satisfying creative needs), Wenxin Yige introduced three technical innovations: knowledge-based prompt learning, deep cross-modal fusion of text and image, and text-driven image editing. Together these provide creative planning, detailed depiction, and multi-round interaction that improves quality.

Domestic case 2: Wondershare (Wanxing Technology) cultivates AIGC painting, a benchmark case of OpenAI empowering a domestic vendor

Wondershare has operated overseas for 20 years and has connected to OpenAI's API to create Wanxing Aihua, a new tool for the creative drawing field. Wanxing Aihua is positioned for professional creation, "AI generating high-quality works of art", and offers two AI painting modes: random generation and keyword-based creation. Users enter keywords and choose an aspect ratio and art style, and an AI-generated painting is ready in 30 seconds. Works can be produced in a variety of art styles, such as hand-drawn, cyberpunk, Q-style (chibi), and CG digital rendering. The product supports creation in both Chinese and English, and keywords can be emphasized with exclamation marks and parentheses.

In February 2023, Wanxing Aihua was the first in the industry to launch "AI sketch drawing", making it the world's first AI painting software to generate images from images through user interaction and marking a new stage for Wanxing Aihua and for AI painting. Compared with earlier methods, sketch drawing demands less prompting from users: a few strokes are now enough to generate a high-quality art painting in 5 seconds, and users' feedback through image selection can be used to iterate and upgrade the model. With sketch-to-image generation, users feel more involved in the creation and the process is more fun.

Chart: Wanxing Aihua "AI painting" creation interface


Generative AI and video creation: cross-modal generation is still at an early stage and is expected to raise the application ceiling

Benchmark cases from overseas technology giants open up the potential of AI video creation. In September 2022, Meta released Make-A-Video, which generates video from text and can produce a short video in seconds from a few words or sentences. Only a week later, Google released Imagen Video and Phenaki, which target high-quality video and long video, respectively. Cross-modal video generation in AIGC still has shortcomings: AI-generated video shows obvious flaws such as blurred and distorted objects, and it cannot yet generate longer scenes that tell a story in detail and coherently. Nevertheless, we believe AIGC video generation is likely to achieve technical breakthroughs and raise the application ceiling.

Case 1: Make-A-Video achieves cross-modal generation from text to video

Make-A-Video can generate video from text. It is a further upgrade of Make-A-Scene, the text-to-image model Meta released in July 2022. Entering text into Make-A-Video produces a video a few seconds long, in a range of video styles. Besides text-to-video, Make-A-Video can also take one or two images as input and add motion, that is, image-to-video generation.

Case 2: Google keeps producing results in cross-modal video generation

Google works on both text-to-video and image-to-video generation. A week after Meta launched Make-A-Video, Google launched Imagen Video and Phenaki: Imagen Video has higher picture quality but generates shorter videos, while Phenaki's picture quality is poorer but it can generate videos longer than 2 minutes. In November 2022, Google released a video that combined the two for the first time, balancing quality and length. On February 2, 2023, Google proposed Dreamix, a new video editing method that can edit existing videos and generate videos from supplied images and descriptions.

Case 3: Runway's GEN-1 model improves the quality of generated video

The GEN-1 model generates video in diverse styles. Runway was founded in 2018 and is one of the co-developers of Stable Diffusion. In February 2023, Runway launched GEN-1, an AI video generation model that synthesizes a new video by applying the composition and style of an image or text prompt to the structure of a source video, a step forward in the quality and length of generated video.

Domestic vendors: also at an early stage of exploration, helping to improve creative efficiency

Domestic vendors are likewise at an early stage of exploration in video generation. Their use of AIGC technology in video focuses more on video content creation and quality enhancement, enabling changes to video attributes and "assembly-line" content creation. At present it is mostly used on the B-end to improve production efficiency for content creators.

► text-to-video generation: in May 2022, Tsinghua University, together with the Beijing Academy of Artificial Intelligence (Zhiyuan Research Institute), released the Transformer-based CogVideo model. It is the industry's first open source text-to-video AI model, but the generated video has low resolution and limited length, and the model currently supports only Chinese input.

► image quality enhancement and restoration: Donghong Technology has relatively mature picture quality enhancement products, covering video frame interpolation, video detail enhancement, video picture quality enhancement, and restoration and colorization of old footage.

► automatic video creation: VidPress, an intelligent video creation tool incubated by Baidu, automatically produces video content with dubbing, subtitles, and pictures once a link to an illustrated article is imported. It already provides intelligent video generation for People's Daily and other media organizations, and for users of the Baijiahao and Haokan Video platforms.

► intelligent script creation: "Video Element Analysis", launched by SenseTime's Zhiying, can extract and analyze a variety of elements in a video, such as characters, scenes, props, and lines, automatically generate shot-by-shot scripts with 98% accuracy, and extract the style elements of popular videos, which effectively reduces script writing time and helps advertisers save content production costs.

Because of the limited maturity of the technology, video created independently by AI cannot yet be monetized directly on the B-end, but AI is already contributing to assisted commercial creation. On January 31, 2023, Netflix, rinna (Japan), and WIT STUDIO jointly released "The Dog & The Boy", the first release-grade animated short made with the help of AIGC technology. The animation is just over 3 minutes long and used AIGC to complete part of the scene rendering. This shows that AI technology has begun to reach commercial use in assisting video creation, but there is still a long way to go before it is applied to large-scale projects and monetized commercially.

In addition, vendors that have entered vertical fields on the basis of self-developed sparse models have built multimodal product matrices. Take Mobvoi (Chumen Wenwen) as an example: it has built a multimodal AIGC product matrix spanning text, image, voice, video, and digital humans, offering one-stop content generation tools. After launching its first commercial AIGC product, the dubbing platform "Magic Voice Workshop", in 2020, Mobvoi has expanded across AI voice, AI writing, AI image generation, voice and image cloning, digital human video, and other AIGC areas, with multiple product lines serving a wide range of business scenarios.

Generative AI and 3D Model creation: based on Parametric Modeling, GPT word processing enabling

The 3D modeling of industrial scene requires high AI ability, and the generative design can not be fully supported at this stage. Different from the creation of pictures and videos, 3D models are mainly used in the production of industrial scenes, which requires more rigorous and rational modeling and creative ability. at present, AI tools such as ChatGPT are lack of mathematical and logical capabilities, so the progress of direct modeling of generative AI through text description is relatively slow. On the other hand, the design of large assembly scenes such as aircraft and ship models requires very rigorous processes and parameters, and we think that the support capacity of generative AI design in such large-scale scenarios is limited. At present, we observe that the main landing of AI in the field of 3D CAD and EDA is still "AI Inside" enabling.

Generative Design in 3D CAD: AI Inside Enablement based on Parametric Modeling

Generative design in 3D CAD scenarios mainly uses AI to generate a large number of candidate models to choose from. According to PTC's official website, generative design for 3D models starts from the constraints (including space, material, manufacturing method, and cost) and objectives given by the designer, and uses AI to quickly generate models that meet those requirements. The designer then selects suitable models for further design and optimization, which significantly improves design efficiency. We observe that current AI applications in 3D CAD fall into two main categories:

► AI-assisted parameter optimization: usually used when refining a 3D CAD model. Based on CAE simulation results (such as excessive stress or obvious deformation in some parts), the tool generates a large number of candidate parameters for the parts to be optimized, filters them by applying constraints from the other parts, and finally arrives at an optimized result.

► AI-assisted sketch generation: for example, the xDesign module of Catia and Solidworks introduces an AI-aided sketch creation function that recommends shapes for the given parameters and materials. To some extent, this helps engineers lay out the underlying geometry and speeds up overall design progress.

Generative design in 3D CAD is built on parametric modeling, which has a long history. In 1987, Pro/E, released by PTC, introduced history-based parametric modeling for the first time, and today all mainstream 3D CAD products offer parametric modeling. Whether it is AI-assisted parameter optimization or sketch generation, the essence is the same: generate a large number of parameter sets from the given constraints, then generate design options from those parameters for the designer to choose from. At present, mainstream 3D CAD products such as Catia, NX, Pro/E, Solidworks, and SolidEdge all include AI modules that provide assisted design functions.
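
The loop described above (sample parameter sets, discard those that violate the constraints, rank the rest for the designer) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any commercial CAD product implements generative design: the rectangular beam, the `BeamDesign` class, the parameter ranges, and the load and limit values are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BeamDesign:
    """Hypothetical parametric model: a rectangular beam cross-section."""
    width_mm: float
    height_mm: float

    @property
    def area_mm2(self) -> float:
        return self.width_mm * self.height_mm

    def max_bending_stress_mpa(self, moment_nmm: float) -> float:
        # Rectangular section: sigma = M / Z, with Z = b * h^2 / 6.
        section_modulus = self.width_mm * self.height_mm ** 2 / 6.0
        return moment_nmm / section_modulus


def generate_candidates(n, moment_nmm, stress_limit_mpa, max_area_mm2, seed=0):
    """Sample n parameter sets and keep those that satisfy every constraint."""
    rng = random.Random(seed)
    feasible = []
    for _ in range(n):
        design = BeamDesign(rng.uniform(5, 50), rng.uniform(5, 100))
        stress_ok = design.max_bending_stress_mpa(moment_nmm) <= stress_limit_mpa
        area_ok = design.area_mm2 <= max_area_mm2
        if stress_ok and area_ok:
            feasible.append(design)
    # Objective: least material first, so the designer sees light designs on top.
    feasible.sort(key=lambda d: d.area_mm2)
    return feasible


# Illustrative constraint values (assumptions, not taken from the article).
candidates = generate_candidates(
    2000, moment_nmm=500_000, stress_limit_mpa=150, max_area_mm2=2000
)
for design in candidates[:3]:
    print(f"{design.width_mm:.1f} x {design.height_mm:.1f} mm, "
          f"area {design.area_mm2:.0f} mm^2")
```

Real tools replace the uniform sampling with smarter search and the closed-form stress formula with CAE simulation, but the structure of constraints in, ranked candidates out stays the same.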

AI Inside in EDA: design efficiency optimization based on existing design data

AI enablement is expected to help chip design achieve true "automation". Even in the relatively automated digital chip design flow, current EDA tools still require a large amount of manual work from designers. We believe the higher degree of automation brought by AI can reduce repetitive work in the design process and further free up designer productivity. AI's empowerment of EDA design tools can currently be divided into two levels, AI Inside and AI Outside. AI Inside generally refers to AI enhancing the design software itself, making design tools more intelligent and efficient. AI Outside, by contrast, lets machines accumulate experience through learning so that, to a certain extent, they can replace manual work and become a new form of "productivity".

The back end of chip design (especially placement and routing) is the main application scenario for AI Inside in EDA. In the digital chip design flow, placement and routing is the most important back-end step. It determines the physical shape and placement of logic devices, and engineers need to weigh multiple factors such as grid nodes, grid granularity, and wiring density. Placement and routing is therefore usually a time-consuming step in digital chip design, and AI-based image recognition and optimization algorithms are expected to improve its efficiency significantly. At present, leading overseas EDA vendors such as Cadence and Synopsys already offer AI Inside capabilities for chip design:

► Cadence: in March 2020, Cadence released an updated version of its digital full-flow tools, which uses iSpatial technology to integrate the placement and routing tool Innovus with the front-end synthesis tool Genus, and incorporates machine learning. Users can train iSpatial on existing design data to minimize design margins during placement and routing.

► Synopsys: Synopsys released DSO.ai, its AI application for EDA, in 2020. According to the company's website, Design Space Optimization (DSO) uses machine learning algorithms to search large design spaces, optimizing the input parameters and choices of the chip design workflow to meet the exact needs of a specific project.[1] We think it is essentially similar to the parameter optimization function in 3D CAD model design.
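
To make the idea of searching a design space over flow parameters concrete, here is a minimal sketch. It is not DSO.ai's method: plain random search stands in for the machine learning search, and the parameter names in `SEARCH_SPACE` and the `toy_flow_cost` function (a stand-in for an actual place-and-route run) are hypothetical.

```python
import random

# Hypothetical tool settings; real flows expose far more knobs than this.
SEARCH_SPACE = {
    "placement_density": [0.55, 0.65, 0.75, 0.85],
    "optimization_effort": ["low", "medium", "high"],
    "max_fanout": [16, 32, 64],
}


def toy_flow_cost(params):
    """Stand-in for a full place-and-route run; lower scores are better."""
    effort_penalty = {"low": 3.0, "medium": 1.5, "high": 1.0}[
        params["optimization_effort"]
    ]
    density_penalty = abs(params["placement_density"] - 0.75) * 10
    fanout_penalty = abs(params["max_fanout"] - 32) / 16
    return effort_penalty + density_penalty + fanout_penalty


def random_search(space, cost_fn, trials, seed=0):
    """Sample parameter combinations at random and keep the cheapest one."""
    rng = random.Random(seed)
    best_params, best_cost = None, float("inf")
    for _ in range(trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        cost = cost_fn(params)
        if cost < best_cost:
            best_params, best_cost = params, cost
    return best_params, best_cost


best_params, best_cost = random_search(SEARCH_SPACE, toy_flow_cost, trials=200)
print(best_params, best_cost)
```

In practice each evaluation is an expensive tool run, which is why a learned search policy that needs fewer trials than random sampling is valuable.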

Looking ahead, AI Outside is expected to deliver true "chip design automation" at a higher level. Unlike AI Inside, which enhances the EDA tools themselves, AI Outside focuses on the tool user: EDA tools learn human design patterns and accumulate design experience, thereby reducing manual intervention and freeing up productivity. Both Synopsys and Cadence have explored design automation through AI Outside. We think the main obstacle to AI Outside at this stage is the cost of data acquisition. Training AI Outside requires highly reliable chip data, but chip design companies' data is difficult to obtain. We think EDA companies may move gradually towards AI Outside by relying on their close ties with wafer fabs.

Fusing generative design with GPT models: a potential path from text to model

A vision for fusing generative design with GPT models: turning text descriptions into parameters. We believe large models such as GPT still have substantial room for application in 3D model design. A potential future direction is to use ChatGPT's text-processing ability to understand designers' requirements expressed in text, that is, to interpret a text description, convert it into a set of model parameters, and then obtain the corresponding design through 3D CAD generative design.

► Generative design is an existing technology reserve. Generative design of 3D models can already perform parameter optimization and sketch generation. We think that, as the technology matures, the step from given parameters to a generated 3D model is unlikely to be the bottleneck on the path from text to model.

► Converting text to parameters is the biggest difficulty in text-to-model generation. Current Transformer models are better suited to natural language processing scenarios, and we think converting text into the parameters a designer needs remains difficult. Breaking through the bottleneck between text description and parameter description is expected to pave the way from text to model. In 2021, a DeepMind paper discussed the possibility of connecting graphics and sequences, and achieved CAD sketch generation using the natural language processing ability of the Transformer model.
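
The text-to-parameter step can be illustrated with a deliberately simple stand-in. The sketch below uses a regular expression instead of a language model, so it only handles rigid phrasings. The parameter names, units table, and example sentence are assumptions made for illustration. What it does show is the target output: a structured parameter set that a parametric CAD model could consume, plus a check for parameters the text left unspecified.

```python
import re

UNIT_TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0}

# Matches phrases such as "length 120 mm" or "thickness of 2.5 mm".
DIMENSION_PATTERN = re.compile(
    r"(?P<name>length|width|height|thickness)\s*(?:of|=|:)?\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mm|cm|m)\b",
    re.IGNORECASE,
)


def text_to_parameters(text):
    """Extract named dimensions from text and normalise them to millimetres."""
    params = {}
    for match in DIMENSION_PATTERN.finditer(text):
        key = match.group("name").lower() + "_mm"
        scale = UNIT_TO_MM[match.group("unit").lower()]
        params[key] = float(match.group("value")) * scale
    return params


def missing_parameters(params, required):
    """Return the required parameter names the description did not provide."""
    return set(required) - set(params)


description = "A bracket with length 120 mm, width 4 cm and thickness of 2.5 mm"
params = text_to_parameters(description)
print(params)
print(missing_parameters(params, {"length_mm", "width_mm", "height_mm"}))
```

The hard part the article points to is exactly what this sketch avoids: free-form descriptions ("a sturdy bracket that fits under the shelf") carry no explicit numbers, so a model would have to infer parameters rather than extract them.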

DeepMind uses the natural language processing ability of the Transformer model to generate sketches. The sketch is the skeleton of a 3D model: through specific constraints, it defines how an entity keeps its intended shape when its parameters change. In a paper published in 2021, DeepMind discussed the similarity between CAD sketching and natural language modeling and proposed a machine learning model that generates CAD sketches automatically, performing well on unconditional synthesis and image-to-sketch conversion tasks. The highlight of the paper is that it maps drawings to sequences, so that a large Transformer model can process them as sequences. We believe that as large Transformer models are applied more deeply, their integration with CAD may continue to advance, and higher-level applications that generate models from text may emerge in the future.
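
The mapping from a drawing to a sequence can be illustrated with a toy encoding. This is a minimal sketch under stated assumptions, not the representation used in the DeepMind paper: the two entity types, the 64-bin coordinate quantization, and the token layout are all hypothetical. It only shows that a sketch can be flattened into integer tokens, the kind of input a sequence model consumes, and recovered again.

```python
# Number of numeric values each entity type carries (hypothetical schema):
# line = (x1, y1, x2, y2), circle = (cx, cy, r), all normalised to [0, 1].
ENTITY_ARITY = {"line": 4, "circle": 3}
TYPE_TOKENS = {"line": 0, "circle": 1}
TOKEN_TYPES = {token: kind for kind, token in TYPE_TOKENS.items()}
NUM_BINS = 64
VALUE_OFFSET = len(TYPE_TOKENS)      # value tokens come after the type tokens
END_TOKEN = VALUE_OFFSET + NUM_BINS  # marks the end of the sketch


def quantize(value):
    """Map a coordinate in [0, 1] to one of NUM_BINS integer bins."""
    clamped = min(max(value, 0.0), 1.0)
    return min(int(clamped * NUM_BINS), NUM_BINS - 1)


def dequantize(bin_index):
    """Return the centre of the bin as the reconstructed coordinate."""
    return (bin_index + 0.5) / NUM_BINS


def encode(sketch):
    """Flatten a list of (kind, values) entities into a token sequence."""
    tokens = []
    for kind, values in sketch:
        if len(values) != ENTITY_ARITY[kind]:
            raise ValueError(f"{kind} expects {ENTITY_ARITY[kind]} values")
        tokens.append(TYPE_TOKENS[kind])
        tokens.extend(VALUE_OFFSET + quantize(v) for v in values)
    tokens.append(END_TOKEN)
    return tokens


def decode(tokens):
    """Rebuild the entity list from a token sequence produced by encode."""
    sketch, i = [], 0
    while tokens[i] != END_TOKEN:
        kind = TOKEN_TYPES[tokens[i]]
        count = ENTITY_ARITY[kind]
        values = tuple(
            dequantize(t - VALUE_OFFSET) for t in tokens[i + 1:i + 1 + count]
        )
        sketch.append((kind, values))
        i += 1 + count
    return sketch


sketch = [("line", (0.0, 0.0, 1.0, 0.0)), ("circle", (0.5, 0.5, 0.25))]
tokens = encode(sketch)
decoded = decode(tokens)
print(tokens)
print(decoded)
```

Decoding recovers each coordinate only to within one quantization bin, which is the usual trade-off when continuous geometry is turned into a discrete vocabulary.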

Risks

Technological progress falls short of expectations: as a cutting-edge emerging technology, artificial intelligence is still developing rapidly, and its progress carries a degree of uncertainty. If technological progress falls short of expectations, industrialization may proceed slowly.

Commercialization falls short of expectations: commercial deployment is the key for artificial intelligence to move smoothly to the next stage. If the pace of commercialization falls short of expectations, it will weigh on the progress of artificial intelligence.

Industry competition intensifies: artificial intelligence is an industry hot spot with significant future business value. Technology giants and start-ups are all investing in this field, and competition in vertical categories and at the application layer may intensify further.

Edit / irisz

The translation is provided by third-party software.

