

Galaxy Securities: What impact does Sora have on computing power demand?

Zhitong Finance ·  Apr 1 14:59

Demand for Sora's computing power will grow exponentially, driving demand for computing power infrastructure.

The Zhitong Finance App learned that Galaxy Securities released a research report deriving Sora's computing power requirement by analogy with large language models. Relevant research estimates Sora's parameter count at roughly 30B (to be confirmed). Under this assumption, the bank estimates that a single Sora training run may require 2.6 x 10^24 FLOPs, about 8.2 times that of GPT-3 175B. Sora is still at an early stage: it cannot yet accurately simulate physical motion and scenes, and it sometimes confuses left and right or blurs spatial details. However, as Sora continues to iterate and its training datasets grow, future computing power demand will grow exponentially, and the bank remains optimistic about investment opportunities in upstream computing power infrastructure.

The views of Galaxy Securities are as follows:

Demand for Sora's computing power will grow exponentially, driving demand for computing power infrastructure.

In the early morning of February 16, Beijing time, OpenAI released Sora, its first text-to-video model. Sora can generate high-definition, smooth video of up to 60 seconds from text prompts, with clear advantages in video length, coherence, and multi-shot switching. The report derives the computing power requirement for a single Sora training run by analogy with large language models. Relevant research estimates Sora's parameter count at roughly 30B (to be confirmed). Under this assumption, a single training run may require 2.6 x 10^24 FLOPs, about 8.2 times that of GPT-3 175B. The bank believes Sora is still in its infancy: it cannot yet accurately simulate physical motion and scenes, and it sometimes confuses left and right or blurs spatial details. But as Sora continues to iterate and its training datasets grow, future computing power demand will grow exponentially, and the bank remains optimistic about investment opportunities in upstream computing power infrastructure.
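The report's 8.2x comparison can be reproduced with the widely used rule of thumb that training compute is roughly 6 x parameters x training tokens. The sketch below is our own illustration, not the bank's methodology; the GPT-3 figures (175B parameters, ~300B training tokens) are commonly cited public estimates.

```python
# Back-of-envelope training-compute estimate using the common
# "FLOPs ~= 6 * parameters * training tokens" rule of thumb.
def training_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

# GPT-3 175B: ~175B parameters trained on ~300B tokens (public estimates).
gpt3_flops = training_flops(175e9, 300e9)  # = 3.15e23 FLOPs

# The report's Sora estimate: ~2.6e24 FLOPs per training run.
sora_flops = 2.6e24

print(f"GPT-3 training compute: {gpt3_flops:.2e} FLOPs")
# The ratio comes out near the report's ~8.2x figure.
print(f"Sora / GPT-3 ratio: {sora_flops / gpt3_flops:.2f}x")
```

Under the same rule of thumb, 2.6 x 10^24 FLOPs at ~30B parameters would imply a training set on the order of 10^13 tokens, which illustrates why the report expects data-scale growth to drive compute demand.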

Sora is based on the DiT architecture, replacing U-Net with a Transformer.

Sora is built on DiT (Diffusion Transformer), a new architecture that combines diffusion models with the Transformer. Inspired by large language models, Sora replaces the U-Net backbone commonly used in diffusion models with a Transformer; the resulting DiT model keeps a near-standard Transformer architecture while retaining its scalability. Just as large language models turn text into tokens the model can understand, Sora converts video into a sequence of patches (visual coding blocks) in a reduced-dimension latent space, uses patches as the unified representation of visual data, and generates video by predicting the original image information through denoising.

Sora is a leap forward, opening a new era for text-to-video models.

Sora can turn prompts into 60-second videos, far ahead of earlier text-to-video models such as Runway, Pika, and Stable Video in output length. In resolution and quality, Sora can generate 1080p video, understand and simulate the motion of the world and objects relatively well, and maintain stability across shot changes. Sora also supports image input, video extension, and video stitching, making it a breakthrough in the text-to-video field.

Investment advice: Sora is a "milestone" in the development of artificial intelligence, accelerating the arrival of the AGI era. Demand for computing power will continue to surge, and the bank remains optimistic about investment opportunities across the industrial chain. Recommended domestic listed companies: 1) domestic multi-modal models: iFLYTEK (002230.SZ), Hikvision (002415.SZ), Dahua Technology (002236.SZ); 2) computing power infrastructure: Foxconn Industrial Internet (601138.SH), Dawning Information Industry (603019.SH), iSoftStone (301236.SZ), etc.; 3) AI applications: Wondershare Technology (300624.SZ), Kingsoft Office (688111.SH), SuperMap Software (300036.SZ), etc.

Risk warning: technology R&D progress falling short of expectations; supply chain risks; policy progress falling short of expectations; consumer demand falling short of expectations; intensifying industry competition; etc.

The translation is provided by third-party software.


The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.