
Google's AI video model makes a late-night splash: 4K resolution and 2-minute clips, another "shot" fired at OpenAI

cls.cn ·  12:56

① Google stated that Veo 2's technical advances lie mainly in its physics engine, cinematography, and character expressiveness; ② In performance evaluations, Veo 2 outperformed models such as Sora Turbo, Kling, and MiniMax; ③ Brokerage analysts suggest that as AI video generation tools continue to iterate, their penetration into application scenarios is expected to accelerate.

Star Daily, December 17: Just 8 days after Sora's official launch, Google released version 2.0 of its Veo model, Veo 2.

According to Google's official website, Veo 2 can create high-quality videos with realistic motion, at resolutions of up to 4K and durations exceeding 2 minutes. Google stated that Veo represents a significant advance in high-quality video generation.

Beyond the improvement in video clarity, Veo 2's understanding of the physical world and its camera control are also impressive:

Part of the prompt for this scene: a low-angle tracking shot with an 18mm lens. Cars drift, leaving light trails and tire smoke. The camera tracks slowly, capturing the moment a sleek olive-green muscle car approaches the corner.

Part of the prompt for this scene: a close-up shot focused on a female DJ's face, her thick, black, curly hair framing her features. She closes her eyes, immersed in the rhythm, with a slight smile on her lips. As she nods and sways with the beat, the camera captures her subtle head movements.

These examples show that Veo 2 performs well at recreating the real world and following prompt instructions. In addition, in the performance evaluation published on Google's official website, Veo 2 beat a number of Chinese and international text-to-video models, including Sora Turbo, Kling, and MiniMax.

In summary, Google stated that Veo 2's advances lie mainly in three areas. First, an optimized physics engine, which governs how well the model understands the physical laws of the real world. Second, the integration of cinematography techniques, which allows for richer visual effects. Third, greater character expressiveness, which makes characters' movements and facial expressions more realistic.

Veo 2 is now available in Google's video creation tool, VideoFX. Just last week, OpenAI officially launched Sora Turbo, which is open to paid ChatGPT users in the United States and other markets. Sora Turbo can generate videos up to 20 seconds long and can produce multiple variations of each video.

Meanwhile, text-to-video generation in China has also made steady progress. Since the start of this year, Chinese companies have accelerated the development and iteration of AI video generation products, and product capabilities have continued to improve:

In June of this year, Kuaishou's AI team released the Kling AI video generation model, which can generate videos up to 2 minutes long at resolutions of up to 1080p.

In July, Zhipu AI launched its video generation product, Zhipu Qingying, and upgraded it in November to support generating 10-second 4K Ultra HD videos.

In August, ByteDance launched Jimeng AI, a one-stop creation platform, and in November announced the launch of the Jimeng AI video models S2.0 Pro and P2.0 Pro.

On December 12, a Shanxi Securities research report noted that as AI video generation tools continue to iterate, their penetration into application scenarios is expected to accelerate. At the application layer, the report suggests focusing on areas related to creativity, design, and education, particularly those closely tied to video generation. At the same time, video generation models demand significantly more computing power than text models, so the report also suggests focusing on stocks related to AI computing power.

The translation is provided by third-party software.


The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.