
Llama 3 Returns as King, Rivaling GPT-4: Are Open-Source Models About to Catch Up With Closed-Source Ones?

wallstreetcn ·  Apr 19 14:38

Source: Wall Street News

Llama 3, the “most powerful open-source model ever,” has electrified the AI community. Musk praised it. Nvidia senior scientist Jim Fan said bluntly that Llama 3 will be a “watershed moment” in the development of large AI models. Top AI expert Andrew Ng said Llama 3 was the best gift he had ever received.

On April 18, the AI community got another piece of big news: Meta Platforms (META.US) released Llama 3, which it claims is “the most powerful open-source large model ever created.”

Meta has now open-sourced two Llama 3 models of different sizes, 8B and 70B, for external developers to use for free. Over the next few months, Meta will launch a series of new models with multi-modality, multilingual dialogue, and longer context windows. Among them, the largest version of Llama 3 will have more than 400 billion parameters and is expected to “compete” with Claude 3.

Meanwhile, Meta CEO Zuckerberg announced that Meta AI, an assistant built on the latest Llama 3 model, is now integrated across Instagram, WhatsApp, and Facebook, and has launched as a standalone website. It also includes an image generator that creates images from natural-language prompts.

Llama 3 takes direct aim at OpenAI's GPT-4, and its release stands in sharp contrast to OpenAI's “not open” approach. At a time when the AI community is constantly debating the open-source versus closed-source route, Meta has firmly taken the open-source path in its charge toward the holy grail of AGI, winning a round back for open-source models.

People familiar with the matter say researchers have not yet finished fine-tuning the larger Llama 3 and have not decided whether it will be a multi-modal model. According to some sources, the official version of Llama 3 will launch in July of this year.

Meta AI chief scientist and Turing Award winner Yann LeCun cheered on the release of Llama 3, predicting that more versions will launch in the coming months. He said Llama 3 8B and Llama 3 70B are currently the best-performing open-source models at their respective sizes, and that Llama 3 8B even beat Llama 2 70B on some test sets.

Even Musk showed up in the comments section; his simple “Not bad” conveyed his recognition of, and expectations for, Llama 3.

Jim Fan, a senior scientist at Nvidia, believes the launch of Llama 3 is about more than technical progress: it is a symbol that open-source models can now stand toe-to-toe with top closed-source models.

Benchmarks shared by Jim Fan suggest that Llama 3 400B's strength is nearly comparable to Claude 3 Opus and the new GPT-4 Turbo. He believes it will be a “watershed,” unleashing enormous research potential and driving the development of the entire ecosystem, with the open-source community potentially gaining access to a GPT-4-level model.

The day of the announcement happened to coincide with the birthday of Stanford University professor and top AI expert Andrew Ng. Ng said bluntly that the release of Llama 3 was the best gift he had ever received, adding, “Thank you, Meta!”

Andrej Karpathy, a founding member of OpenAI and former AI director at Tesla, also praised Llama 3. As one of the pioneers of large language models, Karpathy believes Llama 3's performance is close to GPT-4's:

Llama 3 appears to be a very powerful model release from Meta. It sticks to first principles: spending a lot of quality time on solid systems and data work, and exploring the limits of long-training models. I'm also very excited about the 400B model, which could be the first GPT-4-grade open-source model. I think many people will ask for longer context length.

I would also like a model with fewer than 8B parameters, ideally in the 0.1B to 1B range, for educational work, (unit) testing, embedded applications, and so on.

Cameron R. Wolfe, AI director at Rebuy and a PhD in deep learning, believes Llama 3 proved that the key to training excellent large language models is data quality. He gave a detailed analysis of Llama 3's data work, including:

1) 15 trillion tokens of pre-training data: 7 times Llama 2's, and more than DBRX's 12 trillion;

2) More code data: more code is included in pre-training, improving the model's reasoning ability;

3) A more efficient tokenizer: a larger vocabulary (128K tokens) improves the model's efficiency and performance.
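The tokenizer point can be illustrated with a toy sketch. This is not Llama 3's actual BPE tokenizer; the vocabularies and sample text below are invented for illustration. The idea is that a larger vocabulary lets frequent character sequences be emitted as single tokens, so the same text becomes a shorter token sequence and each forward pass covers more text:

```python
def greedy_tokenize(text, vocab):
    """Toy greedy longest-match tokenizer over a fixed vocabulary.

    NOT Llama 3's real tokenizer -- just a sketch of why a larger
    vocabulary yields shorter token sequences for the same text.
    """
    max_len = max(map(len, vocab))
    tokens, i = [], 0
    while i < len(text):
        # Try the longest possible match first, shrinking to one character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # No vocabulary match: emit the raw character as a fallback.
            tokens.append(text[i])
            i += 1
    return tokens

# A character-level vocabulary vs. one extended with common substrings.
small_vocab = set("abcdefghijklmnopqrstuvwxyz ")
large_vocab = small_vocab | {"the ", "large", "vocab", "token", "izer"}

text = "the larger vocab tokenizer"
print(len(greedy_tokenize(text, small_vocab)))  # 26 tokens (one per character)
print(len(greedy_tokenize(text, large_vocab)))  # 8 tokens
```

Llama 3's real tokenizer is a BPE scheme with a 128K-entry vocabulary; the same principle applies at scale, which is why a bigger vocabulary translates into fewer tokens per document and better effective throughput.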

After Llama 3's release, Zuckerberg told the media: “Our goal is not to compete with open-source models, but to surpass everyone and build the most advanced artificial intelligence.” The Meta team will later release a technical report on Llama 3 revealing more details about the model.

The debate over open source versus closed source is far from over. GPT-4.5/5, quietly waiting in the wings, may arrive this summer, and the large-model battle in the AI field rages on.

editor/tolk


