Challenging GPT: Meta launches Llama 3, its most powerful open-source model, with the “smartest” free AI assistant across its social media apps

Hard AI ·  Apr 19 08:42

Source: Hard AI

The largest version of Llama 3 will exceed 400 billion parameters, and the model was trained on more than 15 trillion tokens. In human evaluations, its win rate against GPT-3.5 exceeds 60%. Amazon, Microsoft, and Google Cloud will offer Llama 3, and Nvidia, Intel, and AMD hardware platforms will support it. According to Nvidia, Meta trained Llama 3 on a computer cluster with more than 24,000 H100 chips. The Meta AI assistant will launch in English in 13 countries beyond the US, usable on both phones and computers, with web search that requires no app switching. Its text-to-image feature can update images in real time as prompts are typed and can generate animated GIFs.

OpenAI's GPT has gained a strong rival: Meta Platforms (META.US) is mounting its latest round of challenges.

On Thursday, April 18, Eastern Time, Meta announced the launch of its third-generation large language model (LLM), Llama 3, calling it “the most capable open-source LLM so far,” and upgraded its artificial intelligence (AI) assistant Meta AI on the basis of Llama 3, calling it “the smartest AI assistant you can use for free.”

Meta announced that Llama 3 will be available on cloud platforms including Amazon, Microsoft, and Google Cloud, and will be supported on hardware from vendors including Nvidia and Dell. Nvidia revealed that Meta trained Llama 3 on a computer cluster with more than 24,000 H100 chips. Accelerated by Nvidia products and services, Llama 3 can be used in cloud, edge-computing, robotics, and PC applications.

The largest version of Llama 3 exceeds 400 billion parameters, and training used more than 15 trillion tokens

Llama 2, released by Meta last July, came in three versions; the largest, 70B, has 70 billion parameters. On Thursday, Meta said Llama 3 comes in two versions, 8B and 70B. Meta CEO Mark Zuckerberg said a larger version of Llama 3, currently still in training, will have more than 400 billion parameters; Meta has not said whether that 400-billion-parameter model will be open-sourced.

Compared with its predecessor, Llama 3 is a qualitative leap forward. Llama 2 was trained on 2 trillion tokens, while Llama 3 was trained on more than 15 trillion tokens, over seven times as many.
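As a quick back-of-the-envelope check, the jump in training-data scale can be put in numbers (using the rounded token counts cited above, not exact official figures):

```python
# Rough comparison of Llama 2 vs. Llama 3 training-data scale,
# based on the rounded token counts reported in this article.
llama2_tokens = 2e12    # ~2 trillion tokens (Llama 2)
llama3_tokens = 15e12   # ~15 trillion tokens (Llama 3)

ratio = llama3_tokens / llama2_tokens
print(f"Llama 3 saw about {ratio:.1f}x more training tokens than Llama 2")
# prints: Llama 3 saw about 7.5x more training tokens than Llama 2
```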

Meta said that thanks to improvements in pre-training and post-training, its pre-trained and instruction-tuned models are currently the best at the 8B and 70B parameter scales. After the post-training process was improved, the models' false refusal rate dropped sharply, alignment improved, and the diversity of model responses increased. Llama 3 is also a big improvement over Llama 2 in capabilities such as reasoning, code generation, and instruction following, making Llama 3 easier to steer.

As the figure below shows, the 8B and 70B instruction-tuned versions of Llama 3 score higher than Mistral, Google's Gemma and Gemini, and Anthropic's Claude 3 on benchmarks including the massive multitask language understanding dataset (MMLU), graduate-level expert reasoning (GPQA), the math evaluation set (GSM8K), and the code-generation test (HumanEval).

The 8B and 70B versions of the pre-trained Llama 3 likewise outperform Mistral, Gemma, Gemini, and Mixtral across a range of performance tests.

According to Meta, it developed a new high-quality human-evaluation set of 1,800 prompts covering 12 key use cases: asking for advice, brainstorming, classification, closed-book Q&A, open Q&A, coding, creative writing, extraction, role-playing a character, reasoning, rewriting, and summarization. As the figure below shows, on this human-evaluation set the instruction-tuned 70B version of Llama 3 beat Claude Sonnet, Mistral Medium, GPT-3.5, and Llama 2, with win rates of 52.9%, 59.3%, 63.2%, and 63.7%, respectively.
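Meta has not published exactly how it aggregates these pairwise human preferences, but a common way to compute a win rate from such judgments, splitting ties evenly between the two models, can be sketched as follows (the `win_rate` helper and the sample counts are illustrative assumptions, not Meta's actual data):

```python
from collections import Counter

def win_rate(judgments):
    """Compute model A's win rate from pairwise human judgments.

    `judgments` is a list of strings: "A" (A preferred), "B" (B preferred),
    or "tie". Ties are split evenly between the two models here; this is a
    common convention, not Meta's documented procedure.
    """
    counts = Counter(judgments)
    return (counts["A"] + 0.5 * counts["tie"]) / len(judgments)

# Hypothetical outcomes over a 1,800-prompt evaluation set
sample = ["A"] * 1000 + ["B"] * 500 + ["tie"] * 300
print(f"win rate: {win_rate(sample):.1%}")  # prints: win rate: 63.9%
```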

With future multilingual use cases in mind, over 5% of the Llama 3 pre-training dataset consists of high-quality non-English data covering more than 30 languages, though Meta does not expect performance in these languages to match that of English.

Meta said that in the coming months it will roll out new Llama 3 capabilities, including longer context windows, stronger performance, and additional model sizes, and will also publish a Llama 3 research paper.

Amazon and other cloud platforms will offer Llama 3, which was trained on more than 24,000 Nvidia H100 chips

According to Meta, the Llama 3 models will soon be available on Amazon's AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM's WatsonX, Microsoft Azure, Nvidia's NIM, and Snowflake, and will be supported on hardware platforms from AMD, AWS, Dell, Intel, and Nvidia.

Nvidia revealed the same day that Meta engineers trained Llama 3 on a computer cluster containing 24,576 Nvidia H100 Tensor Core GPUs connected over an Nvidia Quantum-2 InfiniBand network. With Nvidia's support, Meta tuned its network, software, and model architectures for the LLM. To push the state of the art in generative AI further, Meta recently announced plans to scale its infrastructure to 350,000 H100 GPUs.

According to Nvidia, Llama 3 accelerated by Nvidia chips is available now and can be used in the cloud, data centers, edge computing, and personal computers (PCs). Developers can try Llama 3 at ai.nvidia.com, and enterprise users can fine-tune Llama 3 on their own data with NeMo, Nvidia's end-to-end cloud-native framework.

Llama 3 can also run on Nvidia's Jetson Orin modules for robots and edge-computing devices, enabling interactive agents like those in the Nvidia Jetson AI Lab. In addition, Nvidia RTX and GeForce RTX GPUs in workstations and PCs speed up Llama 3 inference.

The English version of Meta AI launches in 13 countries beyond the US, on phones and computers, with a text-to-image feature that updates images in real time and generates GIFs

According to Meta, users can use Meta AI to get things done, learn, create, and connect with what matters to them on its social apps Facebook, Instagram, WhatsApp, and Messenger.

Meta said it will launch an English-language version of Meta AI in 13 countries other than the US, including Canada, Australia, New Zealand, Singapore, South Africa, Nigeria, Pakistan, Ghana, Jamaica, Malawi, Uganda, Zambia and Zimbabwe.

What can Meta AI do? Meta gave some examples: planning a night out with friends, recommending a restaurant with sunset views and vegetarian options, finding weekend concerts, suggesting picnic spots, and helping with schoolwork, such as explaining how genetic traits are inherited.

Meta also mentioned a new AI image-generation feature, which lets users generate images from text in WhatsApp and on the Meta AI website. With it, Meta AI can “imagine” and generate images matching the aesthetic a user describes, for example to provide inspiration for real-world shopping.

Zuckerberg said the image-generation feature will update images in real time as users type more detailed prompts, and can create custom animated GIFs.

Meta said that as users type a prompt, an image appears and changes with every few letters entered.

According to Meta, if users find an image they like, Meta AI can animate it or convert it into a GIF to share with friends.

Beyond phone users, Meta also launched the meta.ai website for computer users, so that Meta AI can help while they work, for example solving math problems or making work emails sound more professional. Users can also log in to the site to save their conversations with Meta AI for future reference.

Meta AI can also perform real-time web searches within Facebook, Instagram, WhatsApp, and Messenger, so users can get up-to-date information without switching out of these apps. Suppose a user is planning a ski trip in a Messenger group chat: using the search in Messenger, they can ask Meta AI to find flights from New York to Colorado and identify weekends with fewer travelers, all without leaving the Messenger app.

Users can also reach Meta AI while scrolling their Facebook feed. If a post catches their interest, they can ask Meta AI for more information directly from the post. For example, seeing a photo of the Northern Lights in Iceland, they can ask Meta AI what time of year is best for viewing them.
