
NVIDIA Open-Sources the Nemotron-4 340B Model Series for Training LLMs

Breakings ·  Jun 16 19:23
NVIDIA has recently released the open-source Nemotron-4 340B (340 billion parameters) family of models. Developers can use the family to generate synthetic data for training large language models (LLMs) for commercial applications in healthcare, finance, manufacturing, retail, and other industries. Nemotron-4 340B comprises a base model (Base), an instruction-tuned model (Instruct), and a reward model (Reward), and was trained on 9 trillion tokens (text units). On common-sense reasoning benchmarks such as ARC-Challenge (ARC-c), MMLU, and BBH, Nemotron-4 340B-Base is competitive with the Llama-3 70B, Mixtral 8x22B, and Qwen-2 72B models.
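
The workflow described here is to generate candidate training examples with the Instruct model and then score and filter them with the Reward model before fine-tuning. Below is a minimal sketch of the generation step only, assuming the model is served through an OpenAI-compatible endpoint such as NVIDIA's API catalog; the base URL, model identifier, and environment variable are assumptions for illustration, not details from the article.

```python
# Minimal sketch of synthetic-data generation with Nemotron-4 340B-Instruct.
# Assumptions (not confirmed by the article): an OpenAI-compatible endpoint at
# integrate.api.nvidia.com and the model id "nvidia/nemotron-4-340b-instruct".
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key=os.environ["NVIDIA_API_KEY"],            # assumed credential variable
)

def generate_synthetic_example(domain: str) -> str:
    """Ask the Instruct model for one synthetic Q&A pair in the given domain."""
    response = client.chat.completions.create(
        model="nvidia/nemotron-4-340b-instruct",     # assumed model identifier
        messages=[{
            "role": "user",
            "content": (
                f"Write one realistic customer question about {domain} "
                "and a detailed, accurate answer. Format it as 'Q: ...' and 'A: ...'."
            ),
        }],
        temperature=0.7,
        max_tokens=512,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Generate a small batch of candidates; in the full pipeline these would be
    # scored by the Reward model and only high-quality examples kept for training.
    for domain in ["retail inventory planning", "personal finance"]:
        print(generate_synthetic_example(domain))
```

In the full pipeline, each generated candidate would then be passed to the Reward model, which assigns quality scores so that only the best examples are retained as training data for the downstream LLM.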


