
Microsoft makes its move: new model beats GPT-4o in math and Llama 3.3 in coding, and a new training paradigm, midtraining, sparks heated discussion

Breakings ·  Dec 17, 2024 03:51

Microsoft has introduced its latest small model, Phi-4. With only 14 billion parameters, its MMLU performance is comparable to that of 70B-class models such as Llama 3.3 and Qwen 2.5. In math, Phi-4 scored above 90 on the American Mathematics Competitions AMC 10/12, surpassing many larger models including GPT-4o. Its coding ability is also among the best of open-source models, exceeding the 70B Llama 3.3 and the 72B Qwen 2.5. In its technical report, Microsoft also proposed a new training paradigm: midtraining. (Quantum Bit)

