Open source and on par with o1! Alibaba and High-Flyer release major new products in quick succession, as large reasoning models close in on OpenAI

wallstreetcn · Nov 29, 2024 12:57

The rise of large reasoning models gives smaller AI developers a chance to catch up. Reasoning models also cost less to develop than traditional large models, and latecomers can draw on research papers and data from OpenAI and others when building their own.

After OpenAI released a model with breakthrough reasoning capabilities, the race in AI reasoning has begun. Alibaba and High-Flyer have released major new products in quick succession that not only rival o1 in performance but are also open source!

On Thursday, Alibaba's Tongyi Qianwen (Qwen) released QwQ-32B-Preview, an open-source model with 32.5 billion parameters that can handle prompts of up to 32,000 tokens. On the AIME and MATH benchmarks, it outperforms OpenAI's reasoning models o1-preview and o1-mini.

QwQ is one of the few models that can rival o1. It excels at mathematics and programming, especially on complex problems that require deep reasoning, and it can be used in commercial applications.

Last week, quantitative trading giant High-Flyer released the DeepSeek-R1-Lite model. Its Preview version surpassed o1-preview on difficult math and coding tasks and led GPT-4o and other models by a wide margin. On the AIME benchmark, its score rose steadily as computation time increased.

Notably, the company also said that the model is still under development and that, after further iteration, the official DeepSeek-R1 model will be fully open-sourced.

The emergence of the Alibaba and High-Flyer models signals the rise of reasoning AI, which could give small AI developers a chance to catch up and break the current dominance of a handful of technology giants.

Lin Qiao, co-founder and CEO of Fireworks, a startup that began researching reasoning models in the second quarter of this year, said:

The entire open-source community... will release reasoning models at a super fast pace.

In addition, technology giants are stepping up their work on reasoning models. Google has expanded its reasoning model team from a few dozen people before the o1-preview release to around 200, and has given the team more computing resources.

Latecomers have a greater cost advantage; chain of thought becomes key to large models

Latecomers have a greater cost advantage in building large models.

In developing alternatives to OpenAI's models, latecomers appear to benefit from papers on reasoning published in recent years by researchers at Stanford University, Google, Meta Platforms, and OpenAI itself. Reasoning models cost less to develop than traditional LLMs such as GPT-4o, which require hundreds of millions of dollars for computing resources and training data, as well as legal access to that data.

The new models can help OpenAI and its competitors develop coding assistants capable of completing difficult projects. For example, enterprise software companies such as Microsoft and Salesforce can use them to improve agents that take actions on behalf of customers, such as scheduling appointments.

Notably, researchers can build reasoning capabilities into existing LLMs by having other models generate the thought processes for solving problems and then using those processes to train the LLM.
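As a rough illustration of that idea, the sketch below has a "teacher" produce step-by-step solutions and then formats them as prompt/completion pairs that could be fed to a fine-tuning job. It is only a minimal sketch under stated assumptions: toy_teacher is a hypothetical stand-in for a real reasoning model (it handles nothing but "a + b" problems), and the record layout and JSONL output are illustrative choices, not a format described in the article or used by any of the companies mentioned.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ReasoningTrace:
    """A teacher model's worked solution to one problem."""
    problem: str
    steps: list[str]  # intermediate reasoning steps, in order
    answer: str       # final answer


def toy_teacher(problem: str) -> ReasoningTrace:
    """Hypothetical stand-in for a reasoning model; only handles 'a + b'."""
    left, right = (int(part) for part in problem.split("+"))
    steps = [
        f"Identify the operands: {left} and {right}.",
        f"Add them: {left} + {right} = {left + right}.",
    ]
    return ReasoningTrace(problem=problem.strip(), steps=steps, answer=str(left + right))


def to_training_record(trace: ReasoningTrace) -> dict[str, str]:
    """Format a trace as a prompt/completion pair that keeps the reasoning."""
    reasoning = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(trace.steps, start=1)
    )
    return {
        "prompt": f"Solve the problem: {trace.problem}",
        "completion": f"{reasoning}\nAnswer: {trace.answer}",
    }


def build_dataset(
    problems: Iterable[str],
    teacher: Callable[[str], ReasoningTrace],
) -> list[dict[str, str]]:
    """Ask the teacher to solve each problem and collect training records."""
    return [to_training_record(teacher(problem)) for problem in problems]


def to_jsonl(records: Iterable[dict[str, str]]) -> str:
    """Serialize records one JSON object per line (an assumed, common layout)."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


if __name__ == "__main__":
    dataset = build_dataset(["2 + 3", "10 + 5"], toy_teacher)
    print(to_jsonl(dataset))
```

The point of the completion field is that the student model is trained on the intermediate steps as well as the final answer, which is what separates this from ordinary question-and-answer fine-tuning. In practice the teacher would be a call to an actual reasoning model, and the traces would usually be checked for correctness before use.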

Some researchers have also made reasoning-focused datasets freely available to other developers. For example, Alibaba said it used data from Open o1, one such research group, to build its reasoning model.

Ion Stoica, co-founder of the AI startups Anyscale and Databricks, said:

OpenAI's competitors are at no clear disadvantage in developing reasoning models.

Editor/Lambor

The translation is provided by third-party software.
