Recently, Fu Sheng, Chairman and CEO of Cheetah Mobile and Chairman of Cheetah Space, told media outlets including Sina Technology that whether one speaks of a "battle of a hundred models" or a "battle of a thousand models," the core of competition among all companies is essentially competition over data. Across the artificial intelligence industry, what ultimately determines a model's basic capabilities is annotated data. Identifying high-quality data and combining it effectively is the key to winning.
"In the competition of large models, the real competitive barrier comes from data. It's not that chips are unimportant, or algorithms are unimportant, but it's hard for everyone to differentiate on these two points." In Fu Sheng's view, "The technical content and differentiation of large models are actually two different things, and it is difficult to differentiate chips and algorithms at present."
At the conference, Cheetah Space released Orion-MoE8×7B, its independently developed open-source mixture-of-experts (MoE) large model, and together with its subsidiary Juyun Technology launched a data service product based on this model called AI Data Treasure AirDS (AI-Ready Data Service). According to the company, AirDS provides end-to-end data services for large models, covering data collection, cleaning, annotation, prompt engineering, and evaluation. In practice, it has already served leading brands expanding overseas in sectors such as mobile communication terminals, internet entertainment, and new energy vehicles.
"Today, you can see some models with good quality, all of which are based on data. If you read the LIama2 paper, you will find a large part discussing how to improve the quality of data." Fu Sheng said, "We want to break through this barrier because our essence is not to make money by models or by calling model interfaces. We hope to help customers with applications, using applications to help them meet a certain need, and we earn money through applications."
"Cheetah is the only company in the industry that has trained large models and opened up data annotation and data service capabilities. This is the unique aspect (differentiation) of the company in the industry at the moment." According to Fu Sheng and Sun Mingyan, Senior Vice President of Cheetah Mobile, revealed to Sina Technology, "Currently, Cheetah is more focused on applications, based on its understanding of industry data 'knowhow' and the service capabilities accumulated through long-term application practices, providing users with customized models, data annotation, and application development services throughout the process."
"To truly talk about a service, what inherent differences does it have? Each will say their service is good, has unique features, but ultimately, results speak. According to Fu Sheng, due to its complete application development and data service closed-loop capabilities, the services provided by Cheetah are higher than the level of some existing data service providers in the market."
"We collaborated with a large company, who hired a PhD and had about twenty to thirty people working on applications themselves, and the accuracy was only 40% to 50%. When we came in, we helped them achieve 90%." Fu Sheng stated. In his view, in the era of large models, scene application development, the code is not actually the most important thing; the core is still being able to train data into the model.