share_log

AI大模型训练数据版权问题凸显 优质训练数据库价值有望重估

Copyright issues with AI big model training data highlight that the value of high-quality training databases is expected to be re-evaluated

cls.cn ·  Feb 23 07:49

① People familiar with the matter said that the US social media platform Reddit has reached an agreement with Google to use its content to train the latter's artificial intelligence model. ② Artificial intelligence will need to pay media brands when using media brand content to train large models, which means that AI big models may pay data providers for intellectual property rights and become an industry trend.

People familiar with the matter said that the US social media platform Reddit has reached an agreement with Google to use its content to train the latter's artificial intelligence model. According to reports, the value of the agreement is about 60 million US dollars per year. Reddit has publicly submitted US IPO documents, with Morgan Stanley, Goldman Sachs, J.P. Morgan Chase, and Bank of America as the lead banks.

Recently, news publishing giant Axel Springer (Axel Springer) signed an agreement with ChatGPT development agency OpenAI to become the first publishing organization in the world to cooperate with OpenAI to further integrate journalism and artificial intelligence technology. Galaxy Securities pointed out that the agreement signed between Open AI and Axel Springer indicates that artificial intelligence will need to pay media brands when using media brand content to train big models, which means that AI big models may pay the data provider's intellectual property rights may become an industry trend. Currently, AI policies have been introduced intensively, and the copyright issues of high-quality data sets and training data have received attention, and the value of high-quality training databases will be highlighted in the future. Most companies in the publishing industry have rich electronic graphic resources, which can be used as an important data set for training large models at home and abroad. The resource advantages of publishing industry companies in terms of copyright and IP are expected to help them become a key support for the development of large AI models at home and abroad.

According to the Finance Federation's theme library, among the relevant listed companies:

CITIC Publishing has tried to cooperate with authors and big model companies in language training to develop smart reading application products. For example, the company's knowledge service platform and Baidu jointly released the “CITIC Academy AI Reading Assistant” plug-in.

Palm Reading Technology is the industry leader in copyright reserves in the field of literature and reading. These Chinese materials can be used for training vertical models in the online writing industry. Currently, the company and Byte have carried out in-depth cooperation mainly in various areas such as copyright for digital reading, content production, and advertising commercialization.

The translation is provided by third-party software.


The above content is for informational or educational purposes only and does not constitute any investment advice related to Futu. Although we strive to ensure the truthfulness, accuracy, and originality of all such content, we cannot guarantee it.
    Write a comment