Tencent COO Ren Yuxin Announces “Yunshen Zhiyao” (iDrug), Tencent's First AI Drug Discovery Platform

Lei Feng Network (雷锋网) · Jul 9, 2020 15:07

According to Lei Feng Network, the 2020 World Artificial Intelligence Conference Cloud Summit opened on July 9. At the conference, Tencent Chief Operating Officer Ren Yuxin announced the company's latest progress in AI-assisted drug development: “Yunshen Zhiyao” (iDrug), the first AI-driven drug discovery platform developed in-house by Tencent, was officially unveiled to the public.

The launch of the Yunshen Zhiyao platform will help researchers improve the efficiency of preclinical drug discovery. It is expected to ease a pressing pain point: under the threat of COVID-19, the pharmaceutical industry urgently needs to develop drugs quickly and at low cost.

Tencent has partnered with a number of pharmaceutical companies to apply its AI models to real drug development projects. More than 10 projects, including research on anti-coronavirus drugs, are currently running stably on the Yunshen Zhiyao platform.

The name “Yunshen Zhiyao” comes from the Tang poem “Seeking the Hermit but Not Finding Him” (寻隐者不遇): “He is somewhere on this mountain, but the clouds are so deep that no one knows where” (只在此山中,云深不知处). The line alludes to the similarly elusive search behind new drug development.

The platform aims to cover the entire preclinical drug development process through five major modules: protein structure prediction, virtual screening, molecular design/optimization, ADMET property prediction (soon to be open-sourced), and synthesis route planning.

Protein structure prediction, as the basis of drug design, is critical to understanding molecular interactions in living organisms. Until now, pharmaceutical companies and research institutions have determined protein structures experimentally using traditional methods, which are often difficult, slow, and expensive.

Once protein structure and function have been predicted with deep learning models, however, computers can quickly and selectively find potential hit compounds among hundreds of millions of small molecules, improving R&D efficiency.

On the Yunshen Zhiyao platform, Tencent AI Lab applies a new algorithm for protein structure prediction. According to the reported data, the algorithm improves markedly on difficult (“hard”) cases, with a 10% gain over Robetta, a method widely recognized as authoritative in the field.

Since joining CAMEO, an authoritative international benchmark platform for protein structure prediction, in 2020, the Tencent AI Lab team has won the monthly championship five times in six months with this in-house algorithm.

The ideas behind this algorithm have also been applied on the Yunshen Zhiyao platform, where they are expected to prove further useful in new target discovery and disease mechanism research.

In virtual screening and ADMET property prediction, Tencent AI Lab has also achieved high accuracy on several public data sets, setting new industry benchmarks. The ADMET prediction module will subsequently open-source GX, a large-scale self-supervised pre-trained model for molecular graphs, and the molecule generation model is also expected to be open-sourced in the second half of the year.

Lei Feng Network has learned that two tool modules, virtual screening and ADMET property prediction, are already open to the public free of charge. Modules for protein structure prediction, molecular design/optimization, and synthesis route planning will follow over the next few months, and the platform will later add more drug discovery modules and analysis functions.

In addition to being able to use the core functions of the platform for free, pharmaceutical companies and scientific research institutions can also jointly develop customized AI tools with Tencent.

The Yunshen Zhiyao platform combines the strengths of Tencent AI Lab and Tencent Cloud in cutting-edge algorithms, optimized databases, and computing resources. Users no longer need to deploy anything themselves; they can quickly bring AI capabilities into their existing R&D processes and carry out research more conveniently.

The following is a detailed technical explanation.

The platform provides integrated database-algorithm-computing power services

In AI-assisted drug development, the three elements of algorithms, computing power, and data are all indispensable and reinforce one another. Advanced algorithms can mine existing big data in depth and uncover implicit relationships within it.

This process not only directly assists in the discovery of new drugs, but also integrates a large number of existing databases, while promoting the generation and accumulation of new data to better optimize algorithms. The optimized algorithm can in turn reduce the model's dependence on the amount of data and improve the model's versatility.

Tencent's computing power speeds up database storage and search as well as algorithm iteration, and greatly shortens the computation time needed to run models.

In addition to continuous innovation in algorithms, the Yunshen Zhiyao platform also provides integrated computing power and database services.

On the data side, molecular big data is the infrastructure of drug development.

Existing public data sets of drug molecules, such as PubChem and ChEMBL, draw on diverse sources. However, because the data comes from different institutions working under different experimental conditions, records are hard to align, many fields are missing, and overall quality is poor, which makes the data difficult to use directly for building predictive models.

The molecular big data used by the Yunshen Zhiyao platform is built on existing public data sets and has gone through multiple stages of careful cleaning and curation. The resulting drug molecule data sets can be used directly to build deep learning models and have been applied and validated in a range of drug development projects, where the cleaning has substantially improved results.
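The article does not describe the platform's actual cleaning pipeline. As a rough illustration of what aligning activity records from different sources can involve, here is a minimal Python sketch. The record fields, the unit table, and the choice to merge duplicates by median are all assumptions made for this example: it drops records with missing or unusable values, converts concentrations to a common negative-log molar scale, and merges repeated measurements of the same molecule-target pair.

```python
import math
from collections import defaultdict
from statistics import median

# Hypothetical raw records pooled from different sources; the field names
# and values are invented for illustration only.
RAW_RECORDS = [
    {"smiles": "CCO", "target": "T1", "value": 250.0, "unit": "nM"},
    {"smiles": "CCO", "target": "T1", "value": 0.4, "unit": "uM"},
    {"smiles": "c1ccccc1", "target": "T1", "value": None, "unit": "nM"},
    {"smiles": "CCN", "target": "T1", "value": 1.0, "unit": "mM"},
    {"smiles": "CCC", "target": "T1", "value": 5.0, "unit": "percent"},
]

# Conversion factors from each supported concentration unit to mol/L.
UNIT_TO_MOLAR = {"nM": 1e-9, "uM": 1e-6, "mM": 1e-3}


def clean_records(records):
    """Return {(smiles, target): median negative-log molar activity}."""
    grouped = defaultdict(list)
    for rec in records:
        value, unit = rec.get("value"), rec.get("unit")
        # Skip records with missing values or units we cannot align.
        if value is None or value <= 0 or unit not in UNIT_TO_MOLAR:
            continue
        # Put every measurement on the same scale: -log10(concentration in M).
        p_activity = -math.log10(value * UNIT_TO_MOLAR[unit])
        grouped[(rec["smiles"], rec["target"])].append(p_activity)
    # Merge repeated measurements of the same pair (assumption: median).
    return {key: round(median(vals), 3) for key, vals in grouped.items()}


cleaned = clean_records(RAW_RECORDS)
```

A real pipeline would also need to canonicalize molecular structures before grouping, so that the same molecule written in two ways is treated as one; that step is omitted here.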

Large data sets that have been cleaned and that link multiple databases are being released on the platform in stages.

In terms of computing power, Tencent Cloud provides the computing resources for the Yunshen Zhiyao platform. Pharmaceutical companies and research institutions can log on to the platform to carry out research without deploying anything themselves, and can quickly bring AI capabilities into their existing R&D processes.

Platform functions cover the whole process of new drug discovery

The preclinical drug discovery process runs from target discovery and validation, through hit compound discovery and lead compound discovery and optimization, to the confirmation and development of clinical candidate compounds. The Yunshen Zhiyao platform covers this entire process.

The first step in the discovery of a new drug is target identification and confirmation. Finding the drug's action site in the body and determining the structure of the target protein is a key task, and is regarded as an important cornerstone of drug development.

For example, if a protein is involved in a certain disease and becomes an important link in a critical pathway, then when researchers understand the protein's structure, they can design drug molecules in a targeted manner to regulate the function of the protein.

Experimental determination of protein structure is often difficult, slow, and expensive. Once protein structure and function have been predicted with deep learning models, computers can quickly and selectively find potential hit compounds among hundreds of millions of small molecules.

Lei Feng Network (official account: Lei Feng Network) has learned that the protein structure prediction method adopted by the Yunshen Zhiyao platform has reached an internationally leading level of accuracy, thanks to breakthroughs in two key technologies.

The first is a protein folding method based on self-supervised learning. It does not rely on homologous sequences; instead, it learns coevolutionary patterns directly from sequence databases through self-supervised learning, so that pseudo-homologous sequences carrying coevolutionary information can be generated from scratch and the proteins in question can ultimately be folded effectively.

The second is an iterative method based on deep learning that effectively integrates template-based modeling and free modeling. It proposes, for the first time, dynamic and iteratively refined amino-acid-specific constraints, which significantly improve modeling accuracy and therefore the quality of the folded protein structures.

Screening for hit compounds against a target is the second step in new drug discovery. Compared with traditional experimental screening, virtual screening by computational methods does not consume compound samples and can save substantial labor and material resources.

Ligand-based drug design (LBDD) is one of the common approaches to virtual screening. It learns a model of the relationship between molecular structure and activity from the structures of known small-molecule ligands, and uses that model to predict the activity of new compounds.
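To make the idea concrete, here is a deliberately simple Python sketch of a ligand-based predictor. It is not the platform's method: it swaps in a similarity-weighted nearest-neighbour rule over Tanimoto similarity, and it assumes each molecule has already been reduced to a set of integer structural features. The feature sets and activity values below are invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two structural feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_activity(query, known_ligands, k=2):
    """Similarity-weighted mean activity of the k most similar known ligands.

    known_ligands is a list of (feature_set, measured_activity) pairs.
    Returns None when the query shares no features with any known ligand.
    """
    scored = sorted(
        ((tanimoto(query, features), activity)
         for features, activity in known_ligands),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total_similarity = sum(sim for sim, _ in scored)
    if total_similarity == 0:
        return None
    return sum(sim * act for sim, act in scored) / total_similarity


# Hypothetical known ligands for one target: (feature set, activity).
KNOWN_LIGANDS = [
    (frozenset({1, 2, 3, 4}), 7.0),
    (frozenset({1, 2, 3, 5}), 6.0),
    (frozenset({8, 9}), 4.0),
]

prediction = predict_activity(frozenset({1, 2, 3, 4}), KNOWN_LIGANDS)
```

The sketch also shows why such models depend on how many known ligands a target has: a query that resembles none of them gets no prediction at all.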

Because measured activity data is very scarce for many targets, the accuracy of such prediction models is severely constrained.

AI methods are expected to solve this problem. For example, the virtual screening module of the Yunshen Zhiyao platform is the first to apply meta-learning and deep neural network algorithms to LBDD tasks, transferring knowledge learned from other targets (such as the influence of local molecular structures on target binding strength) to improve prediction accuracy.

Currently, the algorithm's median prediction accuracy (the correlation between predicted and experimentally measured activity) across thousands of experimental data sets has risen from the previous record of 0.36 to 0.42, and the share of models usable for screening has risen from 56% to 60%, setting a new industry benchmark.
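The article does not say which correlation measure is used or what cutoff makes a model usable for screening. As a sketch of how two summary figures of this kind could be computed, the Python below assumes a Pearson correlation per data set and a hypothetical usability threshold of 0.3; the toy data is invented.

```python
from math import sqrt
from statistics import median


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def summarize(per_dataset, usable_threshold=0.3):
    """Return (median correlation, share of data sets at or above threshold).

    per_dataset maps a data set name to (predicted, measured) activity lists.
    The threshold is an assumption made for this example.
    """
    scores = [pearson(pred, meas) for pred, meas in per_dataset.values()]
    usable_share = sum(s >= usable_threshold for s in scores) / len(scores)
    return median(scores), usable_share


# Toy data: one well-predicted, one anti-correlated, one uninformative set.
TOY_RESULTS = {
    "T1": ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    "T2": ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),
    "T3": ([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]),
}

median_corr, usable_share = summarize(TOY_RESULTS)
```

Reporting the median rather than the mean keeps a few very easy or very hard data sets from dominating the headline number.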

In the later stages of drug development, predicting the ADMET properties of molecules (absorption, distribution, metabolism, excretion, and toxicity) becomes particularly important. According to statistics, as many as 60% of late-stage drug failures are due to ADMET problems.

Identifying and eliminating molecules with poor drug-likeness early can therefore greatly reduce the risk of failure later in development. AI-based ADMET prediction lets medicinal chemists quickly modify molecular structures and optimize physicochemical properties, which shortens the development cycle and reduces the cost of experimental testing.

The Yunshen Zhiyao platform's small-molecule ADMET property prediction module improves on the best models available in academia by 3% to 11% on multiple data sets. According to feedback from partners, the accuracy of the platform's in-house algorithms is 6% to 37% higher than that of existing commercial software.

The platform also uses mechanisms such as attention to visualize how substructures in a molecule affect the prediction, which makes the model interpretable. In addition, it offers flexible deployment options, such as locally deployed versions, to protect user data.

This is an original article from Lei Feng Network. Unauthorized reprinting is prohibited; see the reprint instructions for details.

The translation is provided by third-party software.

