Journal of Biomedical Engineering

Drug-target protein interaction prediction based on the AdaBoost algorithm

Predicting how drugs act on target proteins can help uncover new drug effects. Recent studies tend to apply a single, specific matrix completion algorithm to drug-target protein interaction prediction. Single-model matrix completion algorithms have relatively low accuracy, so they rarely give satisfactory results on this task. AdaBoost is an algorithmic framework that combines multiple weak classifiers into a strong classifier, and its practicality and effectiveness in classification applications are well established. Drug-target protein interaction prediction is a matrix completion problem, that is, a score prediction process. Before applying AdaBoost, we therefore convert the matrix completion problem into a classification problem. We then use the AdaBoost framework to fuse several weak classifiers, which improves performance and yields accurate drug-target protein interaction predictions. Experimental results on public benchmark datasets show that the proposed algorithm exceeds most classical and recent algorithms in prediction accuracy. It overcomes the limitations of recent single-algorithm machine learning methods, mines latent factors more effectively, and improves prediction accuracy.
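
The conversion described in the abstract can be illustrated with a minimal sketch: each cell of a 0/1 interaction matrix becomes one labeled sample, and discrete AdaBoost fuses threshold stumps into a single scorer. The abstract does not specify the features or the weak classifiers, so the two leave-one-out rate features, the decision stumps, the toy matrix, and every function name below are assumptions made for illustration only, not the authors' method.

```python
import math

def pairs_from_matrix(Y):
    """Turn a 0/1 interaction matrix into one (features, label) sample per drug-target pair."""
    n_d, n_t = len(Y), len(Y[0])
    X, y = [], []
    for i in range(n_d):
        for j in range(n_t):
            # Hypothetical features: interaction rate of drug i and of target j,
            # each computed with the pair (i, j) itself left out.
            drug_rate = (sum(Y[i]) - Y[i][j]) / (n_t - 1)
            target_rate = (sum(Y[r][j] for r in range(n_d)) - Y[i][j]) / (n_d - 1)
            X.append([drug_rate, target_rate])
            y.append(1 if Y[i][j] else -1)
    return X, y

def stump_predict(stump, row):
    """A stump votes `sign` when the chosen feature reaches the threshold, else `-sign`."""
    _, f, thr, sign = stump
    return sign if row[f] >= thr else -sign

def fit_stump(X, y, w):
    """Exhaustively pick the (error, feature, threshold, sign) stump with lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                cand = (0.0, f, thr, sign)
                err = sum(wi for wi, row, t in zip(w, X, y)
                          if stump_predict(cand, row) != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def adaboost_fit(X, y, rounds=10):
    """Discrete AdaBoost: reweight samples each round and collect (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clip to keep the log finite
        if err >= 0.5:  # no stump beats chance; stop early
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Misclassified samples gain weight, correctly classified ones lose weight.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_score(model, row):
    """Weighted vote of all stumps; a positive score means 'interaction predicted'."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in model)

# Toy interaction matrix (rows: drugs, columns: targets); purely illustrative data.
Y = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
X, y = pairs_from_matrix(Y)
model = adaboost_fit(X, y, rounds=3)
scores = [adaboost_score(model, row) for row in X]
```

The sign of each entry in `scores` gives the predicted class, and its magnitude can be read as a confidence that ranks candidate drug-target pairs, which is how a classifier output maps back onto the score prediction setting. A real application would replace the toy features with drug and target descriptors and evaluate on held-out pairs.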

Key words: target protein; drug effect prediction; score prediction; AdaBoost algorithm
