Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings

Celebi, Remzi; Uyar, Hayri; Yasar, Erkan; Gumus, Ozgur; Dikenelli, Oguz; Dumontier, Michel

doi:10.1186/s12859-019-3284-5

Celebi R., Uyar H. T., Yasar E., Gumus O., Dikenelli O., Dumontier M.

BMC Bioinformatics, vol. 20, no. 1, 2019 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 20 Issue: 1
Publication Date: 2019
DOI: 10.1186/s12859-019-3284-5
Journal Name: BMC Bioinformatics
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Keywords: Disjoint cross-validation, Drug-drug interaction, Paired data, Realistic evaluation
Hakkari University Affiliated: No

Abstract

Background: Current approaches to identifying drug-drug interactions (DDIs), including safety studies during drug development and post-marketing surveillance after approval, offer important opportunities to identify potential safety issues but cannot provide a complete set of all possible DDIs. Thus, drug discovery researchers and healthcare professionals might not be fully aware of potentially dangerous DDIs. Predicting potential drug-drug interactions helps reduce unanticipated drug interactions and drug development costs, and optimizes the drug design process. Methods for predicting DDIs tend to report high accuracy but still have little impact on translational research because of systematic biases induced by networked/paired data. In this work, we aimed to present realistic evaluation settings for predicting DDIs using knowledge graph embeddings. We propose a simple disjoint cross-validation scheme to evaluate drug-drug interaction predictions for scenarios in which the drugs have no known DDIs.

Results: We designed different evaluation settings to accurately assess the performance of DDI prediction. The disjoint cross-validation settings produced lower performance scores, as expected, but were still good at predicting drug interactions. We applied Logistic Regression, Naive Bayes and Random Forest to the DrugBank knowledge graph with traditional 10-fold cross-validation, using RDF2Vec, TransE and TransD embeddings. RDF2Vec with Skip-Gram generally outperformed the other embedding methods. We also tested RDF2Vec on various drug knowledge graphs, such as DrugBank, PharmGKB and KEGG, to predict unknown drug-drug interactions. Performance did not improve significantly when an integrated knowledge graph including these three datasets was used.

Conclusion: We showed that knowledge graph embeddings are powerful predictors and are comparable to current state-of-the-art methods for inferring new DDIs. We addressed the evaluation biases by introducing drug-wise and pairwise disjoint test classes. Although the performance scores for the drug-wise and pairwise disjoint settings appear low, the results can be considered realistic for predicting interactions for drugs with limited interaction information.
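
The disjoint evaluation idea can be illustrated with a minimal sketch. The abstract does not define the two test classes precisely, so the code assumes that a "drug-wise disjoint" test pair has exactly one drug absent from training and a "pairwise disjoint" test pair has both drugs absent. The function name `disjoint_split`, the `test_fraction` parameter, and the toy drug identifiers are hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations


def disjoint_split(pairs, test_fraction=0.2, seed=0):
    """Split drug pairs by holding out whole drugs instead of individual pairs.

    Assumption (not stated in the abstract): pairs with one held-out drug
    form the drug-wise disjoint test class, and pairs with two held-out
    drugs form the pairwise disjoint test class.
    """
    drugs = sorted({d for pair in pairs for d in pair})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    n_test = max(1, int(len(drugs) * test_fraction))
    held_out = set(drugs[:n_test])

    train, drugwise_test, pairwise_test = [], [], []
    for a, b in pairs:
        unseen = (a in held_out) + (b in held_out)
        if unseen == 0:
            train.append((a, b))          # both drugs seen in training
        elif unseen == 1:
            drugwise_test.append((a, b))  # one drug never seen in training
        else:
            pairwise_test.append((a, b))  # neither drug seen in training
    return train, drugwise_test, pairwise_test, held_out


# Toy data: all unordered pairs among ten hypothetical drugs.
drugs = [f"D{i}" for i in range(10)]
pairs = list(combinations(drugs, 2))
train, drugwise_test, pairwise_test, held_out = disjoint_split(
    pairs, test_fraction=0.3
)
print(len(train), len(drugwise_test), len(pairwise_test))
```

In traditional cross-validation, by contrast, pairs are shuffled directly, so a drug in a test pair usually also appears in training pairs. Holding out drugs removes that overlap, which is one plausible reason the disjoint settings would score lower.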