id: cord-298301-p1zj6jg9
author: Dey, Lopamudra
title: Machine Learning Techniques for Sequence-based Prediction of Viral-Host Interactions between SARS-CoV-2 and Human Proteins
date: 2020-09-03
pages:
extension: .txt
mime: text/plain
words: 6298
sentences: 387
flesch: 49
summary: A total of 1326 potential human target proteins of SARS-CoV-2 have been predicted by the proposed ensemble model and validated using gene ontology and KEGG pathway enrichment analysis. In this article, we have tried to predict the target human proteins of the SARS-CoV-2 virus based on their protein sequences combining amino acid composition, pseudo amino acid composition, and conjoint triad features using machine learning techniques. Subsequently, after feature reduction, we have used some popular supervised learning algorithms such as Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF) and K-Nearest Neighbor (KNN) along with a deep multi-layer perceptron model and ensemble techniques (Voting classifier, XGBoost, AdaBoost) for classification and prediction. A total of 3 sets of sequence-based features, namely, amino acid composition, conjoint triad, and pseudo amino acid composition of the human proteins are considered to train the machine learning models.
cache: ./cache/cord-298301-p1zj6jg9.txt
txt: ./txt/cord-298301-p1zj6jg9.txt
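
The summary names amino acid composition and conjoint triad among the sequence-based features. As a rough illustration of what such an encoding can look like, here is a minimal sketch, not the authors' implementation. It assumes the seven-class residue grouping commonly used for conjoint triads and a plain relative-frequency normalization; the function names (amino_acid_composition, conjoint_triad, encode) are hypothetical, and the record does not state the exact grouping or normalization the paper used. Pseudo amino acid composition is omitted because it depends on physicochemical parameters not given here.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order for the composition vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: the seven residue classes commonly used for conjoint triad
# encoding (grouped by dipole and side-chain volume).
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CT_CLASSES) for aa in group}

# All 7 * 7 * 7 = 343 ordered class triples, in a fixed order.
TRIADS = list(product(range(len(CT_CLASSES)), repeat=3))


def amino_acid_composition(seq):
    """Return 20 values: the fraction of each standard residue in seq."""
    residues = [aa for aa in seq.upper() if aa in RESIDUE_TO_CLASS]
    counts = Counter(residues)
    n = len(residues)
    return [counts[aa] / n if n else 0.0 for aa in AMINO_ACIDS]


def conjoint_triad(seq):
    """Return 343 values: relative frequency of each class triple in seq.

    Assumption: frequencies are divided by the number of triads; other
    normalizations (for example min-max scaling) are also used in practice.
    """
    classes = [RESIDUE_TO_CLASS[aa] for aa in seq.upper() if aa in RESIDUE_TO_CLASS]
    counts = Counter(zip(classes, classes[1:], classes[2:]))
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TRIADS]


def encode(seq):
    """Concatenate both feature sets into one 363-dimensional vector."""
    return amino_acid_composition(seq) + conjoint_triad(seq)
```

A vector produced by encode could then be passed, after feature reduction, to any of the classifiers listed in the summary. Non-standard residue codes are skipped here rather than raising an error, which is a design choice for this sketch only.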