Drug Target Interaction Prediction Using Machine Learning Techniques
International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 8, Nº6
Received 13 August 2021 | Accepted 4 January 2022 | Early Access 10 November 2022
Abstract
Drug discovery is a key process, given the rising and ubiquitous demand for medication to stay in good shape right through the course of one’s life. Drugs are small molecules that inhibit or activate the function of a protein, offering patients a host of therapeutic benefits. Drug design is the inventive process of finding new medication, based on targets or proteins. Identifying new drugs is a process that involves time and money. This is where computer-aided drug design helps cut time and costs. Drug design needs drug targets that are a protein and a drug compound, with which the interaction between a drug and a target is established. Interaction, in this context, refers to the process of discovering protein binding sites, which are protein pockets that bind with drugs. Pockets are regions on a protein macromolecule that bind to drug molecules. Researchers have been at work trying to determine new Drug Target Interactions (DTI) that predict whether or not a given drug molecule will bind to a target. Machine learning (ML) techniques help establish the interaction between drugs and their targets, using computer-aided drug design. This paper aims to explore ML techniques better for DTI prediction and boost future research. Qualitative and quantitative analyses of ML techniques show that several have been applied to predict DTIs, employing a range of classifiers. Though DTI prediction improves with negative drug target pairs (DTP), the lack of true negative DTPs has led to the use of a particular dataset of drugs and targets. Using dynamic DTPs improves DTI prediction. Little attention has so far been paid to developing a new classifier for DTI classification, and there is, unquestionably, a need for better ones.
Keywords
Chemogenomics, Drug Databases, Drug Discovery, Drug Target Interactions, Machine Learning, Targets, Target Databases.
DOI: 10.9781/ijimai.2022.11.002
I. Introduction
helps understand the biological process, recognize novel drugs, and offer improved therapeutic medicine for illnesses of all sorts. Drug
Regular Issue
The new drugs developed today, though based on knowledge of existing ones, could still have adverse side effects. Incidentally, a drug developed for a particular disease may be used, quite unexpectedly, to treat another disease with no side effects whatsoever, a process referred to as drug repurposing [2], [3]. It is essential in drug discovery to establish the interaction between a drug and a target gene. The docking-based method needs a 3D structure of the target protein or gene for the process to work. The success of a newly developed drug depends on how well it fares in the market, particularly in terms of whether the purpose for which it was originally designed is being fulfilled. The possibility of successfully identifying DTI is enhanced by working on binding factors or interacting sites. This is a difficult process, given the limited information on drugs and targets. Bioinformaticians have tried to draw information from factors driving drugs and targets. The automated tools employed to improve the success rate by discovering more interactions or binding sites between drugs and their targets are intended to actively assist doctors and bioinformaticians. Scientists today work in drug development using ML predictive analysis techniques to understand drugs and targets, thus boosting DTI success prediction.

A. Drug Developing Procedure
Drugs are synthesized chemicals that control, prevent, cure and diagnose illnesses. Disease diagnosis is carried out through reading the body’s reactions to drug molecules in the form of positive biological responses. In pharmacological terms, the biomolecule whose function and activity are modified by a specific drug is termed the drug target. Biomolecules can be proteins, nucleic acids, receptors, enzymes, and ion channels. The DTI process binds the drug molecule to the active site of a biomolecule with the same structural or functional properties as the drug molecule, culminating in the creation of a new product as in Fig.1. The human body assimilates the product, resulting in a cure.

[Figure: the biomolecule binds with the compound at the binding site, the compound integrates in the active site, and a new product is created.]
Fig.1. Drug Developing Procedure.

Drugs are developed in three phases. In the first phase, a drug and its target are discovered by means of the interacting or binding site, using substrate on the active site of protein. In the second phase, the drug is subjected to animal testing for safety’s sake. In the third phase, the drug has human trials, following which it is marketed.

B. In-Silico Approaches in Drug Discovery
In-vitro is a technique where the process of drug discovery takes place in a controlled environment but not within a living organism. Here a pool of potential compounds is identified and narrowed down to find the most reliable compound for treatment. In-vivo is a technique where the process of drug discovery takes place within a living organism, by taking the reliable compounds to human trials. The data collected from both in-vitro and in-vivo studies are given as input features to the in-silico methods for drug prediction, which are computational. The computational DTI prediction method is categorized into three approaches [4].

1. Docking-Based Approach
A docking-based approach in DTI prediction requires a 3D structure for simulation. Consequently, it is not applicable where a large number of proteins is involved, as in, for instance, the G-protein coupled receptor and ion channel, whose structures are far too complex to be obtained. The simulation is significant in regard to the time taken and its overall efficiency.

2. Ligand-Based Approach
A ligand-based approach works on the premise that a drug can be predicted without the 3-dimensional structure of targets and with the existing knowledge of drugs and their targets.

3. Chemogenomics-Based Approach
A chemogenomics-based approach integrates both the chemical space of drugs and the genomic space of targets into a single pharmacological space. The challenge here is that there are too few DTI pairs and too many unknown interaction pairs.

C. Motivation and Justification
The in-vitro prediction of DTI from biological data calls for a lot of effort in the search for new drugs and targets. Identifying potential drugs and targets is a painstaking step in initiating drug discovery. Despite the plethora of research on DTI prediction in the recent past, prediction is still material-intensive and protracted. Predicting interaction between DTPs continues to challenge researchers. The motivation for this review is to help researchers in the drug development domain access state-of-the-art methods used in ML for DTI predictions, and so enhance the quality of research. To this end, several insightful articles on DTI procedures and methods that help discover new drugs and targets differently are reviewed. The machine learning (ML) techniques used to predict DTIs are studied, each with its strengths and limitations. The research is categorized, based on the ML techniques used in the prediction. Thereafter, it is qualitatively and quantitatively analyzed to understand ML and DTI better so the latter can be improved.
The contributions of this paper are as follows. Articles related to ML and DTI in drug development are studied in detail and categorized, based on the machine learning techniques deployed, as in Section III. The feature selection techniques used in DTI prediction suggest the best features for use. Articles on DTI prediction using ML techniques have described how ML manages datasets from miscellaneous databases, balances imbalanced data, handles large-scale datasets and features and, finally, examines at length the ML algorithms used in DTI prediction. The articles are qualitatively analysed in Section V, based on ML techniques, to understand their strengths and weaknesses. A quantitative analysis in Section VI follows to find the most appropriate classifiers for DTI predictions.

D. Organization of the Paper
The paper is organized as follows. Section II provides an overview of state-of-the-art methods involved in DTI prediction using ML techniques. In Section III, machine learning techniques used for DTI prediction are summarized. In Section IV, databases used for DTI prediction are discussed. In Section V, a qualitative analysis of the ML techniques used for DTI is presented. In Section VI, a quantitative analysis of DTI prediction methods is offered. Section VII discusses DTI prediction. Section VIII concludes the study and offers new directions for future research.

III. Machine Learning (ML) Techniques Used for DTI Prediction
Computational models use ML techniques for prediction because they optimize data better and perform better as well. ML techniques, which learn data without relying on previously defined formulas, are grouped into two: supervised and unsupervised learning. Supervised
International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 8, Nº6
learning predictions are based on observed existing knowledge from known data, while unsupervised learning predictions do the same without. Predictions are guesses based on existing knowledge from the data at hand. Classification, on the other hand, refers to the process of differentiating between known and unknown labels.
The objective of this paper is to explore ML techniques involved in improving DTP identification to find DTIs. The identification of a new drug involves the drug and its target. Because both drugs and targets have a large number of features, extracting them manually would be time-consuming, so researchers use tools like ChemCPP, EDragon, CDK, Open Babel, RDKit and PADEL for extracting features from drugs, and Protr, SPICE, Propy, ProtDcal and ProtParam for extracting features from targets. Drug and target features are extracted and concatenated with each other to form DTPs. The pairs are analyzed for interaction prediction; specifically, to observe whether or not the DTPs interact. The ML techniques analyzed are explained qualitatively and quantitatively and the classifier used for DTI prediction is found. The DTI prediction here mainly uses a static database. Prediction can be improved when there are more targets and drugs with the interaction between them yet to be ascertained. In recent times, computer-aided drug design (CADD) has been used to develop drugs for immunodeficiency syndrome, influenza virus infection, glaucoma and lung cancer [5]. CADD helps in pharmacological, pharmacodynamic and in-silico toxicity prediction, which identifies or filters inactive or toxic molecules [6], and naturally gets ML involved in DTI prediction strategies [7]-[10]. Thus, to improve drug development, various methods based on drugs and targets are developed using ML techniques. Fig.2 shows DTI prediction through ML techniques with targets and drugs taken from diverse databases. Drug and target features are extracted using a slew of tools or web servers. Subsequently, the most influential features alone are selected and used for DTI prediction with several ML classifiers to complete the process.

[Fig.2 flowchart: Targets and Drugs, Feature Extraction, Feature Selection, Drug Target Pairs, Training Set and Testing Set, Classifier, Prediction of DTI.]

In-silico methods include machine learning, data mining, network analysis tools and data analysis tools, Quantitative Structure Activity Relationship (QSAR), pharmacophores and homology modeling. Here, the machine learning technique is more feasible than all other methods for working with drug discovery data for analysis. The trending research in drug discovery is “Identification of screening hits (compounds)”, which helps in finding the particular compounds that target with more potency at different levels, such as binding, reduced side effects and efficiency, and that also increase the life of patients by changing the function of the biomolecule.

A. Chemogenomics-Based Machine Learning (ML) Techniques for DTI Prediction
Chemogenomics-based prediction is carried out computationally using ML-based, graph-based or network-based methods. The ML-based methods are shown in Fig.3.

[Figure: chemogenomics-based ML techniques comprise similarity-based, matrix-based, feature-based, network-based and deep learning-based methods.]
Fig.3. Chemogenomics based ML Techniques.

1. Similarity-Based Methods
The most commonly used DTI prediction methods use drug and target similarity measures in tandem with the distance between each drug and its targets [11]-[18]. These methods use the drug, target and drug-target interaction similarity scores based on prior knowledge of their interaction similarity. The similarity is obtained using a distance function like the Euclidean. For instance, if the following function is employed for the nearest neighbor algorithm, assuming two vectors $x_1$ and $x_2$, the distance between the vectors is found using equation (1) as

$$D(x_1, x_2) = \lVert x_1 - x_2 \rVert = \sqrt{\langle x_1 - x_2,\; x_1 - x_2 \rangle} \qquad (1)$$

where $x_1$ and $x_2$ have the same dimension and the distance is calculated using the Euclidean norm and the inner product. The similarity between a drug and a target is given through the pharmacological similarity of the drug, the genomic similarity of the protein sequence, and the topological properties of a multipartite network of previously known drug-target interaction knowledge. The disadvantage of these methods is that they use knowledge drawn from a small quantum of labelled data, while there exist large quanta of unlabeled data.

2. Matrix-Based Methods
Several studies [19]-[24] have shown that matrix-based methods outperform the rest in DTI prediction. The interaction matrix is

$$X_{ij} = \begin{cases} 1, & \text{if drug } i \text{ interacts with target } j \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

for $i = 1, \dots, m$ and $j = 1, \dots, n$.
The first move in DTI prediction is to break down the matrix $X_{m \times n}$ into two matrices, $Y_{m \times k}$ and $Z_{n \times k}$, where $X \approx YZ^{T}$ with $k < m, n$, and where $Z^{T}$ denotes the transpose of $Z$. This process of factorizing matrices into lower order makes it easier for the matrix-based approach to deal with missing data. With these methods, however, the distance between the drug and target appears to be the same and establishes the strength of the interaction between them, embedding them in a low-dimensional matrix. The reliability of these methods is affected when the drug and target data increase in volume, impacting the capacity to find their interaction.
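As an illustration of the factorization X ≈ YZ^T used by matrix-based methods, the following is a minimal sketch only. It is not the algorithm of any of the reviewed papers: the toy interaction matrix, the function names `factorize` and `reconstruct`, the hyperparameters, and the choice of plain stochastic gradient descent with L2 regularization are all assumptions made for illustration. Every entry is treated as observed, whereas the reviewed methods handle unknown pairs in their own ways.

```python
import random

def factorize(X, k=2, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Approximate X (m x n) by Y (m x k) times Z^T, where Z is n x k.

    Hypothetical helper: plain SGD on squared error with L2 regularization.
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    Y = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(m)]
    Z = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n)]
    for _ in range(steps):
        for i in range(m):
            for j in range(n):
                pred = sum(Y[i][f] * Z[j][f] for f in range(k))
                err = X[i][j] - pred
                for f in range(k):
                    y, z = Y[i][f], Z[j][f]  # keep old values for both updates
                    Y[i][f] += lr * (err * z - reg * y)
                    Z[j][f] += lr * (err * y - reg * z)
    return Y, Z

def reconstruct(Y, Z):
    """Return the m x n matrix Y Z^T of predicted interaction scores."""
    return [[sum(yf * zf for yf, zf in zip(y_row, z_row)) for z_row in Z]
            for y_row in Y]

# Toy interaction matrix (assumed data): rows are drugs, columns are targets.
toy_X = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]

Y, Z = factorize(toy_X, k=2)
scores = reconstruct(Y, Z)
for row in scores:
    print([round(v, 2) for v in row])
```

Here k = 2 is smaller than both m = 4 and n = 3, so each drug and each target is embedded as a 2-dimensional vector, and the inner product of a drug vector with a target vector serves as the predicted interaction score.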
Source | ML Tech | Dataset | Pre processing/Feature Extraction | Feature Selection | Validation | Strength | Weakness | Outcome
Reference [11] (2011) | Logistic Regression (LR) | 250 Proteins, 315 Drugs | - | Wrapper Feature Selection | 10 Fold CV | Lists the selected features | Only 10 features are considered | Targets of 307 drugs are predicted
Reference [12] (2012) | Bipartite Local Model-NII (BLM-NII) | BM Dataset | - | - | LOOCV and 10 Fold CV | NII procedure for finding drugs and targets | Whenever new drug or target is given as input it is not considered as there is no training data | 57 % of DTI has been predicted
Source | ML Tech | Dataset | Pre processing/Feature Extraction | Feature Selection | Validation | Strength | Weakness | Outcome
Reference [19] (2009) | BRDTI | BM Dataset | - | - | 5 Trials of 10 Fold CV | Incorporates target bias and context alignment for drug and target similarities | More survey based on DTI is to be done for better prediction. | DTI leads to Drug repurposing and adverse drug reaction prediction
Reference [20] (2012) | KBMF | BM Dataset | - | - | 5 Fold CV | Interaction score is generated using factorization methods | Better for only 12 low dimensional projection | Similarity based DTIs.
Reference [21] (2017) | MLRE | 608 protein, 326 drugs, 114 interactions | Structural view and chemical view of drug are extracted | - | 5 Fold CV | Preserving the point wise linear regression | Noisy observation leads to disagreement data | Predict interaction based on chemical view with SVM and graph based methods
Reference [22] (2017) | VB-MK-LMF | BM Dataset | - | - | 5 Trials of 10 Fold CV | DTI matrices are linked to weighted common observations | Works well for mid-sized datasets | DTI predicted
Reference [23] (2018) | Pseudo SMR | BM Dataset | Extraction of Pseudo AAC | - | 5 Fold CV | Uses extremely randomized tree methods and it is computationally more efficient | Uses only Pseudo AAC Descriptors. | Predicted 15 Potential DTIs.
BM Dataset - Bench Mark Dataset, CV- Cross Validation, AAC –Amino Acid Composition.
Source | ML Tech | Dataset | Pre processing/Feature Extraction | Feature Selection | Validation | Strength | Weakness | Outcome
Reference [25] (2011) | Regularized Least Square | BM Dataset | - | - | LOO CV and 5 Trials of 10 Fold CV | Combining GIP with target kernel and drug kernel | Increase kernel with more information about DTI | 15 known interaction was predicted
Reference [26] (2016) | Krons-Regularized Least Square | BM Dataset | Replaced missing values with mean of data | - | 5 Fold CV | Incorporates both known and unknown interaction and make a general purpose learner | Balancing the data is not considered | Prediction of interval as measure of confidence
Reference [27] (2016) | Weighted SVM | BM Dataset | Structural similarity, Gene Function similarity was extracted | - | 5 Fold CV | Finds some unlabeled sample as negative sample and also considers positive samples beneath unlabeled samples | Asks for using structure but we cannot get structure for all the targets | Predicts Interaction and listed 3 top known interaction
Reference [28] (2016) | Ensemble learning | 5877 Drugs, 3348 Targets, 12674 DTI | PROFEAT for Target and Rcpi for Drug | - | 5 Fold CV | Ensemble learning to address issues of class imbalance | Oversampling is done which increases noise | Predicted more than 20 Known DTI
Reference [32] (2017) | Adaboost | BM Dataset | PSSM for target and SMILE for drug were extracted | Sequential Forward Feature Selection (SFFS) | 5 Fold CV | Balanced Data using RUS and CUS techniques | Not considered domain features | Listed top 10 known interaction
Reference [33] (2018) | Bagging based ensemble | 5877 Drugs, 3348 Targets, 12674 DTI | PROFEAT for Target and Rcpi for Drug | - | 10 Fold CV | Considered class imbalance and used Neighbourhood balanced bagging for balancing the data and active learning strategy is used | Not discussed about Features | 14 out of 16 known interactions have been detected.
BM dataset - Bench Mark dataset, CV - Cross Validation, LOOCV - Leave One Out Cross Validation, PROFEAT - PROtein FEATures, AAC - Amino Acid Composition, Rcpi - R package for extracting features for compound protein interaction.
Target Clustering (STC). Buza K [15] proposed a K-nearest neighbor (KNN)-based method with hubness-aware classification and error correction to minimize the detrimental effect of bad hubs (EcKNN-KNN with error correction). Zhang et al. [16] posited a framework that develops a drug-drug linear neighbourhood, calculates the similarities, and predicts the drug-target interaction profile through label propagation (LPLNI-Label Propagation with Linear Neighbourhood Information). Zhang et al. [17] developed a clustering algorithm by incorporating drug and target data from structural and chemical viewpoints with existing knowledge of interactions (MDTI-Multiview DTI). Shi and Li [18] advanced an improved Bayesian ranking DTI method that adds weights for unknown drugs and targets using weighted neighboring drugs and targets (WBRDTI-Weighted Bayesian Ranking DTI).

B. Review of Literature for Matrix-Based Methods
Matrix-based methods use matrix similarity for DTI prediction. Rendle et al. [19] proposed an algorithm based on the Bayesian Personalized Ranking (BPR) matrix factorization which incorporates drug and target similarities to predict DTIs (BPRDTI). Gonen [20] proposed a method to factorize the matrices with interaction score matrix so as to find new drugs and targets and determine their interaction using kernelized Bayesian matrix factorization (KBMF). Li et al. [21] introduced an algorithm to find a low-rank representation embedding (LRE) technique and fix errors in point wise linear reconstruction. This was done to obtain a different view of the structural and chemical features of drugs and targets as Single view
Source | ML Tech | Dataset | Pre processing/Feature Extraction | Feature Selection | Validation | Strength | Weakness | Outcome
Reference [35] (2012) | Network-based Random Walk with Restart on the Heterogeneous network (NRWRH) | BM Dataset | - | - | LOOCV | Used RWR to get potential DTI using bipartite graph network | Leaves the target which has no drug; it is considered as zero matrix | 29 new DTI were predicted
Reference [36] (2013) | Network-Consistency-based Prediction Method (Net CBP) | BM Dataset | - | - | Not discussed properly | DTI predicted using bipartite graph network | Considered as zero matrix | Listed out several DTI
Reference [37] (2015) | Normalized Multi information Fusion | BM Dataset | - | - | Not discussed properly | Integrates robust PCA with biological information | In order to improve performance more negative dataset to be built to find the interactions. | Predicts interaction
Reference [38] (2015) | Random Walk Restart (RWR) | 467 Targets, 544 Drugs | - | - | - | RWR on heterogeneous network using chemical features | Considered only fingerprints features for drugs | 110 drugs predicted for 3419 targets
Reference [39] (2018) | IN - Random Walk with Restart (RWR) | 12015 Drug, 1895445 Target | - | Principal Component Analysis (PCA) | 5 Fold CV | Used both labelled and unlabeled data for prediction | Data is imbalanced | Predicts interaction between drug and targets
Reference [40] (2019) | Neighbourhood Regularized Logistic Matrix Factorization (NRLMF) | BM Dataset | Calculates similarities of drugs and targets | - | 10 Fold CV | Improved using rescoring matrix | Not more parameters are considered | Predicts interaction but not listed
BM Dataset - Bench Mark Dataset, CV- Cross Validation, LOOCV-Leave One Out Cross Validation.
LRE and Multiview LRE, respectively (LRE). Bolgar et al. [22] developed a method integrating multiple kernels, weights, and graphs, all regularized to model the probability of DTI prediction (VB-MK-LMF). Huang et al. [23] propounded an extension of the structure activity relationship classification by implementing the extremely randomized tree (ERT) using the pseudo substitution matrix representation (SMR) of the target (Pseudo-SMR). Marta et al. [24] proposed a local model-agnostic approach for interaction prediction.

C. Review of Literature for Feature-Based Methods
Feature-based methods consider drug and target features for DTI prediction. Van Laarhoven et al. [25] proposed an algorithm that integrates the DTI network information with the Gaussian Interaction Profile kernel using the Regularized Least Square (RLS). Ezzat et al. [26] developed a framework for DTI prediction using the voting of the decision tree, random forest, STACK and Laplacian Eigen base classifiers, and also considered imbalanced classes for prediction. Nascimento et al. [27] advanced a method that incorporates both known and unknown interaction data using the RLS. Lan et al. [28] developed a framework for DTI prediction by taking unlabeled samples using the weighted SVM (PUDT-Positively Unlabeled Drug Targets). Li et al. [29] proposed a method to find DTIs as a structure activity relationship (SAR) classification with the principal component analysis (PCA), using the Discriminative Vector Machine (DVM). Ohue et al. [30] proposed an approach that uses virtual screening and the Pairwise Kernel Method (PKM). Zhang et al. [31] proposed an ensemble-based approach for a random projection ensemble (RPE) of the REP tree algorithm (Drug RPE). Rayhan et al. [32] developed a model using targets in the form of a matrix (position-specific scoring matrix - PSSM) and drug molecule features for DTI prediction using the AdaBoost classifier (iDTI-EsBoost). Sharma and Rani [33] proposed an ensemble (Bagging-Ensemble) model that uses active learning methodology to predict DTIs (BE-DTI).

D. Review of Literature for Network-Based Methods
These methods use networks of similar drugs and targets for DTI prediction. Cheng et al. [34] proposed a bipartite Network Based Inference (NBI) method for DTI prediction. Chen et al. [35] developed an RWR framework to get potential DTIs using a bipartite graph network (NRWRH-Network-based Random Walk with Restart on Heterogeneous network). Chen et al. [36] used this method for both labelled and unlabeled data DTI prediction (NETCBP-Network Consistency-based Prediction). Peng et al. [37] proposed a method that incorporates the PCA to reduce dimensions and integrate data from multiple drug and target sources for DTI prediction (NMIF-Normalized Multi-Information Fusion). Seal et al. [38] proposed a model that needs matrix inversion and score of relevance between two nodes in a weighted graph of DTIs (RWR-Random Walk with Restart). Huang et al. [39] proposed a 2-network-based rank algorithm that involves the random walk and bipartite graph (IN-RWR-intra network with Random Walk). Ban et al. [40] developed a method based on improving the NRLMF algorithm by calculating the NRLMF scores as the expected beta distribution values.
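Several of the network-based methods above rely on a random walk with restart (RWR) over a drug-target graph. The following is a minimal sketch of the generic RWR iteration only, not a reimplementation of NRWRH, IN-RWR or any other reviewed method: the toy bipartite graph, the function name `random_walk_with_restart`, and the restart probability of 0.3 are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that starts at
    node `seed` and jumps back to it with probability `restart` at each step.

    `adj` is a symmetric adjacency matrix given as a list of lists.
    """
    n = len(adj)
    # Column-normalize the adjacency matrix to obtain transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [0.0] * n
    p0[seed] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(W[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p

# Toy bipartite graph (assumed data): nodes 0-1 are drugs d0, d1;
# nodes 2-4 are targets t0, t1, t2.
# Known interactions: d0-t0, d0-t1, d1-t1, d1-t2.
toy_adj = [
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]

scores = random_walk_with_restart(toy_adj, seed=0)
names = ["d0", "d1", "t0", "t1", "t2"]
for name, score in zip(names, scores):
    print(name, round(score, 4))
```

In this toy graph the walker starts from drug d0. Target t2 has no known interaction with d0 but still receives a non-zero score through the shared target t1 and drug d1, which is how RWR-style methods rank candidate interactions.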
TABLE VII. Qualitative Analysis of the Articles Using Deep Learning-Based Methods
Source | ML Tech | Dataset | Pre processing/Feature Extraction | Feature Selection | Validation | Strength | Weakness | Outcome
Reference [45] (2019) | Deep Convolution-DTI | 3675 Targets, 11950 Drugs, 32,568 DTI | - | t-distributed stochastic neighbor embedding (t-SNE) | 5 Fold CV | Similarity acts as an informative descriptor | Considers only CTD descriptors of targets | Predicts interaction
BM Dataset - Bench Mark Dataset, CV- Cross Validation, CTD – Composition, Transition and Distribution, PSSM - Position Specific Scoring Matrix, PubChem
– PubChem is a Chemical Information database.
Beta distribution value is calculated using the interaction information and NRLMF score (NRLMF-beta).

E. Review of Literature for Deep Learning-Based Methods
Deep learning-based methods use the drug and target features for DTI prediction. Wen et al. [41] proposed a method that takes raw target and drug features using a deep belief network (DBN) and predicts DTI in drugs approved by the Food and Drug Administration (DeepDTIs). Ozturk et al. [42] proposed a DTI prediction model using target sequences and drug molecules to predict drug target binding affinity (DeepDTA). Wang et al. [43] developed a computational model using a stacked auto encoder for DTI prediction (AUTO-DNP). You et al. [44] presented a method based on protein and drug features with the LASSO regression model in tandem with the deep neural network (DNN) to predict DTI (LASSO-DNN). Lee et al. [45] proposed a DTI prediction model using local protein residue patterns in DTI (DeepConv-DTI).

VI. Quantitative Analysis of Machine Learning Techniques in DTI Prediction
Quantitative analysis is applied to determine the best prediction performance method, using different ML techniques with appropriate metrics. The prediction method must deal with the steps of data pre-processing and feature selection, as well as drug and target integration. The best machine learning prediction method includes the hyper parameters and association index for DTI prediction. Of the various ML techniques [11]-[44] available, the best is chosen for prediction. Tables X-XIV depict the quantitative analysis of the results of several ML methods in DTI prediction that help enhance performance.

A. Performance Metrics
A confusion matrix is used to calculate performance measures from test set values in terms of true positives, true negatives, false positives and false negatives among classes that are to be classified as integrates or not integrates. Table VIII shows the confusion matrix for DTI and Table IX the performance metrics used. Integrates here refers to drugs that produce a positive DTP result, that is, the integrating drug can be used to treat a target it integrates with. The converse is true with non integrates, which refers to drugs that produce a negative DTP result, that is, the non integrating drug cannot be used to treat a target it does not integrate with.

TABLE VIII. Confusion Matrix
 | Integrates | Non Integrates
Integrates | True Positive | False Positive
Non integrates | False Negative | True Negative

TABLE IX. Performance Metrics Used in DTI Prediction
S. No | Metrics Used | Formula | Metrics Description
1. | Accuracy | (TP+TN)/(TP+TN+FP+FN) | Accuracy is the ratio of correct predictions out of the total number of predictions
2. | Sensitivity/Recall | TP/(TP+FN) | Measure of quantity
3. | Precision | TP/(TP+FP) | Measure of quality
4. | AUC | False Positive vs. True Positive | Curve shows the relation between False Positive and True Positive
5. | AUPR | Precision vs. Recall | Curve shows the relationship between the Precision and Recall
6. | MCC | (TP×TN - FP×FN)/√((TP+FP)(TP+FN)(TN+FP)(TN+FN)) | Matthews Correlation Coefficient
7. | F1 Score | TP/(TP+(FP+FN)/2) | Harmonic average of Precision and Recall
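The count-based measures in Table IX can be computed directly from the four confusion-matrix entries. The sketch below uses the standard definitions of accuracy, recall, precision, F1 score and the Matthews correlation coefficient. The function name `dti_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies. AUC and AUPR are omitted because they require ranked prediction scores rather than counts.

```python
import math

def dti_metrics(tp, fp, fn, tn):
    """Compute count-based performance metrics from confusion-matrix entries."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical counts for a test set of 100 drug target pairs.
example = dti_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Because known interactions are usually far fewer than unknown pairs in DTI data, accuracy alone can be misleading, which is one reason measures such as MCC, F1 and AUPR are reported alongside it.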
E-Enzyme, IC- Ion Channel, G- G-Protein Coupled Receptor (GPCR), N- Nuclear Receptor, AUC-Area Under Curve, AUPR- Area Under Precision Recall, PPV-
Positive Predicted Values, MCC- Mathew’s Correlation Coefficient, nDCG-normalized Discounted Cumulative Gain.
TABLE XII. Quantitative Analysis of the Feature-Based Methods Used in DTI Prediction
E-Enzyme, IC- Ion Channel, G- G-Protein Coupled Receptor (GPCR), N- Nuclear Receptor, AUC-Area Under Curve, AUPR- Area Under Precision Recall, PPV-
Positive Predicted Values, MCC- Mathew’s Correlation Coefficient.
TABLE XIII. Quantitative Analysis of the Network-Based Methods Used in DTI Prediction
TABLE XIV. Quantitative Analysis of the Deep Learning-Based Methods Used in DTI Prediction
VII. Discussion
The analysis shows that the chemogenomics-based approach is well suited to DTI prediction. A review of the qualitative and quantitative analyses offers an overview of the datasets, preprocessing, feature selection techniques, validation methods and ML classification techniques used in DTI prediction, all of which are discussed in this section.
A. The Dataset
The benchmark Yamanishi et al. dataset [71] is the most widely used in DTI prediction, with its four classes, enzyme (E), ion channel (IC), G-protein coupled receptor (GPCR) and nuclear receptor (NR), and the positive DTI pairs of each class. Apart from the benchmark dataset, others are used as well [11], [17], [21], [26], [31]. Deep learning-based prediction works with more dynamic data. An attempt has been made in [44] to construct a negative DTI dataset, which is significant in that it facilitates the assimilation of targets not otherwise taken into the prediction process. The number of instances used, which ranges from 250 to 5500, may be increased or decreased, depending on the purpose of the research.
B. Preprocessing and Balancing Techniques
Major issues in DTI prediction arise from data obtained from miscellaneous sources, which may have different ranges of values or none at all. Missing values are inferred from the observed values in the data. Preprocessing techniques are, generally speaking, not applied to the data because they are curated when collected from different sources. When the data are integrated, however, values may go missing or be replaced, and there is thus a need for preprocessing. The preprocessing employed in [26] replaces missing values with the mean values of the data. Employing preprocessing techniques like data cleaning enhances the quality of the data for further processing.
From the qualitative analysis in Tables III-VII, it is found that the dataset used in the prediction process is unbalanced, which may affect the performance of the classifiers. Balancing techniques include oversampling [26], [32], [33], though it increases negative outcomes. For DTI prediction, undersampling may be suggested to improve the positive outcomes.
C. Feature Extraction Methods
Feature extraction reduces the dimensionality of the input by creating a new set of features from the original ones. The new set retains the important information in the data while reducing its dimension, which increases the speed of learning and improves the generalization of machine learning models. It can be carried out with a variety of available tools. In drug discovery, the tools currently in wide use are PROFEAT and Protr for protein feature extraction, and Rcpi and PaDEL-Descriptor for drug feature extraction. Research that uses these tools for feature extraction includes [28], [33].
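As a brief illustration of the preprocessing and balancing steps discussed in Section B, the following minimal Python sketch fills missing values with column means and then applies random undersampling to the majority class. It is not the procedure of any cited work: the function names `impute_mean` and `undersample`, the toy feature rows and the labels are all hypothetical, and the sketch assumes binary labels (1 for an interacting pair, 0 otherwise) with missing values encoded as `None`.

```python
import random
from statistics import mean


def impute_mean(rows):
    """Replace None entries in each column with that column's observed mean.

    Assumes every column has at least one observed (non-None) value.
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: two features per drug-target pair, two values missing.
features = [[0.2, None], [0.4, 1.0], [None, 3.0],
            [0.6, 2.0], [0.8, 2.0], [1.0, 2.0]]
labels = [1, 0, 0, 0, 1, 0]  # 2 positive pairs, 4 negative pairs

filled = impute_mean(features)
balanced_x, balanced_y = undersample(filled, labels)
print(filled)
print(balanced_y)
```

Undersampling discards data, so on small benchmark classes it may be preferable to repeat the sampling with several seeds and average the results; that choice is left open here.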
D. Feature Selection Methods
Feature selection is of fundamental importance, because the extracted features increase data dimensions and result in problems with overfitting. Feature selection techniques reduce the number of features by selecting the most important ones from the given input. It is clear from the analysis that target features can be categorized into three groups: structural, evolutionary and sequence. Drug features are structural. The number of target features considered varies from 1080 to 1498. Likewise, drug features vary, depending on whether they are 1D or 2D and on the fingerprint of the drugs selected.
The similarity-based methods [11]-[18] summarized in Tables III-VII consider only drug-drug, target-target and drug-target similarities for DTI prediction, on the assumption that similar drugs interact with similar targets. In similarity-based methods, therefore, drug-based and target-based features are treated as unimportant for DTI prediction. Further, similarity-based methods do not handle large-scale datasets. Matrix-based methods [19]-[23] consider only drug and target similarities, and no other features are used for prediction. Matrix-based methods also handle only small-scale datasets.
Of the feature-based methods used in [25]-[33], the Sequential Forward Feature Selection (SFFS) technique is applied in [33], where the different feature sets considered are added sequentially, one by one, and evaluated on the dataset. It is observed that the structural feature, which is one of the most influential target features, plays a significant role in DTI prediction, although this may vary with the dataset used. Finding the most influential features is central to feature selection.
The network-based methods in [34]-[40] take different sets of features and handle them by selecting the most important drug and target features. Compact feature learning is undertaken in [39] by applying Diffusion Component Analysis (DCA), which constructs a low-dimensional vector representation for each drug and target using diffusion distributions. It helps find the most interpretable features. The deep learning-based methods discussed in [41]-[45] use the t-distributed Stochastic Neighbor Embedding (t-SNE) technique to reduce input feature dimensionality. Deep learning-based methods consider dynamic data and dynamic features. The Convolutional Neural Network (CNN) used in [45] handles features with ease and finds the most potent ones. Given that deep learning-based methods deal well with large-scale datasets, future research that applies deep learning is likely to improve DTI prediction.
E. Validation Methods
The qualitative analysis shows that 10-fold cross-validation (CV) and 5-fold CV offer better results than other CV techniques like Leave-One-Out CV (LOOCV) and the jackknife. Approaches using LOOCV have problems with overfitting. DTI predictions are evaluated using AUC and AUPR values. The AUC values of the classifiers are better when 10-fold CV is used to validate the methods. AUC is chosen because it measures how well the model distinguishes between classes and reflects the model's capacity even when the dataset is imbalanced.
F. ML Techniques Engaged in DTI Prediction
The qualitative analysis in Tables III-VII lists the various classifiers used and how each fares against the rest at DTI prediction. Ranking algorithms like Bayesian ranking are used to rank DTIs [20]. The SVM classifier [19], [22], which handles target and drug features by calculating them separately and so reduces prediction complexity, cannot determine the relationship between the features and may produce a large number of false positives. The KNN [18], [20] falls short, performance-wise, in its inability to handle features and large-scale datasets. Ensemble learning [27] handles large-scale and high-dimensional data. The AdaBoost classifier separates the data and classifies them to obtain the most appropriate features [32]. The decision tree manages missing data thoroughly and uses diversity to learn features based on instances for improved accuracy [33]. Logistic regression [11], [16] works effectively with data integration strategies. The DVM [29] influences features strongly in its handling of outliers. As far as feature-based methods are concerned, the random forest outperforms the rest, while regularized least squares (RLS) performs well in tandem with more influential features. In terms of performance, WBR-DTI, VB-MK-LMF, NRLMF-beta and CNN find the best features for DTI prediction.
From the quantitative analysis in Tables X-XIV, the progress made is evaluated using AUC values, with marked improvements in the SVM from 61.7% [19] to 96.34% [22], the KNN from 92.3% [18] to 95.4% [20], and LR from 85.1% [11] to 95.32% [16]. Among the classifiers used in DTI prediction, the SVM shows the largest improvement, 34.64 percentage points. The random forest and decision tree used in ensemble learning give an AUC value of 90%. Adaptive Boosting and RLS give AUC values of 88.7% and 97%, respectively. WBR-DTI and VB-MK-LMF give an AUC value of 98%, while NRLMF-beta gives 96%.
However, the results depend on the data given as input. A newly developed model may perform poorly with imbalanced data and missing values. The qualitative analysis in Tables III-VII shows that the dataset has more negative than positive predictions, owing to the nature of the dataset used for DTI prediction. The quantitative analysis in Tables X-XIV shows that matrix factorization-based methods perform best for DTI prediction, though deep learning-based methods handle large-scale data and find the most influential features. Some of the papers also shed light on related tasks such as detecting adverse drug reactions [72]. This review has thus laid out a thorough understanding of datasets, feature selection methods and validations, as well as a comparison of the classifiers used for DTI prediction.
VIII. Conclusion and Future Scope
It is concluded from the review that much research has focused chiefly on chemogenomics, because DTIs can be predicted from drug and target features and similarities without requiring their structures. The method works well by finding the most influential features using a range of classifiers for DTI prediction. The classifiers use only known static interactions for training the model, given that the interaction data are static. Though static data have largely been used as a benchmark dataset for interaction prediction, dynamic data may be considered so that the problem of predicting new DTIs is resolved. Several studies have only considered target features (like the AAC, CTD and pseudo AAC) and the PubChem fingerprint for drugs. There are, therefore, plenty of research opportunities to predict drugs using the influence of all the features. Influential features may vary from one technique to another. Finding influential features takes time, however, since one feature may not be as important for prediction as another. More data are to be considered for finding the most influential features, which is possible with the introduction of big data for prediction. The ML techniques used by the deep learning-based and matrix-based methods were found to predict DTI better than others.
It is recommended, considering the above, that future researchers focus on building a negative dataset for interaction prediction. Feature scaling or feature engineering techniques may be applied to enhance the dataset. New databases can be created by collecting data from numerous sources and incorporating appropriate parameters or influential features for future research. Further, future models developed for DTI prediction should consider every feature for drug prediction. The model developed, based on ML techniques, should be able to update information on drugs and targets constantly for new interaction prediction. Thus, the model must be able to predict interaction, based on prior knowledge, without having to be retrained on every occasion. Such a model is likely to offer the best interaction prediction.
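As a brief illustration of the validation procedure discussed in Section E, the following minimal Python sketch computes AUC as the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair, and generates k-fold train/test index splits. The function names `roc_auc` and `k_fold_indices`, the labels and the scores are hypothetical and are not taken from any of the reviewed methods. The splits are contiguous and unshuffled for simplicity, whereas real studies would typically shuffle or stratify.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical predicted scores for 2 interacting and 6 non-interacting pairs.
labels = [1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.7, 0.4, 0.35]
auc = roc_auc(labels, scores)

# Doubling the negatives (same score distribution) leaves the AUC unchanged.
neg_scores = [s for y, s in zip(labels, scores) if y == 0]
auc_imbalanced = roc_auc(labels + [0] * len(neg_scores), scores + neg_scores)

folds = list(k_fold_indices(10, 5))
print(auc, auc_imbalanced, len(folds))
```

The second AUC value shows why the metric is attractive for DTI data: changing the ratio of negatives to positives, without changing how the scores are distributed within each class, does not alter it.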
International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 8, Nº6
T. Idhaya
T. Idhaya received the M.Sc. degree in Computer Science
from St. Xavier’s College (Autonomous), Tirunelveli,
India, in 2016. She completed her M.Phil. degree at
Manonmaniam Sundaranar University, Tirunelveli, India,
in 2017. She is currently pursuing her Ph.D. degree at
Manonmaniam Sundaranar University, Tirunelveli, India.
Her areas of interest are image processing, machine learning
and big data.
S. P. Raja
S. P. Raja was born in Sathankulam, Tuticorin District,
Tamilnadu, India. He completed his schooling at Sacred
Heart Higher Secondary School, Sathankulam, Tuticorin,
Tamilnadu, India. He completed his B.Tech. in Information
Technology in 2007 at Dr. Sivanthi Aditanar
College of Engineering, Tiruchendur. He completed his
M.E. in Computer Science and Engineering in
2010 at Manonmaniam Sundaranar University, Tirunelveli. He completed
his Ph.D. in 2016 in the area of image processing at Manonmaniam
Sundaranar University, Tirunelveli. He is currently working as an Associate
Professor in the School of Computer Science and Engineering at Vellore
Institute of Technology, Vellore, Tamilnadu, India. He has published 75 papers
in international journals, 24 in international conferences and 12 in national
conferences. Dr. Raja is an Associate Editor of the Journal of Circuits, Systems
and Computers, Computing and Informatics, International Journal of Interactive
Multimedia and Artificial Intelligence, Brazilian Archives of Biology and
Technology, International Journal of Image and Graphics, and International
Journal of Biometrics.