
REVIEW


Deep Learning in Proteomics


Bo Wen,* Wen-Feng Zeng, Yuxing Liao, Zhiao Shi, Sara R. Savage, Wen Jiang, and Bing Zhang*

Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.

1. Introduction

Mass spectrometry (MS) has been widely used for both untargeted and targeted proteomics studies. For untargeted proteomics, all proteins extracted from a sample are digested into peptides and then injected into a liquid chromatography-tandem mass spectrometry (LC-MS/MS) system for detection using the data-dependent acquisition (DDA) method or the data-independent acquisition (DIA) method. In contrast, targeted proteomics only detects selected proteins of interest using the multiple reaction monitoring (MRM) method (also known as selected reaction monitoring) or the parallel reaction monitoring (PRM) method. With advancements of both LC and MS technologies in recent years, large volumes of MS/MS data have been generated. A typical DDA or DIA experiment can produce hundreds of thousands of MS/MS spectra. Sophisticated algorithms and tools are required for raw data processing, data quality control, peptide and protein identification and quantification, post-translational modification (PTM) detection, and downstream analyses. Due to these computational requirements, machine learning methods have been widely used in many aspects of proteomics data analysis.[1–3]

Deep learning is a sub-discipline of machine learning. It has advanced rapidly during the last two decades and has demonstrated superior performance in various fields including computer vision, speech recognition, natural-language processing, bioinformatics, and medical image analysis. Deep learning is based on artificial neural networks with representation learning that aim to mimic the human brain. The key difference between deep learning and traditional machine learning algorithms such as support vector machines (SVM) and random forests (RF) is that deep learning can automatically learn features and patterns from data without handcrafted feature engineering. Therefore, deep learning is particularly suited to scientific domains where large, complex datasets are available.

Deep learning has already been applied to various aspects of biological research, including analyses of medical image data, gene expression data, and DNA and protein sequence data.[4] A number of reviews have been published to provide an overview of deep learning applications in biomedicine,[5] clinical diagnostics,[6] bioinformatics,[7] and genomics.[8]

The aim of this paper is to provide the proteomics community a comprehensive overview of deep learning applications for the analysis of proteomics data. We first introduce fundamental concepts in deep learning. We then present a survey of major applications including retention time (RT) prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex (MHC)-peptide binding prediction, and protein structure prediction (Figure 1). Finally, we discuss future directions and limitations of deep learning in proteomics.

B. Wen, Dr. Y. Liao, Dr. Z. Shi, Dr. S. R. Savage, W. Jiang, Prof. B. Zhang
Lester and Sue Smith Breast Center, Baylor College of Medicine
Houston, TX 77030, USA
E-mail: [email protected]; [email protected]

B. Wen, Dr. Y. Liao, Dr. Z. Shi, Dr. S. R. Savage, W. Jiang, Prof. B. Zhang
Department of Molecular and Human Genetics, Baylor College of Medicine
Houston, TX 77030, USA

Dr. W.-F. Zeng
Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences
Beijing 100190, China

The ORCID identification number(s) for the author(s) of this article can be found under https://doi.org/10.1002/pmic.201900335

© 2020 The Authors. Proteomics published by Wiley-VCH GmbH. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

DOI: 10.1002/pmic.201900335


Figure 1. Overview of the key components of deep learning and its applications in proteomics.

2. Basic Concepts in Deep Learning

Deep learning seeks to learn the representation of data through a series of successive layers of increasing abstraction.[9] These layered representations are learned via models called artificial neural networks (ANNs). ANNs, in which many simple units called neurons are connected to each other with different weights, simulate the mechanism of learning in the human brain. These weights serve the same role as the strengths of synaptic connections in biological organisms. Training a neural network requires the following components: training samples with input data (e.g., peptide sequences) and matching targets (e.g., retention times of the peptides), a network model, a loss function, and an optimization method. The network model, with multiple layers connected together, maps the input data to predictions. A loss function then computes a loss value, which measures how well the network's predictions match the expected outcomes by comparing these predictions with the targets. The optimization method uses this loss value as a feedback signal to incrementally adjust the weights of the network connections in order to optimize the model. This method of finding optimal weights for the neural network is called backpropagation.[9] The target variable can be categorical or continuous. Whereas the former corresponds to classification problems, the latter corresponds to regression problems.
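For illustration, the following minimal Keras sketch wires these components together for a toy regression task (random data and arbitrary layer sizes chosen only for this example; it is not taken from any tool reviewed below):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy training samples: 100 "peptides" encoded as flat numeric vectors,
# each with a matching continuous target (e.g., a retention time).
x_train = np.random.rand(100, 600).astype("float32")
y_train = np.random.rand(100).astype("float32")

# Network model: successive layers map the input data to predictions.
model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(600,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),  # one continuous output, i.e., a regression problem
])

# Loss function (mean squared error) and optimization method (Adam);
# backpropagation uses the loss as a feedback signal to adjust the weights.
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, y_train, epochs=5, batch_size=16)
```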


One important aspect of deep learning, or machine learning in general, is data preprocessing or input encoding to make raw data, such as peptide or protein sequences, more amenable to the models. Typically, all input and output variables are required to be numeric. MS/MS spectra can be simply discretized to produce an intensity vector.[10] For sequence-based data such as peptide and protein sequences, the sequence is first segmented into tokens (amino acids) and then each token is associated with a numeric vector. There are multiple ways to associate a vector with a token (Figure 1). One of the simplest and most widely used methods is called one-hot encoding, where each amino acid is represented by a unit binary vector of length n, containing a single one and n-1 zeros (e.g., [1,0,0, …, 0] for one amino acid and [0,1,0, …, 0] for another amino acid). This solution treats all amino acids equally without using any prior knowledge. Another approach is to use the BLOcks SUbstitution Matrix (BLOSUM) for encoding, representing each amino acid by its corresponding row in the BLOSUM matrix.[11] Instead of treating all amino acids independently, the BLOSUM matrix, derived from protein sequence alignments, keeps the evolutionary information about which pairs of amino acids are easily interchangeable during evolution. This information may be useful in certain applications such as MHC-peptide binding prediction. Another way to encode amino acid sequences is the use of dense numeric vectors, also called word embedding, which is widely used in natural language processing.[12] Unlike the sparse vectors obtained via one-hot encoding, where most elements are zero, these vectors could be learned from large unlabeled protein datasets, such as all sequences pulled from the UniProt database, in an unsupervised manner.[13] These vectors could also be learned jointly with the main task (e.g., RT prediction or MHC-peptide binding prediction) in the same way that the weights of the neural network of the main task are learned.[14] This type of encoding method has been demonstrated to be extremely useful in certain tasks.[12,14–16] Before encoding a sequence as dense numeric vectors, the sequence is typically represented as an integer vector in which each token is represented by a unique integer. The final method is to design handcrafted features and then take these features as input for modeling. This is the most common method used in traditional machine learning and is different from the previous three methods, in which handcrafted feature engineering is typically not required.
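To make the first and third schemes concrete, the sketch below one-hot encodes a peptide and also produces the integer vector that would feed an embedding layer (an illustration only, not code from any of the tools discussed in this review):

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_TO_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_encode(peptide: str) -> np.ndarray:
    """Each residue becomes a unit binary vector of length 20."""
    matrix = np.zeros((len(peptide), len(AMINO_ACIDS)), dtype=np.float32)
    for pos, aa in enumerate(peptide):
        matrix[pos, AA_TO_INDEX[aa]] = 1.0
    return matrix

def integer_encode(peptide: str, max_len: int = 30) -> np.ndarray:
    """Integer vector (1-based; 0 is reserved for padding) for an embedding layer."""
    encoded = np.zeros(max_len, dtype=np.int64)
    for pos, aa in enumerate(peptide):
        encoded[pos] = AA_TO_INDEX[aa] + 1
    return encoded

print(one_hot_encode("PEPTIDE").shape)  # (7, 20)
print(integer_encode("PEPTIDE"))        # [13  4 13 17  8  3  4  0 ... 0]
```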
The behavior of a neural network is largely shaped by its network architecture. A network's architecture can generally be characterized by: 1) the number of neurons in each layer, 2) the number of layers, and 3) the types of connections between layers. The most well-known architectures include deep neural networks (DNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs) (Figure 1). In this review, DNNs refer to networks that consist of an input layer, multiple hidden layers, and an output layer, in which nodes from adjacent layers are fully connected with each other. CNNs mainly consist of convolutional layers and pooling layers, frequently followed by a number of fully connected layers. One of the key processes of CNNs is to slide a filter over the input (such as an image or a sequence), where different filters can capture different patterns in the input data. CNNs have been widely used in the analysis of medical image data and have also been applied to DNA and protein sequence data.[4] Unlike CNNs, RNNs process an input sequence one element at a time by using recurrent and cyclic connection units, and the output for each step depends not only on the current element but also on previous elements. RNNs can capture long-range interactions within the sequence and are well-suited to model sequential data such as DNA or protein sequences. For example, if an input sequence is a peptide or protein sequence, each element could be an amino acid.

Conventional RNNs typically suffer from what are called the vanishing and exploding gradient problems when the sequence is very long.[17] Although an RNN is theoretically capable of retaining the information about inputs seen many time steps earlier at time step t, in practice such long-term dependencies are difficult to learn. This happens when the gradients used to update the weights become extremely small or large, and thus either do not contribute to the learning process or render the model too unstable for continued learning. In other words, the RNNs become untrainable. To overcome this, novel network architectures such as long short-term memory units (LSTMs)[18] and gated recurrent units (GRUs)[19] were proposed. They have internal mechanisms called gates that can regulate the flow of information. These gates can learn which data in a sequence are important to keep or discard, thus preventing older signals from gradually vanishing or exploding during processing. To allow RNNs to have both backward and forward information about the sequence at every time step, two independent RNNs can be used together to form a new network called a bidirectional RNN (BiRNN). The input sequence is fed in normal order to one RNN, and in reverse order to the other one. The outputs of the two RNNs are then concatenated at each step. If LSTMs or GRUs are employed, the network is called a bidirectional long short-term memory (BiLSTM) or bidirectional gated recurrent unit (BiGRU) network, respectively.
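As an illustration of how such a bidirectional recurrent model is assembled in practice, the following generic Keras sketch builds a BiLSTM over integer-encoded peptides (layer sizes and the single regression output are arbitrary choices for this example, not the architecture of any specific tool):

```python
from tensorflow import keras
from tensorflow.keras import layers

MAX_LEN = 30      # fixed peptide length after zero-padding
VOCAB_SIZE = 21   # 20 amino acids plus one padding index

# The embedding layer turns each integer-encoded residue into a dense
# vector; Bidirectional wraps the LSTM so that the sequence is read in
# both directions and the outputs of the two passes are concatenated.
model = keras.Sequential([
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=32, mask_zero=True),
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(1),  # e.g., a single regression output such as an RT
])
model.compile(optimizer="adam", loss="mse")
model.build(input_shape=(None, MAX_LEN))
model.summary()
```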
Other, newer network architectures are being continuously developed. For example, capsule networks (CapsNets)[20] group neurons in each layer into multiple capsules, allowing better modeling of hierarchical relationships inside a neural network. A deep learning algorithm may also combine different types of architectures in one network. For example, combining a CNN and an RNN (LSTM or GRU) in one network can leverage the strengths of both architectures to achieve better performance than using either one alone.

Deep learning has already been used in a number of proteomics applications (Figure 1), in which the overall workflow described above is generally applicable. However, individual tasks may require additional customization.

3. Deep Learning for Retention Time Prediction

In MS-based proteomics experiments, peptide mixtures are typically separated via an LC system prior to analysis by MS. The retention time of a peptide refers to the time point when the peptide elutes from the LC column in an LC-MS/MS system, which is recorded by the instrument. The retention time of a peptide is determined by the degree of the peptide's interaction with the stationary and mobile phases of the LC system. The retention time of peptides is highly reproducible under the same LC conditions.


Table 1. List of deep learning-based retention time prediction tools.

| No. | Software | Framework | Core network model | Input encoding | Usability a) | Year | Reference |
|-----|----------|-----------|--------------------|----------------|--------------|------|-----------|
| 1 | DeepRT | PyTorch | CNN | Word embedding | O,C,P,T | 2018 | [44] |
| 2 | Prosit | Keras/TensorFlow | RNN | Word embedding | O,C,W,P,T | 2019 | [29] |
| 3 | DeepMass | Keras/TensorFlow | RNN | One-hot | - | 2019 | [45] |
| 4 | Guan et al. | Keras/TensorFlow | RNN | One-hot | O,C,P,T | 2019 | [46] |
| 5 | DeepDIA | Keras/TensorFlow | CNN+RNN | One-hot | O,C,P,T | 2020 | [30] |
| 6 | AutoRT | Keras/TensorFlow | CNN+RNN | One-hot | O,C,P,T | 2020 | [25] |
| 7 | DeepLC | TensorFlow | CNN | One-hot, global features, amino/diamino acid composition | O,G,C,P,T | 2020 | [47] |

a) O, open-source; G, graphical user interface; C, command line; P, provides trained model for prediction; W, web interface; T, provides option for model training. The link for each tool can be found at https://github.com/bzhanglab/deep_learning_in_proteomics.

Accurately predicted retention times have several applications in MS-based proteomics, including 1) improving sensitivity of peptide identification in database searching,[21–24] 2) serving as a quality evaluation metric for peptide identification,[25–28] 3) building spectral libraries for DIA data analysis,[29–33] and 4) facilitating targeted proteomics experiments.

Studies of peptide RT prediction can be traced back to the 1980s,[34,35] with studies continuing to focus on improving RT prediction to this day.[21,25,29,36–39] Methods for peptide RT prediction can be divided into two primary categories: index-based methods, such as SSRCalc,[40,41] and machine learning-based methods. Machine learning-based methods can be further divided into two subgroups: traditional machine learning-based methods, including Elude[42,43] and GPTime,[38] and deep learning-based methods, including DeepRT,[44] Prosit,[29] DeepMass,[45] Guan et al.,[46] DeepDIA,[30] AutoRT,[25] and DeepLC.[47] As shown in Table 1, deep learning-based tools can be divided into three groups based on the type of neural network architecture used: RNN-based, CNN-based, and hybrid networks, with RNN as the dominant architecture because it was developed for sequential data modeling. Several of these tools also have a separate module for MS/MS spectrum prediction (see next Section).

Prosit is a representative tool of the RNN-based group. In Prosit, a peptide sequence is represented as a discrete integer vector of length 30, with each non-zero integer mapping to one amino acid and padded with zeros for sequences shorter than 30 amino acids. The padding operation forces all encoded peptides to have the same length. The deep neural network for RT prediction in Prosit consists of an encoder and a decoder. The encoder encodes the input peptide sequence data into a latent representation, whereas the decoder decodes the representation to predict RT. The peptide encoder consists of an embedding layer, a BiGRU layer, a recurrent GRU layer, and an attention layer.[48] The learned representation of the input peptides captures the intrinsic relations of different amino acids. The decoder connects the latent representation learned from the encoder to a dense layer to make predictions. Prosit was shown to outperform SSRCalc and Elude for RT prediction in the original study.[29] The RT prediction method proposed in DeepMass is also based on an RNN architecture. DeepMass uses one-hot encoding for peptide sequence representation, and the network includes a BiLSTM layer and another LSTM layer followed by two dense layers. DeepMass was compared to SSRCalc in the original study and showed superior performance.[45] The RT model proposed by Guan et al.[46] is similar to DeepMass; however, it uses two BiLSTM layers, and a masking layer is used to discard padding sequences during training and prediction.

Both DeepRT and DeepLC use CNN-based architectures, with DeepRT specifically using a CapsNet, which is a variant of CNN. Similar to Prosit, DeepRT includes an embedding layer as the first layer of the neural network. In contrast, DeepLC uses a standard CNN framework. A unique feature of DeepLC, compared with all other tools in Table 1, is the ability to predict RT for peptides with modifications that are not present in the training data. This is mainly achieved by using a new peptide encoding based on atomic composition. Specifically, each peptide is encoded as a matrix with a dimension of 60 for the peptide sequence by 6 for the atom counts (C, H, N, O, P, and S). A peptide shorter than 60 amino acids is padded with the character "X", which has no atomic composition, to the full length of 60. For modified amino acids, the atomic composition of the modification is added to the atomic composition of the unmodified residue. In addition to this encoding, a peptide is further encoded in three additional ways to capture other position-specific information and global information. The four encoding results are fed into the network through different paths. The last part of the network consists of six connected dense layers, which take as input the outputs from the previous paths. DeepLC showed comparable performance to the state-of-the-art RT prediction algorithms for unmodified peptides and achieved similar performance for unseen modified peptides to that for unmodified peptides.[47]
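A minimal sketch of this atomic-composition encoding is shown below (residue formulas are listed for only a few amino acids, and the modification table is illustrative; the real DeepLC encoding covers all residues and modifications):

```python
import numpy as np

ATOMS = ["C", "H", "N", "O", "P", "S"]
# Atom counts of amino acid residues (a small subset for illustration).
RESIDUE_COMPOSITION = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
}
# A modification adds its own atoms to the residue, e.g., phospho = HPO3.
MOD_COMPOSITION = {"Phospho": {"H": 1, "P": 1, "O": 3}}

def encode_atomic(peptide, mods=None, max_len=60):
    """Encode a peptide as a (max_len x 6) matrix of per-residue atom counts.

    `mods` maps a 0-based position to a modification name. Positions beyond
    the peptide length stay all-zero, mimicking the "X" padding character.
    """
    matrix = np.zeros((max_len, len(ATOMS)), dtype=np.float32)
    for pos, aa in enumerate(peptide):
        counts = dict(RESIDUE_COMPOSITION[aa])
        if mods and pos in mods:
            for atom, n in MOD_COMPOSITION[mods[pos]].items():
                counts[atom] = counts.get(atom, 0) + n
        for j, atom in enumerate(ATOMS):
            matrix[pos, j] = counts.get(atom, 0)
    return matrix

print(encode_atomic("GAS", mods={2: "Phospho"})[:4])  # rows: G, A, S+HPO3, padding
```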
Other RT prediction models, such as DeepDIA and a model we developed called AutoRT, combine both CNN and RNN in the same network. In DeepDIA, one-hot encoded peptide sequences are fed into a CNN network, which is followed by a BiLSTM network. AutoRT uses a similar strategy to combine CNN and RNN networks, but GRU rather than LSTM is used. One unique feature of AutoRT is the use of a genetic algorithm to enable automatic deep neural network architecture search (NAS), through which the ten best-performing models are identified and ensembled for RT prediction. NAS is a fast-growing research area, and architectures from NAS have been demonstrated to be on par with or outperform hand-designed architectures in many tasks.[49,50] Another feature of AutoRT is the use of transfer learning. Specifically, base models are trained using a large public dataset (>100 000 peptides), and then the trained base models are fine-tuned using data from an experiment of interest to develop experiment-specific models.


By leveraging large public datasets, transfer learning makes it possible to obtain a highly accurate model even with a small amount of experiment-specific training data (≈700 peptides). This is very useful because only a few thousand peptides may be identified in a single run in many experiments.[25]
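The following sketch illustrates this fine-tuning pattern in Keras (a generic sketch, not AutoRT's actual code; the model file name and the choice of frozen layers are hypothetical):

```python
from tensorflow import keras

# Load a base model pre-trained on a large public RT dataset
# ("base_rt_model.h5" is a hypothetical file name).
base_model = keras.models.load_model("base_rt_model.h5")

# Freeze the early feature-extraction layers so that only the last layers
# adapt to the new chromatographic conditions.
for layer in base_model.layers[:-2]:
    layer.trainable = False

# Fine-tune with a small learning rate on a few hundred experiment-specific
# peptides (x_small: encoded peptides, y_small: observed RTs).
base_model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
                   loss="mse")
# base_model.fit(x_small, y_small, epochs=20, batch_size=32)
```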
Accurate RT predictions from deep learning models have led to promising applications. For example, we used the difference (ΔRT) between AutoRT-predicted RT and experimentally observed RT for each identified peptide as an evaluation metric for comparing different quality control strategies for variant peptide identification.[25] The evaluation results provide insights and practical guidance on the selection of quality control strategies for variant peptide identification. Similarly, Li et al.[51] used ΔRT derived from AutoRT prediction as a feature to rescore peptide spectrum matches (PSMs) in the analysis of immunopeptidomics data. Interestingly, rescoring with AutoRT led to significantly improved sensitivity of peptide identification, while rescoring with the ΔRT feature derived from the traditional machine learning-based tool GPTime only showed minor improvement.[51] Deep learning-based RT prediction can also be used together with MS/MS spectrum prediction to build an in silico spectral library for DIA data analysis, as demonstrated in a few recent studies.[30–32] Deep learning-based RT prediction has not been used in any published targeted proteomics studies, but we expect this to change in the near future.

Although significant improvement has been made for peptide RT prediction using deep learning, RT prediction for peptides with modifications remains a major challenge. Some existing models consider a few common artifactual modifications, such as oxidation of methionine.[25,46] In these models, modified and unmodified amino acids are processed equally. These models can be used to predict RT for peptides containing these modifications, but the prediction errors are likely to be higher than those for peptides without modification due to the relatively low frequency of modified amino acids in the training data. DeepLC is the only model that can predict RT for peptides containing modifications not present in the training data. However, the performance of RT prediction for modifications that are chemically very different from anything encountered in the training set, such as phosphorylation, is clearly lower than that for others. Moreover, peptide encoding based on atomic composition cannot differentiate between isomeric structures that are physicochemically different. RT prediction for peptides with complicated modifications such as glycosylation is even more difficult; no deep learning-based tool has been reported to predict RTs for intact glycosylated peptides yet. Thus, new training strategies or deep learning networks are needed to improve RT prediction for peptides with modifications. Moreover, all existing deep learning-based tools are developed for RT prediction of linear peptides. They cannot be used for RT prediction of cross-linked peptides generated using cross-linking mass spectrometry, in which two peptides are typically connected to form a cross-linked peptide. RT prediction of cross-linked peptides using deep learning will require the design of new frameworks as well as new peptide encoding methods. It may also be difficult to generate enough cross-linked peptides for model training.

4. Deep Learning for MS/MS Spectrum Prediction

In a typical MS/MS-based proteomics experiment, hundreds of thousands of MS/MS spectra can be generated. Information in an MS/MS spectrum generated from bottom-up proteomics consists of mass-to-charge ratios (m/z) and intensities of a set of fragment ions generated from digested peptides using methods like collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), or electron-transfer dissociation (ETD).[52] The patterns of an MS/MS spectrum for a peptide (the m/z and intensities of fragment ions, and their types) are mainly determined by a few key factors, including: 1) the type of MS instrument as well as the fragmentation method (e.g., CID, HCD, or ETD) used to fragment peptides and its settings, such as normalized collision energy (NCE), 2) the peptide sequence, and 3) the precursor charge state of the peptide.[29,53] Peptide identification relies primarily on the patterns of these fragment ions. Although the mechanism underlying peptide fragmentation is complicated and still not well understood, these patterns are reproducible and, in general, predictable, as demonstrated by many studies.[54–57]

A number of tools have been developed to predict MS/MS spectra from peptide sequences. These methods can be divided into hypothesis-driven methods and data-driven methods. Several hypothesis-driven algorithms have been developed based on the mobile proton hypothesis, which is a widely accepted hypothesis to study peptide fragmentation pathways in tandem mass spectrometry.[58–61] MassAnalyzer is a popular tool in this category.[58] Data-driven methods, or more generally machine learning-based methods, include traditional machine learning-based tools, such as PeptideART,[55,62] MS2PIP,[63–65] MS2PBPI,[66] and other tools,[67,68] and deep learning-based tools as shown in Table 2, such as pDeep,[57,69] Prosit,[29] DeepMass:Prism,[45] MS2CNN,[70] DeepDIA,[30] Predfull,[56] and the model proposed by Guan et al.[46] (Figure 2). Deep learning models have been demonstrated to outperform both traditional machine learning models and hypothesis-driven methods. The spectra predicted by deep learning models are highly similar to the experimental spectra. Remarkably, the similarities between deep learning-predicted spectra and corresponding experimental spectra are very close to the average similarities between replicated experimental spectra for the same peptides.[56,57]

pDeep consists of two BiLSTM layers followed by a time-distributed fully connected output layer; it takes a one-hot encoded peptide sequence and the corresponding precursor charge state of the peptide as inputs and outputs intensities of different fragment ion types at each position along the input peptide sequence[57] (Figure 2). pDeep was first developed based on built-in static LSTM APIs in Keras that only accept an input peptide sequence with a predefined fixed length (20 in the original paper). When a peptide sequence is shorter than the predefined length, "zeros" are padded into the sequence and masked by a masking layer. On the other hand, peptides that are longer than the predefined length will be discarded with no prediction. pDeep2 improves the original version by using the dynamic BiLSTM API in TensorFlow, which dynamically unrolls the recurrent cell based on the length of the input sequences to overcome the length limitation.[69] Dynamic LSTM can also avoid the calculation for extra "zero"-padded sequences, which potentially improves the prediction speed.


Table 2. List of deep learning-based MS/MS spectrum prediction tools. The fragment ion types supported by each tool are summarized based on its original publication and available trained models.

| No. | Software | Framework | Core network model | Fragment ion type | Usability a) | Year | Reference |
|-----|----------|-----------|--------------------|-------------------|--------------|------|-----------|
| 1 | pDeep/pDeep2 | Keras/TensorFlow | RNN | b/y; c/z | O,C,P,T | 2017/2019 | [57, 69] |
| 2 | Prosit | Keras/TensorFlow | RNN | b/y | O,C,W,P,T | 2019 | [29] |
| 3 | DeepMass:Prism | Keras/TensorFlow | RNN | b/y | W | 2019 | [45] |
| 4 | Guan et al. | Keras/TensorFlow | RNN | b/y | O,C,P,T | 2019 | [46] |
| 5 | MS2CNN | Keras/TensorFlow | CNN | b/y | O,C,P | 2019 | [70] |
| 6 | DeepDIA | Keras/TensorFlow | CNN+RNN | b/y | O,C,P,T | 2020 | [30] |
| 7 | Predfull | TensorFlow | CNN | All possible ions at all m/z values | O,C,W,P,T | 2020 | [56] |

a) O, open-source; C, command line; P, provides trained model for prediction; W, web interface; T, provides option for model training. The link for each tool can be found at https://github.com/bzhanglab/deep_learning_in_proteomics.

In order to predict MS/MS spectra for modified peptides without sufficient training data, pDeep2 uses transfer learning to train PTM models on top of the base model developed for unmodified peptides. The prediction performance for modified peptides is comparable to that for unmodified peptides. In pDeep2, a modification is represented as a feature vector of length eight based on its chemical composition (e.g., the chemical composition of phosphorylation, which often occurs on serine (S), threonine (T), or tyrosine (Y), is HPO3; thus it is encoded as the feature vector [1,0,0,3,0,1,0,0]). This is similar to how modifications are encoded in DeepLC. With this encoding scheme, pDeep2 models can be used to predict spectra for peptides with modifications that are not present in the training data. However, the prediction performance for those peptides is very low without using transfer learning. In addition to one-hot encoded peptide sequences, associated feature vectors of modifications, and corresponding precursor charge states of the peptides, other associated metadata including the instrument type and the collision energy are also encoded as inputs. Including peptide-associated metadata in the modeling process allows the application of the resulting models to different MS instruments and settings, thus avoiding the need to train models for each combination of MS experiment parameters. The model used by Guan et al.[46] for MS/MS spectrum prediction is similar to pDeep except for slightly different input and output structures.
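A sketch of this composition-based modification encoding is shown below. The element order is an assumption chosen to reproduce the published [1,0,0,3,0,1,0,0] example for phosphorylation (HPO3); the order actually used by pDeep2 may differ:

```python
# Assumed element order ([H, C, N, O, S, P] plus two spare slots); this is
# an illustration, not pDeep2's exact specification.
ELEMENT_ORDER = ["H", "C", "N", "O", "S", "P", "Na", "Cl"]

def encode_modification(composition: dict) -> list:
    """Turn a modification's chemical composition into a length-8 count vector."""
    return [composition.get(element, 0) for element in ELEMENT_ORDER]

print(encode_modification({"H": 1, "P": 1, "O": 3}))  # phospho -> [1, 0, 0, 3, 0, 1, 0, 0]
print(encode_modification({"C": 2, "H": 2, "O": 1}))  # acetyl  -> [2, 2, 0, 1, 0, 0, 0, 0]
```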
Both Prosit and DeepMass:Prism are also BiRNN-based networks. Prosit uses a BiGRU network, whereas DeepMass:Prism uses a BiLSTM network. Similar to pDeep2, peptide sequences along with associated metadata are encoded as input. A peptide sequence is encoded using one-hot encoding in DeepMass:Prism, whereas it is represented as a discrete integer vector feeding into an embedding layer of the network in Prosit. Both tools use a fixed length of peptide encoding. In other words, the trained models from the two tools cannot make predictions for any peptides with a length exceeding the longest peptide in the training data.

MS2CNN is based on CNN rather than RNN (LSTM or GRU). A single CNN model based on the network structure of LeNet-5[71] is constructed to predict MS/MS spectra for peptides of a specific length and precursor charge state. Unlike the above models, MS2CNN uses handcrafted features of peptides as input instead of learning peptide representations directly from peptide sequences. The features used in MS2CNN include peptide composition (similar to amino acid composition), mass-to-charge ratio (m/z), and peptide physicochemical properties such as isoelectric point, instability index, aromaticity, secondary structure fraction, helicity, hydrophobicity, and basicity. Because peptide-associated metadata are not used in the modeling, the models can only be applied to data generated under matched experimental conditions.

DeepDIA uses a hybrid CNN and BiLSTM network for MS/MS spectrum prediction. This model is similar to the one used for RT prediction in DeepDIA. A peptide sequence is encoded using one-hot encoding. Separate models are required to be trained for different MS conditions and peptide precursor charge states.

All of the aforementioned methods aim to predict the intensities of expected backbone fragment ion types (e.g., b/y ions for CID and HCD spectra, c/z ions for ETD spectra, as well as their associated neutral losses). However, besides the backbone fragment ions, MS/MS spectra can contain many additional fragment ions that are derived from peptide fragmentation rather than background noise.[56,72] These fragment ions are typically ignored in spectrum annotation and PSM scoring. A recent study showed that these fragment ions could account for ≈30% of total ion intensities in HCD spectra.[56] Some of the ignored ions with high intensity may be informative and thus can be used to improve peptide identification. Predfull utilizes a generalized sequence-to-sequence model based on the structure of the residual CNN and a multitask learning strategy to predict the intensities for all possible m/z from peptide sequences, without assumptions or expectations on which kinds of ions to predict.[56] Each MS/MS spectrum in the training data is represented as a sparse 1D vector by binning the m/z range between 180 and 2000 with a given bin width, so that all the peaks in an MS/MS spectrum are used in the training. This is fundamentally different from other tools, in which only the annotated backbone ions are used for training. In addition, a multitask learning strategy is used in Predfull to improve the prediction accuracy for spectra with insufficient training data (e.g., 1+ and 4+ HCD spectra and ETD spectra of all charges). Predfull showed better performance than the backbone-only spectrum predictors (pDeep, Prosit, and DeepMass:Prism).
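Representing a spectrum as a binned intensity vector is straightforward; the sketch below illustrates the idea (the bin width and the max-aggregation rule are arbitrary choices here, not Predfull's exact settings):

```python
import numpy as np

def bin_spectrum(mz, intensity, mz_min=180.0, mz_max=2000.0, bin_width=0.1):
    """Discretize an MS/MS spectrum into a sparse 1D intensity vector.

    Peaks falling into the same bin are aggregated by taking the maximum.
    """
    n_bins = int(round((mz_max - mz_min) / bin_width))
    vector = np.zeros(n_bins, dtype=np.float32)
    for m, i in zip(mz, intensity):
        if mz_min <= m < mz_max:
            idx = int((m - mz_min) / bin_width)
            vector[idx] = max(vector[idx], i)
    return vector

# Toy spectrum with three peaks; the first two fall into the same bin.
vec = bin_spectrum(mz=[250.12, 250.16, 980.55], intensity=[1.0, 0.4, 0.8])
print(vec.shape, vec.nonzero()[0])  # (18200,) [ 701 8005]
```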
Accurately predicted MS/MS spectra from peptide sequences have promising applications. First, they can be used to improve protein identification in DDA data analysis.


Figure 2. Brief network architectures of the deep learning tools for MS/MS spectrum prediction. FC layer refers to the fully connected layer, BiLSTM
refers to bidirectional LSTM, and BiGRU refers to bidirectional GRU. For different models, metadata may include precursor charge state, precursor mass,
collision energy, instrument type, etc. “∼” is the cleavage site.

For database searching, the predicted MS/MS spectra can be used either in the scoring of PSMs by a search engine[45] or in PSM rescoring using post-processing tools such as Percolator.[29,51,73] For spectral library searching, accurately predicted MS/MS spectra can lead to comprehensive, high-quality spectral libraries. For de novo peptide sequencing, deep learning-based MS/MS spectrum prediction could be useful in ranking candidate peptides.[74]

Next, predicted MS/MS spectra combined with RT prediction can be used to build a spectral library in silico for DIA data analysis or for method development in targeted proteomics experiments (e.g., MRM or PRM experiments). A spectral library mainly contains the peptide RTs and the peptide fragment ions and their intensities, and both can be predicted accurately using deep learning methods. Traditionally, such a spectral library is built based on peptide identifications from conventional DDA experiments, which often involve offline pre-fractionation of peptide samples to improve the coverage of the library.[75–77] Therefore, it requires extra instrument time and cost to generate such a library for DIA data analysis because of the complex mixtures of peptides in DIA MS2 scans. Moreover, such a library still suffers from the limitations of DDA experiments for peptide identification.


In addition, generating such a library for peptides with PTMs such as phosphorylation is challenging. Recently, a few studies have demonstrated the potential of deep learning-based MS/MS spectrum prediction in DIA library generation.[29–32,78] We expect in silico spectral library generation using deep learning to become increasingly popular in DIA data analysis. In targeted proteomics experiments, the predicted spectral library is especially useful to guide method development (e.g., transition list design in MRM assays) for detecting proteins with low abundance or novel proteins that are typically difficult to detect in DDA experiments.

Finally, deep learning-based MS/MS spectrum prediction can enhance our understanding of the principles behind peptide fragmentation. For example, Tiwary et al.[45] reported that the outputs from DeepMass:Prism can indicate the fragmentation efficiencies between different amino acid pairs. In addition, in order to study the influence between each amino acid in the peptide sequence and the predicted intensity of each peak, the method of integrated gradients[79] was used to attribute predictions from DeepMass:Prism to specific input amino acids. Zhou et al.[57] showed that accurate prediction of a spectrum by deep learning enables discrimination of isobaric amino acids, such as I versus L, GG versus N, AG versus Q, KR versus RK, etc. Furthermore, Guan et al.[46] showed that the discriminative power for isomeric peptides is higher when isobaric amino acid-related local ion similarities are considered.

Each tool has its own strengths and weaknesses. Comprehensive independent benchmarking of existing tools is essential to guide the selection of methods for real applications. Recently, Xu et al.[80] benchmarked three deep learning-based tools (Prosit, pDeep2, and Guan's work[46]) and one traditional machine learning-based tool (MS2PIP) for MS/MS spectrum prediction. The results showed that the deep learning-based tools outperform MS2PIP and that the performance of deep learning-based tools may vary across different datasets and different peptide precursor charge states.

Although significant improvements have been made for MS/MS spectrum prediction from peptide sequences using deep learning, there is still much room for improvement in the prediction for peptides with modifications. Most of the current MS/MS spectrum prediction models are mainly developed for unmodified peptides, and most of the existing models cannot be directly used to predict peptides with modifications not present in the training data. Although some of the current tools can be trained with peptides containing modifications of interest, specific training strategies such as transfer learning are required to achieve satisfactory prediction performance because of the small size of available training data with specific modifications. This situation is similar to RT prediction for peptides with modifications. In pDeep2, Zeng et al.[69] have demonstrated that the transfer learning strategy can significantly improve the prediction for modified peptides with limited training examples. However, transfer learning is only well supported in pDeep2. Furthermore, for some PTMs like glycosylation, the prediction of MS/MS spectra for intact glycopeptides would be more challenging due to the complexity of intact glycopeptides and the lack of large experimental high-quality MS/MS spectra from intact glycopeptides. In addition, all current deep learning-based tools are developed for MS/MS spectrum prediction of single peptides, in which a predicted spectrum corresponds to a single linear peptide. As with RT prediction, MS/MS spectrum prediction for cross-linked peptides will require new frameworks and new peptide encoding methods.

5. Deep Learning for De Novo Peptide Sequencing

Another breakthrough application of deep learning in the field of proteomics is de novo peptide sequencing, as demonstrated in DeepNovo.[10] In de novo peptide sequencing, the peptide sequence is directly inferred from an MS/MS spectrum without relying on a protein database. If we regard an MS/MS spectrum as an image and the peptide sequence as an image description, de novo peptide sequencing bears some similarity to deep learning-based image captioning,[81,82] which is the task of generating a description in a specific language for a given image. The encoder-decoder architecture is one of the widely used architectures in deep learning-based image captioning, where an image-CNN layer is typically used to encode the image into a hidden representation, and an RNN (e.g., LSTM) layer is used to decode and predict the words one by one to form sentences of a language (Figure 3A). DeepNovo views the input spectrum as an image and the output peptide sequence as a sentence of a protein language. More specifically, DeepNovo first discretizes a spectrum into an intensity vector with length 500 000 (high-resolution data, 0.01 Da per pixel/bin, up to 5000 Da) or 50 000 (low-resolution data, 0.1 Da per pixel/bin, up to 5000 Da). It then uses a spectrum-CNN as an encoder for the intensity vector, and an LSTM as a decoder. In order to capture amino acid signals in the spectrum, the intensity vector is further processed using an amino acid-mass shift operation before being fed into the spectrum-CNN. The outputs of the spectrum-CNN are then passed into the decoder. The LSTM model aims to predict the probabilities of all considered amino acids at each position of a peptide sequence. Specifically, at position t, the LSTM takes the previously predicted amino acid and the previous hidden state at position t − 1 as input to predict the probabilities of the next amino acids. In addition, an ion-CNN model is used to learn features of fragment ions in a spectrum; its outputs are combined with those of the LSTM model, and the combined model is run step by step, starting from an empty sequence and ending with the full peptide sequence. Because the model has no way to restrict the mass of the predicted peptide sequence within the tolerance window of the precursor mass, DeepNovo uses the knapsack dynamic programming algorithm to quickly filter out any "precursor-unreachable" amino acid prediction at position t, and only considers "precursor-reachable" predictions (informally, a prediction at any position is "precursor-unreachable" if the predicted subsequence can never reach the precursor mass by considering any amino acid combinations; otherwise it is "precursor-reachable"). After combining the spectrum-CNN, LSTM, ion-CNN, and knapsack, DeepNovo outperforms PEAKS,[83] Novor,[84] and PepNovo[85] in terms of recall on both the peptide and amino acid levels, showing the extraordinary ability of DeepNovo.
tra for intact glycopeptides would be more challenging due to DeepNovo was later extended to DeepNovo-DIA to perform
the complexity of intact glycopeptides and the lack of large exper- de novo sequencing on DIA data.[86] The basic framework of
imental high quality MS/MS spectra from intact glycopeptides. DeepNovo-DIA is similar to DeepNovo. In DIA data, there are
In addition, all current deep learning-based tools are developed multiple MS/MS spectra associated with a given precursor ion,


Figure 3. From image captioning to DeepNovo. A) A typical neural network architecture of image captioning. B) The neural network architectures of
DeepNovo and DeepNovo-DIA.

DeepNovo-DIA stacks these spectra along the retention time dimension to form a 2D intensity vector. For a given precursor ion, besides its associated MS/MS spectra, its MS1 intensity profile is also encoded and fed into the network. The correlation between the precursor and its fragment ions can be learned in the ion-CNN module of DeepNovo-DIA. For a traditional de novo sequencing algorithm, it is not an easy task to redesign the algorithm to support DIA data analysis due to the high complexity of DIA spectra. In contrast, data-driven DeepNovo can perform DIA data analysis after only redesigning part of the architecture to utilize the extra dimensionality of DIA data (m/z and retention time). However, validating the results of de novo sequencing from DIA data is quite difficult and is still an open problem.

Recently, Karunratanakul et al.[87] developed SMSNet to further improve de novo peptide sequencing using deep learning. The deep learning architecture of SMSNet is similar to that used in DeepNovo. A key innovation in SMSNet is the use of the multi-step Sequence-Mask-Search strategy. For traditional de novo sequencing algorithms, Muth et al. showed that most incorrect peptide predictions stem from locally incorrect short subsequences.[88] Local incorrectness is also a problem in deep learning-based de novo sequencing. The multi-step Sequence-Mask-Search strategy addresses this problem. More specifically, SMSNet uses a rescoring network after the encoder-decoder network to estimate the confidence score of each amino acid of the predicted sequence. If low-confidence local amino acids are detected, SMSNet corrects them by querying a protein sequence database, resulting in higher sequencing accuracies. It has been shown that this strategy can effectively improve the accuracy of SMSNet. Since it relies on the protein sequence database, the performance of SMSNet may be limited by the quality and completeness of the database provided. SMSNet outperforms DeepNovo on a few different datasets and has shown promising application in immunopeptidomics.[87]

The purely data-driven deep learning model is not the only way to improve the performance of de novo peptide sequencing. By considering the predicted spectra based on deep learning, pNovo3 re-ranks the peptide candidates generated by pNovo+ (a spectrum-graph and dynamic programming-based algorithm[89]) using a learning-to-rank framework, leading to higher accuracies than DeepNovo.[74] This example shows that, coupled with deep learning, traditional de novo sequencing can be improved as well. In conclusion, deep learning has opened a new perspective for de novo peptide sequencing.

Clear improvement has been achieved using deep learning compared with previous de novo peptide sequencing methods. This has led to a few promising applications of de novo peptide sequencing, including complete de novo sequencing of antibody sequences and discovery of new human leukocyte antigen (HLA) antigens.[10,87,90] However, there is still a huge gap between de novo peptide sequencing and database search-based peptide identification methods in terms of accuracy of peptide identification, which limits its applications in proteomics studies. Further improvement could be achieved from, but is not limited to, the following aspects. First, training using larger datasets may further improve the models.[91]


The second aspect is training species-specific models using transfer learning by leveraging large datasets from other species. Since the protein sequences from different species may have different patterns, the deep learning models trained using MS/MS data from one species may not generalize well to another species, but the patterns and rules learned from other species with large datasets could benefit the training for a specific species with a relatively small dataset. The third aspect is extensively optimizing the current deep learning architectures using hyperparameter tuning methods or designing more efficient architectures using neural architecture search algorithms.

6. Deep Learning for Post-Translational Modification Prediction

Over 300 types of PTMs are known to occur physiologically across different proteins.[92] PTMs tremendously increase the complexity of cellular proteomes, diversify protein functions, and play important roles in many biological processes.[93,94] PTMs can be experimentally identified in both low-throughput experiments and high-throughput MS-based experiments.[95] In addition, computational algorithms can also be used to predict PTM sites. Machine learning is the primary approach used for PTM prediction because of its flexibility and performance. The prediction of a specific type of PTM site, such as phosphorylation, can be formulated as two classification tasks. The first, referred to as general site prediction, is to predict whether a given site can be modified, such as by being phosphorylated. The second is to predict whether a given site can be modified by a specific enzyme, such as a specific kinase for phosphorylation, referred to as enzyme-specific prediction.

Deep learning has been used in the prediction of PTM sites for phosphorylation,[96–100] ubiquitination,[101,102] acetylation,[103–106] glycosylation,[107] malonylation,[108,109] succinylation,[110,111] glycation,[112] nitration/nitrosylation,[113] crotonylation,[114] and other modifications,[115–117,224] as shown in Table 3. MusiteDeep, the first deep learning-based PTM prediction tool, provides both general phosphosite prediction and kinase-specific phosphosite prediction for five kinase families, each with more than 100 known substrates.[96] MusiteDeep uses one-hot encoding of a 33-amino acid sequence centered at the prediction site (i.e., 16 amino acids flanking each side of the site) as input, where phosphorylation sites on S, T, or Y annotated by UniProt/Swiss-Prot are used as positive data, whereas the same amino acids excluding annotated phosphorylation sites from the same proteins are regarded as negative data. The input data are fed into a multi-layer CNN for classification. For kinase-specific site prediction, transfer learning from the base general phosphosite model is used to train models for each kinase. The kinase-specific models make use of the general feature representations learned from the model developed on general phosphorylation data. This approach can also reduce possible overfitting caused by the limited numbers of kinase-specific substrates. In the original study, MusiteDeep was shown to outperform several non-deep learning-based tools for general site prediction, including Musite,[118] NetPhos 3.1,[119] ModPred,[120] and PhosPred-RF.[121] It also outperformed Musite, NetPhos 3.1, GPS 2.0,[122] and GPS 3.0[123] for kinase-specific prediction in most cases.
phosphorylation sites on S, T, or Y annotated by UniProt/Swiss- nase with a known protein sequence. In addition to requiring
Prot are used as positive data, whereas the same amino acid peptide sequences with the amino acids S, T, or Y in the cen-
excluding annotated phosphorylation sites from the same pro- ter as input, NetPhosPan also requires kinase domain sequences
teins are regarded as negative data. The input data are fed into to train a single model for kinase-specific predictions. Both pep-
a multi-layer CNN for classification. For kinase-specific site tide sequences and kinase domain sequences are encoded using
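To make this input representation concrete, the following is a minimal sketch of one-hot encoding a 33-residue window centered on a candidate site; the function name and the all-zero padding convention are illustrative assumptions rather than MusiteDeep's actual implementation.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard 20-residue alphabet
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_window(protein_seq, site, flank=16, pad="X"):
    """Encode a (2*flank+1)-residue window centered at `site` as a
    one-hot matrix of shape (window_length, 20). Positions that fall
    outside the protein are padded and left as all-zero rows."""
    start, end = site - flank, site + flank + 1
    window = "".join(
        protein_seq[i] if 0 <= i < len(protein_seq) else pad
        for i in range(start, end)
    )
    encoded = np.zeros((2 * flank + 1, len(AMINO_ACIDS)), dtype=np.float32)
    for pos, aa in enumerate(window):
        if aa in AA_INDEX:  # padding positions stay all-zero
            encoded[pos, AA_INDEX[aa]] = 1.0
    return encoded

# Example: encode the serine at position 10 (0-based) of a toy sequence
x = one_hot_window("MKTAYIAKQRSDFGHLLKWESTPHQK", 10)
print(x.shape)  # (33, 20), ready to be fed into a multi-layer CNN
```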
Another tool, EMBER, simultaneously predicts phosphosites for eight kinase families using deep learning.[99] In EMBER, a 15-amino acid sequence centered at the prediction site is first encoded as input not only using one-hot encoding but also using an embedding based on a Siamese neural network.[124] The Siamese network, which comprises two identical LSTM networks with identical hyperparameters as well as learned weights, is used to learn a semantically meaningful vector representation for each sequence. Each LSTM network takes a different peptide as input, and the two networks are joined at the final layer. The two types of encoded sequences are then fed into respective CNNs, which have identical hyperparameters. The CNNs are concatenated in the final layer, followed by a series of fully connected layers. The output is an eight-element vector, in which each value corresponds to the probability of the input site being phosphorylated by a kinase family. In order to leverage evolutionary relationships between kinase families in the modeling, a kinase phylogenetic metric is calculated and used via a kinase phylogeny-based loss function.
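Weight sharing is the defining property of a Siamese network: the same layer instance processes both inputs. The Keras sketch below illustrates this idea with assumed layer sizes and a simple squared-distance output; it is a generic illustration, not EMBER's published code.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

WINDOW, ALPHABET, EMBED_DIM = 15, 20, 32

# A single LSTM "tower" whose weights are shared between both branches;
# reusing one layer instance is what makes the network Siamese.
shared_lstm = layers.LSTM(EMBED_DIM)

seq_a = layers.Input(shape=(WINDOW, ALPHABET))
seq_b = layers.Input(shape=(WINDOW, ALPHABET))
emb_a, emb_b = shared_lstm(seq_a), shared_lstm(seq_b)

# Squared Euclidean distance between the two embeddings; a contrastive
# or similarity target would be supplied during training.
distance = layers.Lambda(
    lambda t: tf.reduce_sum(tf.square(t[0] - t[1]), axis=1, keepdims=True)
)([emb_a, emb_b])

siamese = Model(inputs=[seq_a, seq_b], outputs=distance)
siamese.summary()
```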
A common limitation of MusiteDeep, DeepPhos, and EMBER is that they only predict phosphorylation sites for a limited number of kinases with sufficient numbers of known substrate phosphosites. However, among the over 500 protein kinases described in the human proteome, only a small fraction have more than 30 annotated substrate phosphosites,[98,100,125] and more than 95% of the known phosphosites have no known upstream kinases.[125,126] In order to address this limitation, a few deep learning methods have been developed to enable the prediction of phosphorylation sites for kinases characterized by limited or no experimental data.[98,100] Inspired by the pan-specific method for MHC-peptide binding prediction (see next Section), Fenoy et al.[98] proposed a CNN framework, NetPhosPan, to develop a pan-kinase-specific prediction model that enables kinase-specific predictions for any kinase with a known protein sequence. In addition to requiring peptide sequences with the amino acids S, T, or Y in the center as input, NetPhosPan also requires kinase domain sequences to train a single model for kinase-specific predictions. Both peptide sequences and kinase domain sequences are encoded using the BLOSUM matrix. In this way, the model can leverage information between different kinases to enable predictions for kinases without known sites and improve predictions for kinases with few known sites.
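BLOSUM encoding replaces each residue with its row of substitution scores, so similar amino acids receive similar vectors. A minimal sketch using Biopython's bundled BLOSUM62 matrix is shown below; the window content is a toy example, and real tools may scale or normalize the scores.

```python
import numpy as np
from Bio.Align import substitution_matrices  # Biopython >= 1.75

BLOSUM62 = substitution_matrices.load("BLOSUM62")
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def blosum_encode(peptide):
    """Encode each residue as its row of BLOSUM62 scores against the
    20 standard amino acids, giving a (len(peptide), 20) matrix."""
    return np.array(
        [[BLOSUM62[aa, other] for other in AMINO_ACIDS] for aa in peptide],
        dtype=np.float32,
    )

# A 21-residue window with the candidate phosphosite (S) in the center
x = blosum_encode("AAAAAAAAAASAAAAAAAAAA")
print(x.shape)  # (21, 20)
```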

Table 3. List of deep learning-based PTM prediction tools.

| No. | Software | PTM | Framework | Core network model | Group b) | Window size | Input encoding | Usability c) | Year | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | DeepAce | Acetylation | Keras/Theano | CNN+DNN | G | 53/29 | One-hot, handcrafted features | O,C,T | 2018 | [103] |
| 2 | Deep-PLA | Acetylation | Keras/TensorFlow | DNN | E | 31 | Handcrafted features | W,P | 2019 | [104] |
| 3 | DeepAcet | Acetylation | TensorFlow | DNN | G | 31 | One-hot, BLOSUM, handcrafted features | O | 2019 | [106] |
| 4 | DNNAce | Acetylation | Keras/TensorFlow | DNN | G | 13/17/21 | One-hot, BLOSUM, handcrafted features | O | 2020 | [105] |
| 5 | pKcr | Crotonylation | Keras/TensorFlow | CNN | G | 29 | Word embedding | W,P | 2020 | [114] |
| 6 | DeepGly | Glycation | - a) | CNN+RNN | G | 49 | Word embedding | - | 2019 | [112] |
| 7 | Long et al. | Hydroxylation | MXNet | CNN+RNN | G | 13 | Handcrafted features | - | 2018 | [115] |
| 8 | MUscADEL | Lysine PTMs | - a) | RNN | G | 27/1000 | Word embedding | W,P | 2018 | [224] |
| 9 | LEMP | Malonylation | TensorFlow | RNN+RF | G | 31 | Word embedding, handcrafted features | W,P | 2019 | [108] |
| 10 | DeepNitro | Nitration/Nitrosylation | Deeplearning4j | DNN | G | 41 | Handcrafted features | C,W,P | 2018 | [113] |
| 11 | MusiteDeep | Multiple | Keras/TensorFlow | CNN | E, G | 33 | One-hot | O,C,W,P,T | 2017/2020 | [96,128] |
| 12 | NetPhosPan | Phosphorylation | Lasagne/Theano | CNN | E | 21 | BLOSUM | C,W,P | 2018 | [98] |
| 13 | DeepPhos | Phosphorylation | Keras/TensorFlow | CNN | E, G | 15/33/51 | One-hot | O,C,P,T | 2019 | [97] |
| 14 | EMBER | Phosphorylation | PyTorch | CNN+RNN | E | 15 | One-hot | O,C,T | 2020 | [99] |
| 15 | DeepKinZero | Phosphorylation | TensorFlow | ZSL | E | 15 | Word embedding | O,C,P,T | 2020 | [100] |
| 16 | CapsNet_PTM | Multiple | Keras/TensorFlow | CapsNet | G | 33 | Handcrafted features | O,C,T | 2018 | [117] |
| 17 | GPS-Palm | S-palmitoylation | Keras/TensorFlow | CNN | G | 21 | Handcrafted features | G,P | 2020 | [116] |
| 18 | CNN-SuccSite | Succinylation | - a) | CNN | G | 31 | Handcrafted features | W,P | 2019 | [111] |
| 19 | DeepUbiquitylation | Ubiquitination | Keras/Theano | CNN+DNN | G | 49 | One-hot, handcrafted features | O,C,P,T | 2018 | [102] |
| 20 | DeepUbi | Ubiquitination | TensorFlow | CNN | G | 31 | One-hot, handcrafted features | O | 2019 | [101] |

a) Framework used in the tool is not available in the original study; b) G, general site prediction; E, enzyme-specific site prediction; c) O, open-source; G, graphical user interface; C, command line; P, provide trained model for prediction; W, web interface; T, provide option for model training. The link to each tool can be found at https://github.com/bzhanglab/deep_learning_in_proteomics.
DeepKinZero[100] is the first zero-shot learning (ZSL) tool to predict the kinase that can phosphorylate a given site for kinases without known substrates or kinases unseen in training. Zero-shot learning is a type of machine learning method that can deal with recognition tasks for classes without training examples.[127] The key idea underlying DeepKinZero is to recognize target sites of a kinase without any known sites by transferring knowledge from kinases with many known sites to this kinase, by establishing a relationship between the kinases using relevant auxiliary information such as functional and sequence characteristics of the kinases. Similar to NetPhosPan, both substrate sequences and kinase sequences are required as input, and they are encoded and fed into the zero-shot learning framework for training. The kinases are encoded using a few different methods, including kinase taxonomy and distributed representation of their kinase domain sequences using ProtVec,[12] which is different from the BLOSUM matrix encoding used in NetPhosPan.
Although phosphorylation is the most widely studied PTM, deep learning has also been applied to other PTMs, as shown in Table 3. Most of the tools for the other PTMs are general PTM site prediction tools rather than enzyme-specific tools. A major advantage of deep learning is that it learns representations efficiently from peptide sequences without handcrafted features for PTM site prediction. However, handcrafted features can also be fed into deep neural networks, just as for traditional machine learning algorithms for classification. For example, He et al.[102] proposed a multimodal deep architecture in which one-hot encoding of peptide sequences as well as physicochemical properties and sequence profiles were fed into a deep neural network for lysine ubiquitination prediction. The authors showed that the multimodal model outperformed the model using one-hot encoding of peptide sequences alone as input. Additionally, Chen et al.[108] found that combining a word-embedded LSTM-based classifier with a traditional RF model encoding the amino acid frequency improved the prediction of malonylation sites. Most of the tools shown in Table 3 were developed for the prediction of one type of PTM site. A few tools were developed for the prediction of multiple types of PTM sites. One is CapsNet_PTM, which uses a CapsNet for the prediction of seven types of PTMs.[117] Most recently, MusiteDeep has been extended to incorporate the CapsNet with ensemble techniques for the prediction of more types of PTM sites.[128] The tool could be easily extended to predict more PTMs given a sufficient number of known sites for training.
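The multimodal idea of He et al. can be illustrated with a two-input network in which a sequence branch and a handcrafted-feature branch are merged before classification. The sketch below is a generic illustration under assumed dimensions, not the published architecture.

```python
from tensorflow.keras import layers, Model

WINDOW, ALPHABET, N_FEATURES = 49, 20, 30  # illustrative dimensions

# Branch 1: raw sequence, one-hot encoded, processed by a small CNN
seq_in = layers.Input(shape=(WINDOW, ALPHABET), name="one_hot_sequence")
seq_branch = layers.Conv1D(64, 7, activation="relu")(seq_in)
seq_branch = layers.GlobalMaxPooling1D()(seq_branch)

# Branch 2: handcrafted features (e.g., physicochemical properties)
feat_in = layers.Input(shape=(N_FEATURES,), name="handcrafted_features")
feat_branch = layers.Dense(32, activation="relu")(feat_in)

# The two modalities are concatenated before the classification head
merged = layers.concatenate([seq_branch, feat_branch])
hidden = layers.Dense(64, activation="relu")(merged)
out = layers.Dense(1, activation="sigmoid")(hidden)

model = Model(inputs=[seq_in, feat_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```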
Advances in MS-based PTM profiling have enabled the identification and quantification of PTMs at the proteome scale,[95,129] and PTM profiling datasets are growing rapidly.[130] The large number of sites identified in these studies will eventually lead to accurate general site prediction models for many PTM types. The accuracy of these models relies on high-quality site identifications in these high-throughput experiments, an area for future development. Moreover, MS-based profiling cannot provide direct evidence for enzyme-substrate relationships for PTMs. It remains a big challenge to experimentally generate a large number of high-quality enzyme-substrate relationships for different types of PTMs to facilitate the training and evaluation of enzyme-specific prediction models.
7. Deep Learning for MHC-Binding Peptide Prediction

MHC (called human leukocyte antigen, or HLA, in humans) class I and class II genes encode cell surface proteins that present self and foreign peptides for inspection by T cells and thus play a critical role in generating immune responses.[131] Peptides derived from intracellular proteins are predominantly presented by MHC class I molecules, whereas peptides presented by MHC class II molecules are usually of extracellular origin.[132] MHC genes are highly polymorphic, and it is important to know which peptides can be presented by a specific MHC allele. There are two types of experimental assays for identifying MHC-binding peptides: in vitro peptide binding assays and MS/MS analysis of MHC-bound peptides (immunopeptidomics).[133] The primary database for in vitro binding assay data is the Immune Epitope Database (IEDB),[134] and immunopeptidomics data can be found in IEDB, the SysteMHC Atlas,[135,136] or new publications describing large multi-allelic or single-allelic datasets.[137–139] Based on these data, many computational methods have been developed to predict MHC-binding peptides.[140]

Computational methods for MHC-peptide binding prediction can be grouped into allele-specific and pan-specific methods. Because biological samples used for immunopeptidomics analysis typically carry multiple MHC alleles, allele-specific models are typically trained with in vitro peptide binding assay data, and one prediction model is constructed for each MHC allele separately. Allele-specific models usually perform well for common MHC alleles for which a large amount of experimental data is available for model training; however, models for alleles with limited experimental data are less reliable. To address this data scarcity issue, pan-specific methods have been proposed. Typically, a single pan-specific model is trained using data from all alleles, and the trained model can be applied to alleles with few training samples and even alleles not included in the training data. Allele-specific models are more accurate when restricted to certain alleles with a large number of training samples, while pan-specific models deliver more stable and better overall performance when applied to MHC alleles with limited or no in vitro peptide binding assay data.[141]

During the past few years, a number of deep learning-based methods have been developed that outperform traditional machine learning methods, including shallow neural networks, for peptide-MHC binding prediction (Table 4). Among these algorithms, 14 (ConvMHC,[142] HLA-CNN,[143] DeepMHC,[144] DeepSeqPan,[145] MHCSeqNet,[146] MHCflurry,[147] DeepHLApan,[148] ACME,[149] EDGE,[137] CNN-NF,[150] DeepNeo,[151] DeepLigand,[152] MHCherryPan,[153] and DeepAttentionPan[141]) are specific for MHC class I binding prediction, three (DeepSeqPanII,[154] MARIA,[138] and NeonMHC2[139]) are specific for MHC class II binding prediction, and four (AI-MHC,[155] MHCnuggets,[156] PUFFIN,[157] and USMPep[158]) can make predictions for both classes. All four types of peptide encoding approaches illustrated in Figure 1 are used in these tools, with one-hot encoding and BLOSUM matrix encoding being the most frequently used methods (Table 4). In terms of the neural network architecture, 13 of the 21 tools use CNN, five use RNN, two use both CNN and RNN, and one uses DNN. Remarkably, 13 of the 21 make pan-specific predictions, in which both the peptide and the MHC protein sequence (either the pseudo sequence or the full sequence) are fed into the neural networks simultaneously for modeling. The interaction specificity between the peptide and the MHC molecule is thus learned during the training process.
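A minimal sketch of how a pan-specific training example can be assembled is shown below; the 9-mer peptide length, the 34-residue pseudo sequence, and the random encodings are illustrative assumptions, and individual tools combine the two inputs in different ways.

```python
import numpy as np

def pan_specific_input(peptide_mat, pseudo_seq_mat):
    """Stack an encoded peptide (e.g., a 9 x 20 one-hot or BLOSUM matrix)
    with an encoded MHC pseudo sequence (e.g., 34 x 20) into one input
    matrix, so that a single network sees both binding partners."""
    return np.concatenate([peptide_mat, pseudo_seq_mat], axis=0)

# Toy example with random encodings: a 9-mer peptide and a 34-residue
# MHC pseudo sequence (the polymorphic residues lining the groove)
peptide = np.random.rand(9, 20).astype(np.float32)
pseudo = np.random.rand(34, 20).astype(np.float32)
x = pan_specific_input(peptide, pseudo)
print(x.shape)  # (43, 20): one training example covering any allele
```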

Table 4. List of deep learning-based MHC-peptide binding prediction tools.

| No. | Software | MHC Type | Core network model | Framework | Group | Data type a) | Input encoding | Usability b) | Year | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ConvMHC | MHC Class I | CNN | Keras | Pan-specific | BA | Handcrafted features | W,P | 2017 | [142] |
| 2 | HLA-CNN | MHC Class I | CNN | Keras/Theano | Allele-specific | BA | Word embedding | O,C,T | 2017 | [143] |
| 3 | DeepMHC | MHC Class I | CNN | - | Allele-specific | BA | One-hot | - | 2017 | [144] |
| 4 | DeepSeqPan | MHC Class I | CNN | Keras/TensorFlow | Pan-specific | BA | One-hot | O,C,P,T | 2019 | [145] |
| 5 | AI-MHC | MHC Class I/II | CNN | TensorFlow | Pan-specific | BA | Word embedding | W,P | 2018 | [155] |
| 6 | DeepSeqPanII | MHC Class II | CNN+RNN | PyTorch | Pan-specific | BA | One-hot + BLOSUM | O,C,P,T | 2019 | [154] |
| 7 | MHCSeqNet | MHC Class I | RNN | Keras/TensorFlow | Pan-specific | BA | Word embedding | O,C,P,T | 2019 | [146] |
| 8 | MARIA | MHC Class II | RNN | Keras/TensorFlow | Pan-specific | BA + MS | One-hot | W,P | 2019 | [138] |
| 9 | MHCflurry | MHC Class I | CNN | Keras/TensorFlow | Allele-specific | BA + MS | BLOSUM | O,C,P,T | 2018 | [147] |
| 10 | DeepHLApan | MHC Class I | RNN | Keras/TensorFlow | Pan-specific | BA + MS | Word embedding | O,C,W,P | 2019 | [148] |
| 11 | ACME | MHC Class I | CNN | Keras/TensorFlow | Pan-specific | BA | BLOSUM | O,C,P,T | 2019 | [149] |
| 12 | EDGE | MHC Class I | DNN | Keras/Theano | Allele-specific | MS | One-hot | O | 2019 | [137] |
| 13 | CNN-NF | MHC Class I | CNN | MXNet | Allele-specific | BA + MS | Handcrafted features | O | 2019 | [150] |
| 14 | MHCnuggets | MHC Class I/II | RNN | Keras/TensorFlow | Allele-specific | BA + MS | One-hot | O,C,P,T | 2019 | [156] |
| 15 | DeepNeo | MHC Class I | CNN | Theano | Pan-specific | BA | 2D interaction map | - | 2020 | [151] |
| 16 | DeepLigand | MHC Class I | CNN | PyTorch | Pan-specific | BA + MS | Word embedding + BLOSUM + One-hot | O,C,P,T | 2019 | [152] |
| 17 | PUFFIN | MHC Class I/II | CNN | PyTorch | Pan-specific | BA | One-hot + BLOSUM | O,C,P,T | 2019 | [157] |
| 18 | NeonMHC2 | MHC Class II | CNN | Keras/TensorFlow | Allele-specific | MS | Handcrafted features | O,C,W,P,T | 2019 | [139] |
| 19 | USMPep | MHC Class I/II | RNN | PyTorch | Allele-specific | BA | Word embedding | O,T | 2020 | [158] |
| 20 | MHCherryPan | MHC Class I | CNN+RNN | Keras/TensorFlow | Pan-specific | BA | BLOSUM | - | 2019 | [153] |
| 21 | DeepAttentionPan | MHC Class I | CNN | PyTorch | Pan-specific | BA | BLOSUM | O,C,P,T | 2019 | [141] |

a) BA, binding assay data; MS, eluted ligand data from mass spectrometry experiments; b) O, open-source; C, command line; P, provide trained model for prediction; W, web interface; T, provide option for model training. The link to each tool can be found at https://github.com/bzhanglab/deep_learning_in_proteomics.

As shown in Table 4, most of the effort in the field has focused on using in vitro binding assay data to predict binding affinity between an MHC molecule and a given peptide sequence. There are multiple upstream biological processes involved in the generation of these peptides. For example, cytosolic proteins need to be degraded by the 26S proteasome to create peptide fragments of an appropriate size. Only a subset of these peptides can be transported into the endoplasmic reticulum through transporter associated with antigen processing proteins, where they may be further trimmed by the aminopeptidases ERAP1 and ERAP2 before loading onto MHC class I molecules.[159] Therefore, even if a peptide has strong MHC binding affinity in an in vitro binding assay, it may not be presentable without appropriate upstream configurations. Immunopeptidomics addresses this limitation by analyzing the naturally presented MHC binding peptides (also called eluted ligands). Here we focus on the tools that leverage immunopeptidomics data for predictive modeling.

Six of these tools use immunopeptidomics data in combination with binding assay data. In MHCflurry,[147] MHC-I immunopeptidomics data are used either in model selection (MHCflurry 1.2) or as training data in combination with binding assay data (MHCflurry train-MS). In MARIA,[138] binding affinity data are used to train a pan-specific RNN model to generate peptide-MHC binding affinity scores, and immunopeptidomics data are used to train a DNN model to estimate peptide cleavage scores. The full MARIA model is then trained using immunopeptidomics data combined with gene expression data as well as the two types of scores for predicting the likelihood of antigen presentation in the context of specific MHC-II alleles. In CNN-NF,[150] binding affinity data are first converted to binary data and then combined with immunopeptidomics data for training using a CNN network. In DeepHLApan,[148] both types of data are used in a similar way to CNN-NF. In MHCnuggets,[156] an LSTM model is first trained using binding affinity data for each MHC allele. A new network initiated with weights transferred from the first step is further trained with immunopeptidomics data when available. In DeepLigand,[152] two modules are combined to predict MHC-I peptide presentation rather than binding affinity. The first module is a pan-specific binding affinity prediction module based on a deep residual network, while the second is a peptide embedding module based on a deep language model (ELMo[160]). The peptide embedding module is trained separately using immunopeptidomics data to capture the features of eluted ligands. The outputs from the two modules are concatenated and then fed into a fully connected layer. Finally, the affinity module and the fully connected layer are jointly trained using both binding affinity data and immunopeptidomics data to predict MHC-I peptide presentation.

The other two tools use immunopeptidomics data alone. EDGE[137] is trained on eluted ligand data bound to HLA class I molecules from MS experiments on multi-allelic cancer samples. The algorithm also incorporates other information, including gene expression levels, proteasome cleavage preferences (flanking sequences), and protein and sample information. EDGE does not require explicit eluted ligand-MHC allele paired data. More specifically, a peptide associated with multiple MHC alleles from a given biological sample is taken as a sample during the training.
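The transfer step used by MHCnuggets can be illustrated schematically. The sketch below, a simplified illustration rather than the tool's actual code (layer sizes and peptide length are assumptions), trains an LSTM on binding affinity data and then seeds a second network with the learned weights before continuing training on eluted ligand data.

```python
from tensorflow.keras import layers, models

def build_lstm(peptide_len=15, alphabet=21):
    """A small LSTM classifier over one-hot encoded peptides."""
    m = models.Sequential([
        layers.Input(shape=(peptide_len, alphabet)),
        layers.LSTM(64),
        layers.Dense(1, activation="sigmoid"),
    ])
    m.compile(optimizer="adam", loss="binary_crossentropy")
    return m

# Step 1: train on binding affinity data for one allele (x_ba, y_ba)
base = build_lstm()
# base.fit(x_ba, y_ba, epochs=10)

# Step 2: initialize a new network with the transferred weights and
# continue training on eluted ligand (immunopeptidomics) data
refined = build_lstm()
refined.set_weights(base.get_weights())
# refined.fit(x_ms, y_ms, epochs=5)
```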

For the tool NeonMHC2,[139] allele-specific models based on CNN are trained using eluted ligand data from individual MHC alleles for more than 40 HLA-II alleles. The authors specifically generated mono-allelic data for model training and showed that the models trained on the mono-allelic data are superior to allele-specific binding predictors on deconvoluted multi-allelic MS data and to NetMHCIIpan.[139]

Superior performance has been shown for each of the published MHC-binding peptide prediction tools in the original studies. However, because each of these tools has its own strengths and weaknesses, a systematic evaluation of these tools is urgently needed to guide method selection for real applications. As it has been demonstrated that incorporating immunopeptidomics data could significantly improve the performance of MHC peptide binding prediction, we expect that model performance will be further improved with rapidly growing immunopeptidomics data. For example, applying deep learning to a recently published large MHC-I peptidome dataset from 95 HLA-A, -B, -C, and -G mono-allelic cell lines[161] may enable accurate allele-specific predictions for many MHC alleles. However, most of the public immunopeptidomics data are from samples with multiple MHC alleles. How to make full use of this type of data in the training of allele-specific MHC peptide binding prediction models is an interesting, yet not well studied, question. Besides increasing the size of training examples for individual MHC alleles, it has been shown that both source gene expression and cleavage preference information of antigen peptides are useful in MHC peptide binding prediction.[137,138,161] We expect this information will be utilized in more tools in the future.
8. Deep Learning for Protein Structure Prediction

Protein structures largely determine their functions. Predicting spatial structure from amino acid sequence has significant applications in protein design and drug screening, among others.[162] Structural genomics/proteomics projects were initiated to experimentally solve 3D structures of proteins on a large scale and aimed to increase the coverage of structure space by targeting unrepresented families.[163] Although over 13 500 structures have been deposited in the Protein Data Bank (PDB) from this multi-center joint effort, it is still a time-consuming process. In silico protein structure prediction has the potential to fill the gap, and we will focus here on the application of deep learning in the prediction of protein secondary and tertiary structures.

Secondary structure refers to the regular local structure patterns that can usually be defined in three types, namely alpha helix, beta strand, and coil. They can be further divided into a more detailed classification of eight types.[164] Secondary structure prediction is a residue-level prediction problem and is usually aided by alignment of homologous sequences. The application of deep learning in secondary structure prediction has been reviewed recently.[165] Use of a sliding window is a popular method to extract short to intermediate non-local interactions, but architectures like CNN and BiLSTM can learn long-range interactions through hierarchical representations. Applying different deep neural network architectures, including hybrid networks, has pushed the boundary of accuracy to around 85% for 3-state (Q3) secondary structure prediction on the commonly used CB513 benchmark dataset.[166–168] Recent research efforts focus more on 8-state prediction (Q8), and most methods achieved overall accuracy above 70% on the CB513 dataset, with a maximum reported Q8 accuracy of 74% using ensembles.[166,169–171] It is worth pointing out that the performance in terms of accuracy and recall differs drastically for different secondary states.[166] As expected, the most common alpha helix and beta strand states perform much better than others, while pi helices essentially cannot be predicted because their sample size is too small in training datasets. Future research is needed to improve the prediction accuracy of less common secondary structure states. Although the growing data in the PDB may help to alleviate the problem, better methods to handle the data imbalance are required.
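As an illustration of how such a residue-level predictor can be set up, the following is a minimal sketch of a BiLSTM tagger for Q8 labels; the layer sizes and the 21-letter input alphabet (20 amino acids plus an unknown token) are illustrative assumptions, not taken from any specific tool.

```python
from tensorflow.keras import layers, models

# Per-residue secondary structure tagger: a BiLSTM reads the whole
# protein and emits a softmax over 8 states (Q8) at every position.
model = models.Sequential([
    layers.Input(shape=(None, 21)),  # variable-length one-hot sequence
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.TimeDistributed(layers.Dense(8, activation="softmax")),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```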
Tertiary structure prediction commonly has two different approaches. For proteins whose homologs have known structures, those structures can be used as templates to jump-start structure modeling, since proteins with high sequence similarity also tend to show structure similarity (the same fold). Folding a protein in silico from scratch with physics or empirical energy potentials assumes that a folded protein is at its native state with the lowest free energy. This is challenging because the search space is enormously large.[162] Approaches like fragment assembly take advantage of existing peptides from the PDB to help conformational sampling.[172] Current state-of-the-art methods to predict protein structures mostly utilize evolutionary information from a multiple sequence alignment (MSA), and ab initio folding from first principles still seems far-fetched.

Tertiary structure prediction has recently shown success in large protein families with co-evolution methods.[173,174] Deep learning has further exploited co-evolutionary information and significantly improved the prediction performance for protein structures without known homologous structures (free modeling or FM), which is highlighted in the breakthroughs in the latest round of the Critical Assessment of protein Structure Prediction (CASP).[175] CASP performance has entered a new era since CASP11, when residue-residue contact predictions were introduced to assist structure modeling as constraints.[176] Initially inferred from MSA by global statistics models,[177–179] contact prediction has been found to be a suitable task for deep learning, especially CNN.[180–182] CASP12 and the latest CASP13 have witnessed remarkable improvements, and the prominent success of AlphaFold last year raised considerable interest even from the general public.[183] A major advancement in CASP13 is to predict distances in finer bins instead of binary contacts, and highly accurate structure models were generated for a few targets from different groups.[175]


Figure 4. Workflow and network architectures of common protein structure prediction methods. A) Schematic summary of contact-guided structure prediction methods. Different methods may use different kinds of features and network architectures, but co-evolutionary information is essential for good contact prediction. Contact or distance between residue pairs and other predicted geometry constraints are fed into various methods for structure modelling or converted to protein-specific potentials for direct optimization. SS, secondary structure; SASA, solvent accessible surface area. B) The end-to-end recurrent geometric network predicts structure without co-evolutionary information. First, two BiLSTM layers predict backbone torsion angles; then, a geometric layer adds residues one by one to construct the structure using the torsion angles and the atoms of the last residue.

At the core of AlphaFold is a highly complex dilated residual neural network (ResNet) with 220 blocks that predicts the Cβ distances of residue pairs given the amino acid sequence and many MSA-derived features.[184] AlphaFold initially tried the more conventional fragment assembly approach to generate structure models in CASP13, but later found that applying gradient descent directly to the predicted protein-specific potentials can produce similar results.[185] The distance potential was first normalized with a universal reference distribution and then combined with backbone torsion angle distributions predicted with a similar neural network, as well as with potentials from Rosetta to prevent steric clashes.[172,186] Iterative optimization of the torsion angles based on the combined potential converged quickly to generate the backbone structure. Removing the reference potential subtraction or other terms slightly affected the performance, and further refining the structure model with a Rosetta relaxation protocol improved the accuracy slightly.[184]
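The idea of folding by gradient descent on a predicted potential can be conveyed with a toy example. AlphaFold itself optimizes backbone torsion angles against spline-fitted distance potentials; the PyTorch sketch below simplifies this to optimizing Cartesian coordinates against a quadratic potential on a random "predicted" distance matrix.

```python
import torch

# Toy version of folding by gradient descent: given target pairwise
# distances (here random), optimize 3D coordinates so that realized
# distances match them under a simple quadratic potential.
n = 50
target = torch.rand(n, n) * 10
target = (target + target.T) / 2          # symmetric "predicted" distances
coords = torch.randn(n, 3, requires_grad=True)
opt = torch.optim.Adam([coords], lr=0.05)

for step in range(500):
    opt.zero_grad()
    diff = coords.unsqueeze(0) - coords.unsqueeze(1)
    dist = (diff.pow(2).sum(-1) + 1e-8).sqrt()  # all pairwise distances
    mask = ~torch.eye(n, dtype=torch.bool)      # ignore the diagonal
    loss = ((dist - target)[mask]).pow(2).mean()
    loss.backward()
    opt.step()

print(float(loss))  # decreases as the "fold" satisfies the constraints
```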
AlphaFold ranked at the top overall, but for some targets, other groups were able to get the best models.[175] The RaptorX software suite also predicts distances independently.[187] Its entry for contact and FM predictions, RaptorX-Contact, ranked first in the contact prediction category of CASP13.[188] Although AlphaFold did not submit contact predictions, it was reported to perform similarly.[187] RaptorX-Contact also used residual neural networks, consisting of a 1D ResNet followed by a 2D ResNet, although the number of layers is much smaller compared to AlphaFold. Another major difference is in the folding pipeline: only the most likely distances are converted to constraints for the Crystallography and NMR System (CNS), a software package for experimentally solving structures, to fold the protein.[189] Other top groups all seem to benefit from contact prediction via deep learning and used ResNet to predict the contact map from sequence, profile, and co-evolutionary information such as the covariance matrix.[190,191] For example, Zhang's group ranked 1st, 3rd, and 5th in template-based modeling, for which targets are easier to predict and have homologs with known structure.[192] They continued to improve their I-TASSER and QUARK pipelines by carefully constructing MSAs, integrating multiple contact prediction methods, and designing a new contact energy potential.[190]

The mainstream direction in the field of structure prediction now usually includes steps of MSA selection, contact/distance prediction, and structure modeling. The common workflow of popular methods is summarized in Figure 4.[184,187,190,191,193–196] Using metagenomics databases and careful selection of deep MSAs built with different algorithms and parameters help to obtain enough information to start.[196,197] Prediction of residue-to-residue geometry from co-evolution by deep learning has proven to be crucial to limit the conformational search space. The latest advancement from Baker's group highlighted that predicting residue-residue orientation together with distance in a multi-task ResNet and integrating them in the Rosetta pipeline (trRosetta) outperformed top groups in CASP13.[196] MSA selection and data augmentation via MSA sampling were shown to greatly contribute to the improvement as well.


Finally, converting good geometry predictions into a good model is an important factor, and there are various approaches available (Figure 4A). For example, it was shown that feeding the distances predicted by RaptorX-Contact in CASP13 to trRosetta increased model accuracy significantly compared to CNS.[196]
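As a concrete example of the co-evolutionary input mentioned above, the following sketch computes a pairwise covariance feature tensor from a toy MSA; production predictors typically use much larger alignments and more elaborate couplings, but this conveys the basic computation.

```python
import numpy as np

# Co-evolution features from an MSA: one-hot encode each column and
# compute the covariance between all column pairs, a common input for
# ResNet-based contact/distance predictors.
msa = np.array([list(s) for s in ["MKTAY", "MRTAF", "MKSAY", "LKTAY"]])
alphabet = sorted(set(msa.ravel()))
one_hot = (msa[..., None] == np.array(alphabet)).astype(float)
n_seq, n_col, q = one_hot.shape

flat = one_hot.reshape(n_seq, n_col * q)
cov = np.cov(flat, rowvar=False)             # (n_col*q, n_col*q)
# Reshape into an L x L x q x q tensor: one feature block per residue pair
cov_features = cov.reshape(n_col, q, n_col, q).transpose(0, 2, 1, 3)
print(cov_features.shape)                    # (5, 5, q, q)
```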
Co-evolution dependent methods dominated recent CASP experiments, partly due to the rapid growth of sequence databases. Although deep learning performs better on shallow MSAs, very shallow MSAs still pose a challenge for co-evolution dependent methods.[175] It is arguable that co-evolution methods learn and construct a family-average structure, which may have a resolution limit for functional insights.[198] The recurrent geometric network (RGN) is a complementary method that relies only on the primary sequence and position-specific scoring matrices (PSSM) (evolutionary but not co-evolutionary information) and uses RNN to build an end-to-end deep learning framework (Figure 4B).[199] However, the local structure of its predicted models may not be good enough, and its performance in CASP13 was lower than that of the top groups. Another CNN-based end-to-end method, NEMO, also only uses sequence and PSSM inputs and, interestingly, applies deep learning to the folding process.[200] Ideally, the primary protein sequence should contain all the information needed to fold a protein. Interestingly, structure predictions of designed proteins from one single sequence by trRosetta actually exhibited higher accuracy than those of naturally occurring proteins of similar sizes.[196] The authors suggested that de novo proteins are ideal versions of natural proteins and that the neural network can learn the general principles of protein structure. Structure prediction from very shallow MSAs, and even from a single amino acid sequence, remains a fundamental challenge, and the quality of models for larger proteins and proteins with multiple domains still has room for improvement.
9. Other Applications of Deep Learning in Proteomics

Besides the applications of deep learning described in the above sections, deep learning has also been used in many other sequence-related applications in proteomics, including protein subcellular localization prediction,[201,202] protein-protein interaction prediction,[203,204] protein function prediction,[205,206] peptide charge state distribution prediction,[46] peptide detectability prediction,[30,210] and prediction of mutation impact on protein stability, function, and protein-protein interaction.[16,207–209] Among these predictions, for example, charge state distribution prediction[46] and peptide detectability prediction[30,210] have achieved highly accurate results using deep learning-based methods with peptide sequences.

In addition to predicting peptide or protein properties using sequence data, deep learning has also been used to classify biological samples based on MS measurements in clinical proteomics. Kim et al.[211] proposed a deep neural network-based model to classify patients with pancreatic cancer using MRM-MS data. The deep learning-based model outperformed five traditional machine learning methods, including RF and SVM. Dong et al.[212] developed a CNN-based model to discriminate tumor from normal samples. The model uses precursors and their extracted ion chromatograms from raw DDA MS/MS data as input. Because this method does not require peptide or protein identification and quantification, many precursors that may not be identified in a typical protein identification workflow could be used in the modeling and thereby contribute to the classification. The performance of this model was shown to be superior to four traditional machine learning methods, including SVM, RF, and gradient boosting decision tree, on three large-scale public datasets. More recently, Zhang et al.[213] proposed an MS data representation method called the DIA tensor and developed a deep neural network (ResNet) that works with the DIA tensor for phenotype prediction on DIA-MS data. Similar to the Dong et al. study,[212] this method does not require protein identification and quantification either. The performance of this method was demonstrated on two large-scale DIA datasets. Despite these exciting developments, application of deep learning to biological sample classification is typically limited by the sample size of clinical cohorts. Moreover, such applications have a higher requirement on model interpretability than the other applications described in this paper.


10. Conclusion and Perspectives

Deep learning has great potential in many areas of proteomics research. With continuous improvements to deep learning techniques and the generation of high-quality proteomics data, we expect deep learning will have a profound impact in the application areas reviewed in this study and beyond. It may revolutionize how we analyze proteomics data in the near future.

Although deep learning has been highly successful in predicting many peptide or protein properties, some properties are still difficult to predict. Predicting the RT and MS/MS spectra of peptides with complicated modifications like glycosylation remains challenging. There is clearly room for improvement in deep learning-based peptide de novo sequencing. For cross-linked peptides, there is no published deep learning-based tool for either RT or MS/MS spectrum prediction. For PTMs with limited or no known sites, deep learning-based prediction is almost impossible. For many MHC class I and II alleles with limited numbers of known binding peptides, there is still large room for improvement. Another interesting topic without any published deep learning-based tools is the prediction of the relationship between the actual peptide amount and the peptide intensity in mass spectrometry experiments. Because the signal response in MS differs between peptides, the abundance levels generated using MS for different peptides are not directly comparable. Thus, it is almost impossible to directly estimate absolute quantification for proteins from MS data with accuracy similar to that of RNA-Seq for gene quantification. However, if sufficient numbers of proteins with known actual amounts and their abundance data from mass spectrometry are available, it may be possible to train a model for proteome-wide absolute protein quantification with high accuracy. This may open the door to many applications.

The generalizability of models is an important consideration in model development and application. Both the RT and the MS/MS spectrum of a peptide are highly associated with experimental conditions, and RT and MS/MS spectrum predictions are typically associated with a specific experiment setting. For example, a small change to the LC conditions (e.g., pore size, column material, column setup, dead volume, sample loading, etc.) may lead to a drastic change of the RT of a peptide. Therefore, an RT model trained using data from one experiment may have large prediction errors when applied to another experiment with different LC conditions. As a result, it would be difficult to develop a generic predictor that could be applied to multiple experiments with different LC settings. One solution for deriving a generic model is to use data based on indexed retention time (iRT)[214]; however, this requires adding iRT peptides to all experiments.
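The iRT idea reduces calibration to fitting a simple linear mapping between the run-independent iRT scale and the observed RT scale of each run, using the spiked-in standard peptides. The numbers in the sketch below are hypothetical.

```python
import numpy as np

# Observed RTs (min) of spiked-in iRT standard peptides in one run,
# paired with their fixed, run-independent iRT reference values.
observed_rt = np.array([12.1, 18.4, 25.0, 33.7, 41.2, 55.9])
reference_irt = np.array([-24.9, 0.0, 19.8, 42.3, 54.6, 87.2])

# Fit the linear calibration RT = a * iRT + b for this run ...
a, b = np.polyfit(reference_irt, observed_rt, 1)

# ... then map any predicted iRT onto this run's RT scale
predicted_irt = 35.0
calibrated_rt = a * predicted_irt + b
print(round(calibrated_rt, 2))
```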
MS/MS spectrum prediction is less affected by experimental settings than RT prediction. In general, only the type of MS instrument, the peptide fragmentation method, and the collision energy require consideration in MS/MS spectrum prediction. A generic model could be developed by considering these factors in the model training, as implemented in pDeep2. Because changes in any of these conditions may alter the spectrum pattern of a peptide, these conditions also need to be considered in deep learning-based de novo peptide sequencing during both model training and application. In addition, peptide sequence patterns could be learned during model training in deep learning-based de novo peptide sequencing, but different species may have different sequence patterns. Thus, species is another factor for consideration in model training and application. A generic model may be trained by considering all these factors in an efficient way. Other predictions, including PTM site prediction, MHC-binding prediction, and protein structure prediction, are typically not associated with a specific experiment. These predictors tend to be generalizable. Public data repositories[135,215,216] are valuable resources of training data for most of these predictors. Comprehensive metadata for these public data sets are critical for data reuse.

In general, an evaluation accompanies each tool to demonstrate its performance by comparison with other similar tools. In many studies, pre-trained models from previous studies were used for comparison. This type of comparison is likely to be biased depending on the training data. Sometimes it is difficult to train the models using the same training data from scratch for a variety of reasons. In many cases the tool cannot be retrained due to the unavailability of source code or the lack of sufficient documentation in the original publications for researchers to reproduce the training method using user-provided data. Even so, independent comprehensive evaluation of the performance of these deep learning tools is critical to provide guidance to users for method selection, since one tool may have significantly different performance on different datasets or using different evaluation metrics. Rigorous documentation of training methods, in addition to functions for retraining models, when tools are published would benefit the independent evaluation process.

Some applications are limited by the size of the training data. A close collaboration between data scientists and experimentalists could help generate appropriate experimental datasets for model training. Technically, transfer learning and semi-supervised learning can also be used to partially overcome the problem of small training data. In addition, both the proteomics data and the outcome variables may be noisy for some applications. Designing deep learning models that are robust to noise in the training data would be particularly useful.

An active research direction focuses on novel representation methods for protein sequence data. Recent studies show that models based on natural language processing-inspired techniques such as Transformer,[217] BERT,[218] and GPT-2[219] can learn features from a large corpus of protein sequences in a self-supervised fashion, with applications in a variety of downstream tasks.[220,221] Besides a linear sequence of amino acids, proteins can also be modeled as a graph to capture both structure and sequence information. Graph neural networks[222] are powerful deep learning architectures for learning representations of nodes and edges from such data.[223]

Another promising direction is the use of NAS to aid the design of deep learning models. Developing a high-performing deep neural network requires significant architecture engineering, including the selection of a basic neural network architecture (such as CNN, RNN, or a combination of them) and hyperparameter tuning. Due to the huge search space, manually designing a high-performance model without carefully designed neural architecture search algorithms is generally very time-consuming and inefficient and also requires extensive knowledge about deep learning. NAS has been demonstrated to be a powerful approach to the design of neural architectures in many other research areas.[49,50] We expect this technique will have broader application in proteomics in the future.

Despite superior performance, deep learning models are typically considered to be black boxes because how the models make predictions and what the models learn from the input data are largely unknown. Interpretability in deep learning is still a big challenge. Different algorithms and tools have been developed to tackle this challenge, such as algorithms including integrated gradients[79] and tools including Captum (https://captum.ai/). However, few of them have been applied to deep learning applications in proteomics. Adoption of these algorithms will help researchers better understand how the deep learning models work and will provide new insights into the mechanisms underlying the proteomic problems under investigation.
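As an illustration of how integrated gradients could be applied to a proteomics model, the sketch below uses Captum's IntegratedGradients on a stand-in peptide classifier; the model and input shapes are placeholders, not taken from any published tool.

```python
import torch
from captum.attr import IntegratedGradients

class PeptideNet(torch.nn.Module):
    """Stand-in model scoring a one-hot encoded 33-residue window."""
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(33 * 20, 1)
    def forward(self, x):
        return torch.sigmoid(self.fc(x.flatten(1))).squeeze(1)

model = PeptideNet().eval()
x = torch.rand(1, 33, 20)            # one encoded input example
baseline = torch.zeros_like(x)       # all-zero reference input

ig = IntegratedGradients(model)
attributions = ig.attribute(x, baselines=baseline)
# Summing over the 20 channels gives a per-residue importance score
print(attributions.sum(dim=-1).shape)  # torch.Size([1, 33])
```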
Acknowledgements

B.W. and W.Z. contributed equally to this work. The authors thank Jonathan T. Lei and Eric Jaehnig for proofreading the manuscript. This study was supported by the National Cancer Institute (NCI) CPTAC award U24 CA210954, the Cancer Prevention & Research Institutes of Texas (CPRIT) award RR160027, and funding from the McNair Medical Institute at The Robert and Janice McNair Foundation. B.Z. is a Cancer Prevention & Research Institutes of Texas Scholar in Cancer Research and McNair Medical Institute Scholar.

Conflict of Interest

The authors declare no conflict of interest.

Keywords

bioinformatics, deep learning, proteomics

Received: May 27, 2020
Revised: September 14, 2020
Published online: October 30, 2020

[1] P. Kelchtermans, W. Bittremieux, K. De Grave, S. Degroeve, J. Ramon, K. Laukens, D. Valkenborg, H. Barsnes, L. Martens, Proteomics 2014, 14, 353.


[2] R. Bouwmeester, R. Gabriels, T. Van Den Bossche, L. Martens, S. Degroeve, Proteomics 2020, e1900351.
[3] L. L. Xu, A. Young, A. Zhou, H. L. Rost, Proteomics 2020, e1900352.
[4] T. Ching, D. S. Himmelstein, B. K. Beaulieu-Jones, A. A. Kalinin, B. T. Do, G. P. Way, E. Ferrero, P.-M. Agapow, M. Zietz, M. M. Hoffman, W. Xie, G. L. Rosen, B. J. Lengerich, J. Israeli, J. Lanchantin, S. Woloszynek, A. E. Carpenter, A. Shrikumar, J. Xu, E. M. Cofer, C. A. Lavender, S. C. Turaga, A. M. Alexandari, Z. Lu, D. J. Harris, D. DeCaprio, Y. Qi, A. Kundaje, Y. Peng, L. K. Wiley, M. H. S. Segler, S. M. Boca, S. J. Swamidass, A. Huang, A. Gitter, C. S. Greene, J. R. Soc. Interface. 2018, 15, 20170387.
[5] C. Cao, F. Liu, H. Tan, D. Song, W. Shu, W. Li, Y. Zhou, X. Bo, Z. Xie, Genomics Proteomics Bioinf. 2018, 16, 17.
[6] R. Dias, A. Torkamani, Genome. Med. 2019, 11, 70.
[7] S. Min, B. Lee, S. Yoon, Brief. Bioinform. 2017, 18, 851.
[8] G. Eraslan, Z. Avsec, J. Gagneur, F. J. Theis, Nat. Rev. Genet. 2019, 20, 389.
[9] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, The MIT Press, Cambridge, MA 2016.
[10] N. H. Tran, X. Zhang, L. Xin, B. Shan, M. Li, Proc. Natl. Acad. Sci. USA 2017, 114, 8247.
[11] S. Henikoff, J. G. Henikoff, Proc. Natl. Acad. Sci. USA 1992, 89, 10915.
[12] E. Asgari, M. R. Mofrad, PLoS One 2015, 10, e0141287.
[13] K. K. Yang, Z. Wu, C. N. Bedbrook, F. H. Arnold, Bioinformatics 2018, 34, 2642.
[14] H. ElAbd, Y. Bromberg, A. Hoarfrost, T. Lenz, A. Franke, M. Wendorff, BMC Bioinf. 2020, 21, 235.
[15] N. Strodthoff, P. Wagner, M. Wenzel, W. Samek, Bioinformatics 2020, 36, 2401.
[16] E. C. Alley, G. Khimulya, S. Biswas, M. AlQuraishi, G. M. Church, Nat. Methods 2019, 16, 1315.
[17] Y. LeCun, Y. Bengio, G. Hinton, Nature 2015, 521, 436.
[18] S. Hochreiter, J. Schmidhuber, Neural. Comput. 1997, 9, 1735.
[19] J. Chung, C. Gulcehre, K. Cho, Y. Bengio, arXiv preprint, arXiv:1412.3555, 2014.
[20] S. Sabour, N. Frosst, G. E. Hinton, Advances in Neural Information Processing Systems 2017, 3856.
[21] V. Dorfer, S. Maltsev, S. Winkler, K. Mechtler, R. T. Charme, J. Proteome. Res. 2018, 17, 2581.
[22] A. T. Chen, A. Franks, N. Slavov, PLoS Comput. Biol. 2019, 15, e1007082.
[23] E. F. Strittmatter, L. J. Kangas, K. Petritis, H. M. Mottaz, G. A. Anderson, Y. Shen, J. M. Jacobs, D. G. Camp 2nd, R. D. Smith, J. Proteome. Res. 2004, 3, 760.
[24] A. A. Klammer, X. Yi, M. J. MacCoss, W. S. Noble, Anal. Chem. 2007, 79, 6111.
[25] B. Wen, K. Li, Y. Zhang, B. Zhang, Nat. Commun. 2020, 11, 1759.
[26] B. Blank-Landeshammer, I. Teichert, R. Marker, M. Nowrousian, U. Kück, A. Sickmann, mBio 2019, 10, e02367.
[27] E. Lau, Y. Han, D. R. Williams, C. T. Thomas, R. Shrestha, J. C. Wu, M. P. Y. Lam, Cell Rep. 2019, 29, 3751.
[28] T. Ouspenskaia, T. Law, K. R. Clauser, S. Klaeger, S. Sarkizova, F. Aguet, B. Li, E. Christian, B. A. Knisbacher, P. M. Le, C. R. Hartigan, H. Keshishian, A. Apffel, G. Oliveira, W. Zhang, Y. T. Chow, Z. Ji, S. A. Shukla, P. Bachireddy, G. Getz, N. Hacohen, D. B. Keskin, S. A. Carr, C. J. Wu, A. Regev, bioRxiv 2020. https://doi.org/10.1101/2020.02.12.945840
[29] S. Gessulat, T. Schmidt, D. P. Zolg, P. Samaras, K. Schnatbaum, J. Zerweck, T. Knaute, J. Rechenberger, B. Delanghe, A. Huhmer, U. Reimer, H.-C. Ehrlich, S. Aiche, B. Kuster, M. Wilhelm, Nat. Methods. 2019, 16, 509.
[30] Y. Yang, X. Liu, C. Shen, Y. Lin, P. Yang, L. Qiao, Nat. Commun. 2020, 11, 146.
[31] R. Lou, P. Tang, K. Ding, S. Li, C. Tian, Y. Li, S. Zhao, Y. Zhang, W. Shui, iScience 2020, 23, 100903.
[32] B. C. Searle, K. E. Swearingen, C. A. Barnes, T. Schmidt, S. Gessulat, B. Küster, M. Wilhelm, Nat. Commun. 2020, 11, 1548.
[33] B. Van Puyvelde, S. Willems, R. Gabriels, S. Daled, L. De Clerck, S. V. Casteele, A. Staes, F. Impens, D. Deforce, L. Martens, S. Degroeve, M. Dhaenens, Proteomics 2020, 20, e1900306.
[34] J. L. Meek, Proc. Natl. Acad. Sci. U S A. 1980, 77, 1632.
[35] D. Guo, C. T. Mant, A. K. Taneja, J. M. R. Parker, R. S. Rodges, J. Chromatogr. A 1986, 359, 499.
[36] D. Gussakovsky, H. Neustaeter, V. Spicer, O. V. Krokhin, Anal. Chem. 2017, 89, 11795.
[37] W. Lu, X. Liu, S. Liu, W. Cao, Y. Zhang, P. Yang, Sci. Rep. 2017, 7, 43959.
[38] H. Maboudi Afkham, X. Qiu, M. The, L. Kall, Bioinformatics 2017, 33, 508.
[39] K. Petritis, L. J. Kangas, P. L. Ferguson, G. A. Anderson, L. Pasa-Tolic, M. S. Lipton, K. J. Auberry, E. F. Strittmatter, Y. Shen, R. Zhao, R. D. Smith, Anal. Chem. 2003, 75, 1039.
[40] O. V. Krokhin, R. Craig, V. Spicer, W. Ens, K. G. Standing, R. C. Beavis, J. A. Wilkins, Mol. Cell. Proteomics. 2004, 3, 908.
[41] O. V. Krokhin, Anal. Chem. 2006, 78, 7785.
[42] L. Moruz, D. Tomazela, L. Kall, J. Proteome. Res. 2010, 9, 5209.
[43] L. Moruz, A. Staes, J. M. Foster, M. Hatzou, E. Timmerman, L. Martens, L. Käll, Proteomics 2012, 12, 1151.
[44] C. Ma, Y. Ren, J. Yang, Z. Ren, H. Yang, S. Liu, Anal. Chem. 2018, 90, 10881.
[45] S. Tiwary, R. Levy, P. Gutenbrunner, F. Salinas Soto, K. K. Palaniappan, L. Deming, M. Berndl, A. Brant, P. Cimermancic, J. Cox, Nat. Methods. 2019, 16, 519.
[46] S. Guan, M. F. Moran, B. Ma, Mol. Cell. Proteomics. 2019, 18, 2099.
[47] R. Bouwmeester, R. Gabriels, N. Hulstaert, L. Martens, S. Degroeve, bioRxiv 2020. https://doi.org/10.1101/2020.03.28.013003
[48] D. Bahdanau, K. Cho, Y. Bengio, Neural Machine Translation by Jointly Learning to Align and Translate, arXiv:1409.0473, 2014.
[49] T. Elsken, J. H. Metzen, F. Hutter, J. Mach. Learn. Res. 2019, 20, 1.
[50] K. O. Stanley, J. Clune, J. Lehman, R. Miikkulainen, Nat. Mach. Intell. 2019, 1, 24.
[51] K. Li, A. Jain, A. Malovannaya, B. Wen, B. Zhang, Proteomics 2020, e1900334.
[52] Z. Noor, S. B. Ahn, M. S. Baker, S. Ranganathan, A. Mohamedali, Brief. Bioinform. 2020, bbz163.
[53] S. J. Barton, J. C. Whittaker, Mass. Spectrom. Rev. 2009, 28, 177.
[54] J. K. Eng, A. L. McCormack, J. R. Yates, J. Am. Soc. Mass. Spectrom. 1994, 5, 976.
[55] S. Li, R. J. Arnold, H. Tang, P. Radivojac, Anal. Chem. 2011, 83, 790.
[56] K. Liu, S. Li, L. Wang, Y. Ye, H. Tang, Anal. Chem. 2020, 92, 4275.
[57] X. X. Zhou, W. F. Zeng, H. Chi, C. Luo, J. Zhan, S.-M. He, Z. Zhang, Anal. Chem. 2017, 89, 12690.
[58] Z. Zhang, Anal. Chem. 2004, 76, 3908.
[59] Z. Zhang, Anal. Chem. 2005, 77, 6364.
[60] Z. Zhang, Anal. Chem. 2011, 83, 8642.
[61] Y. Wang, F. Yang, P. Wu, D. Bu, S. Sun, BMC Bioinformatics 2015, 16, 110.
[62] R. J. Arnold, N. Jayasankar, D. Aggarwal, H. Tang, P. Radivojac, Pac. Symp. Biocomput. 2006, 219.
[63] S. Degroeve, L. Martens, Bioinformatics 2013, 29, 3199.
[64] S. Degroeve, D. Maddelein, L. Martens, Nucleic. Acids. Res. 2015, 43, W326.
[65] R. Gabriels, L. Martens, S. Degroeve, Nucleic. Acids. Res. 2019, 47, W295.
[66] N. P. Dong, Y. Z. Liang, Q. S. Xu, D. K. Mok, L.-Z. Yi, H.-M. Lu, M. He, W. Fan, Anal. Chem. 2014, 86, 7446.
[67] C. Zhou, L. D. Bowler, J. Feng, BMC Bioinformatics 2008, 9, 325.

