Reviews: Next-Generation Computational Tools For Interrogating Cancer Immunity
Biocenter, Institute of Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria.
*e-mail: zlatko.trajanoski@i-med.ac.at
https://doi.org/10.1038/s41576-019-0166-7

Immune checkpoint blockers
Monoclonal antibodies that target immune checkpoints to elicit or boost anticancer immune responses. Immune checkpoints are receptors or their ligands expressed on either tumour cells or immune cells that modulate immune cell responses to self-proteins, chronic infections and tumour antigens.

Neoantigens
Short peptides generated from the expression of mutated or rearranged genes in cancer cells, but not in normal cells. Bound to HLA molecules on the surface of cancer cells, neoantigens are recognized by T cells through the interaction of the T cell receptor with the peptide–HLA complex.

Cancer immunotherapy is revolutionizing oncology. Given the success in achieving long-term durable responses in numerous advanced and metastatic solid cancers, cancer immunotherapy sparked tremendous interest and research activities in basic, translational and clinical science. This is evident not only from the increasing number of publications but also from the sheer number of ongoing clinical trials and patients enrolled or to be recruited (2,250 active trials with blockers of programmed cell death protein 1 (PD1) or one of its ligands, programmed cell death 1 ligand 1 (PDL1), encompassing 380,900 patients, as of September 2018 (ref.1)). The research activities are likely to have a major impact in the field and provide novel mechanistic insights into the complex tumour–immune cell interactions. However, major challenges still remain, including the elucidation of mechanisms of intrinsic and acquired resistance to therapy with immune checkpoint blockers, the identification of predictive markers for response, the determination of mechanistic rationales for combination therapies with synergistic potential, the identification and selection of neoantigens for therapeutic cancer vaccination and the determination of targets for adoptive therapy with engineered T cells.

The intrinsic complexity of the interaction of the two interwoven systems, the tumour and the immune system, poses considerable challenges and requires comprehensive approaches to interrogate cancer immunity during tumour initiation and progression, and following therapeutic modulation thereof. Several established and novel high-throughput technologies enable the generation of the necessary data and thereby provide the basis for mechanistic understanding, and ultimately increase the number of patients that benefit from cancer immunotherapy. Next-generation sequencing (NGS) technologies have not only provided large data sets that can be mined for immunologically relevant parameters2,3 but are also increasingly used in a clinical setting to inform cancer therapy. Additionally, novel technologies such as single-cell RNA sequencing (scRNA-seq) and mass cytometry by time of flight (CyTOF) have matured and enable for the first time the precise characterization of molecular processes at the single-cell level. Obviously, the widespread use of NGS techniques and continuous development of novel medium-to-high-throughput technologies require an expanded computational toolbox for the analysis and visualization of heterogeneous data.

Here, we review computational tools for interrogating cancer immunity, discuss advantages and limitations of various methods and provide guidelines to assist in method selection. This Review is complementary to our previous work in which computational genomics tools for cancer immunology were described4. We first briefly describe the different hallmarks of cancer immunity and then give an overview of cutting-edge experimental techniques for single-cell analysis and spatial cellular phenotyping. This is followed by the main focus of the Review on computational methods for interrogating cancer immunity covering neoantigen prediction, characterization of tumour-infiltrating immune cells using bulk tissue and single-cell approaches, analysis of T cell and B cell repertoires, analysis of cellular phenotypes from histological images and single-cell data visualization.

Hallmarks of cancer immunity
The interactions between tumour and immune cells have been conceptualized as a series of events, referred to as the cancer-immunity cycle5, occurring at distinct anatomical sites6 (Fig. 1). Hence, it is likely that
[Figure 1 image. Panel a: priming and activation (cancer antigen presentation, dendritic cell, activated cytotoxic T cells, clonal expansion) and effector response in the tumour microenvironment (neoantigens, dying cancer cells, GZMB, IFNγ). Panel b: response determinants: composition (regulatory T cells), localization (T cell exclusion) and function (dysfunctional T cells). Panel c: cell types of the tumour microenvironment. Panel d: systemic factors (local microbiota, circulating cytokines, chronic viral infections, immunomodulating medications).]
Fig. 1 | Distinct hallmarks of cancer immunity. a | In the tumour microenvironment, neoantigens released by dying cancer
cells are captured by dendritic cells for processing. After homing to draining lymph nodes, dendritic cells present the
captured neoantigens to T cells, inducing their priming, activation and clonal expansion. Activated T cells migrate into the
tumour microenvironment, where they exert anticancer immune responses through secretion of molecules such as
granzyme B (GZMB) and interferon-γ (IFNγ). b | Three aspects of the tumour immune contexture determine the likelihood of
cancer patients to respond to immunotherapy: composition of the immune infiltrates in terms of effector and suppressor
cells; localization of immune cells, which can be infiltrating the tumour core (immune inflamed phenotype), confined at the
tumour margin (immune excluded phenotype) or absent from the tumour mass (immune desert phenotype); and function
of effector cells, which can be fully activated or dysfunctional. c | The tumour microenvironment is a complex ecosystem
composed of different cell types, which include cancer cells, epithelial cells, cancer-associated fibroblasts (CAFs) and
immune cells, such as cytotoxic T cells, regulatory T (Treg) cells and myeloid suppressor cells. d | Besides the tumour immune
contexture, different systemic parameters influence the patient’s outcome and response to therapy, including chronic viral
infections, local microbiota, immune state of the host, immunomodulating medications and circulating cytokines. ECM,
extracellular matrix; MDSC, myeloid-derived suppressor cell; MSC, mesenchymal stem cell; NK cell, natural killer cell; RBC,
red blood cell. Part a is adapted from ref.4, Springer Nature Limited. Part c is adapted from ref.204, Springer Nature Limited.
multiparametric assessment rather than the use of single parameters is necessary to dissect the complex tumour–immune cell interactions and inform cancer immunotherapy. Comprehensive characterization of cancer immunity requires the determination of the following broad characteristics: neoantigens, immune contexture, tumour microenvironment (TME), and host and environmental factors.

Neoantigens. Cytotoxic CD8+ T cells are at the core of immunological tumour control and response to anticancer therapies5,7,8. After priming and activation by
www.nature.com/nrg
Reviews
dendritic cells in draining lymph nodes (Fig. 1a), CD8+ T cells can recognize tumour neoantigens (that is, short peptides generated from the expression of mutated or rearranged genes bound to class I HLA molecules of tumour cells) and induce anticancer immune responses4,9,10. Accumulating evidence suggests that neoantigens are major determinants of the response to immunotherapy with checkpoint blockers, and their computational or experimental characterization in cancer patients is the basis for personalized cancer vaccines and T cell-based immunotherapies9–11.

Immune contexture. Immune cells infiltrating into the tumour have profound effects on the clinical response to immunotherapy, and it has become clear that not only the composition but also their localization and their functional orientation, jointly referred to as the immune contexture, determine the efficacy of anticancer immune responses6,12. Patients with high densities of specific immune cell subpopulations in the tumour centre or invasive margin have better prognosis13, suggesting that the immune system is controlling the growth of the tumour. Beyond the prognostic value, the immune contexture has profound effects on the response to cancer immunotherapies (Fig. 1b): hot tumours are more amenable to checkpoint-blocker-based monotherapy or combination therapy than cold tumours14. Hence, the quantitation of the immune contexture in archived and prospective samples will provide valuable information for improving cancer immunotherapy.

Tumour microenvironment. The TME comprises not only cancer cells, normal epithelial cells and immune cells from the adaptive and the innate lineages but also cancer-associated fibroblasts (CAFs), endothelial cells, mesenchymal stem cells, the extracellular matrix and the basement membrane (Fig. 1c). CAFs provide physical support for epithelial cells, release various tumour-promoting cytokines and chemokines (which favour tumour growth and angiogenesis) and are major contributors to an immunosuppressive TME. Importantly, anticancer immunity can be therapeutically exploited by using combinations of immune checkpoint blockers and anti-angiogenic inhibitors15. Thus, it is necessary to determine the cellular components of the tumour environment and investigate their interactions. Finally, it might be necessary to also include data from measured physical properties because aberrant cell mechanics is crucial for altered cellular behaviour and the onset of cancer16.

Host and environmental factors. Several systemic factors including the host microbiota have been associated with response to cancer therapy with immune checkpoint blockers (see a recent review11), indicating that systemic factors play a major role (Fig. 1d). Obviously, global immunological competence of the patient, including external factors such as infections or immunomodulating medications6, determines the likelihood of obtaining clinical benefit. Additionally, commensal microbes influence immune responses, indicating that antitumour immunity can also be modulated by the gut microbiota.

Technologies for interrogating cancer immunity
The application of NGS for genomic, transcriptomic and epigenomic profiling of tumours is building the main source of data that enables the extraction of hallmark characteristics. The application of these techniques on bulk tissue as well as the computational tools that enable the NGS data to be leveraged have been previously reviewed by us4. Additionally, microbiome analysis methods (reviewed elsewhere17) have advanced rapidly and now provide the means to study the microbiota composition and function from 16S ribosomal RNA sequencing, metagenomics and metatranscriptomics data. Major achievements in recent years have been the development of techniques for single-cell analysis and for multiplexed spatial cellular phenotyping. These techniques are of particular relevance for cancer immunity as they enable for the first time a comprehensive interrogation of cancer immunity, including characterization of the cellular composition of cancerous and normal tissue, quantification of the immune contexture and pairing of α-chains and β-chains of individual T cell receptors (TCRs) and B cell receptors (BCRs). The data types, the intermediate analyses and the immunogenomic analyses are shown in Fig. 2. Appropriate analyses of the data require an understanding of the experimental steps that generated those data in order to understand the origins of the key features as well as possible biases of the resultant data. We therefore first describe recent technological developments and then review the associated computational tools.

Single-cell omics of isolated cells. Whereas bulk RNA sequencing (RNA-seq) data enable only reconstruction of an average transcriptome of mixed cell populations, new scRNA-seq technologies can be used to reconstruct the transcriptomes of individual cells, opening new avenues for the study of the heterogeneity, plasticity and functional diversity of the immune system18,19. Most of the techniques to capture single cells can be assigned to either plate- or microfluidics-based methods20. Plate-based approaches such as Smart-seq2 (ref.21) sort cells into separate wells via fluorescence-activated cell sorting (FACS). They generate full-length transcripts from single cells and have higher sensitivity than microfluidics-based methods (that is, higher number of detected genes per cell), but lower throughput in terms of sequenced cells due to the complexity of the single-cell isolation step22. Microfluidics-based platforms such as the 10X Chromium23 generate nanolitre-sized droplets containing a single cell each, together with a barcoded bead and the reagents needed for the downstream reactions. These platforms are more cost-efficient and therefore enable profiling of larger number of cells compared with the plate-based systems.

An alternative method for the analysis of single cells is CyTOF24, which characterizes cells according to their cell-surface-expressed proteins. In this method, metal-isotope-conjugated antibodies are used to stain cells and are then subjected to a quadrupole time-of-flight (TOF) mass spectrometer analysis. The major advantage compared with traditional fluorescence-based flow cytometry is that there are no spectral overlaps and therefore the number of assessed markers can be larger

Dendritic cells
Professional antigen-presenting cells that act as messengers between the innate and the adaptive immune system. Dendritic cells capture antigens, transport them into lymphoid organs and present them to naive T cells together with co-stimulatory signals to induce T cell priming and activation.

Hot tumours
Immunogenic tumours with high infiltration of T cells and high likelihood of response to immune checkpoint inhibitor therapy (as opposed to cold tumours).

Cold tumours
Poorly immunogenic tumours with low or no infiltration of T cells and low likelihood of response to immune checkpoint inhibitor therapy (as opposed to hot tumours).

Microbiota
The community of microorganisms, including bacteria, viruses and fungi, which are found within a specific environment (for example, the human gut).

Microbiome
The collection of all genomes from all of the microorganisms composing the microbiota.
4-digit HLA typing
The standard nomenclature of HLA alleles is composed of the gene name, an asterisk and eight digits separated by a colon, for example, HLA-A*02:01:01:05. HLA alleles that differ at 4-digit resolution (for example, HLA-A*02:02 and HLA-A*02:01) have similar serological specificity for a peptide, but have different protein sequences that can result in different T cell recognition of the peptide–HLA complex.

scRNA-seq data from the same sample (to assign functional states using a panel of genes)36. However, defining a panel of genes and assigning specific functions remains arbitrary and we expect that community-organized consortial projects using scRNA-seq data and expert annotation such as the Human Cell Atlas37 will provide this valuable information in the near future.

Apart from using multiplexed imaging of specific markers, several methods that aim to measure the expression of tens to thousands of genes in situ may be used to quantify the immune contexture in an antibody-free and spatially resolved manner. By detecting transcripts instead of proteins, cells that produce secreted factors for which antibodies are not available can also be identified. Two main classes of methods use either hybridization (seqFISH38, seqFISH+39 and MERFISH40) or sequencing (FISSEQ41, Spatial Transcriptomics42 and Slide-seq43). Although these promising methods have the capability to better characterize the spatial and functional composition of the immune landscape, they have certain advantages and limitations. For example, seqFISH and MERFISH enable subcellular resolution but are time-consuming and require a high number of probes to provide complete coverage. Slide-seq and Spatial Transcriptomics, on the contrary, can provide a more complete transcriptome, but achieving single-cell resolution remains a challenge (Slide-seq resolution >10 µm). For FISSEQ, subcellular resolution is possible, but the detection threshold is high (>200 mRNA molecules per cell) and only a small number of transcripts can be analysed.

Computational tools for predicting neoantigens
In silico prediction of putative neoantigens from mutated genes consists of three main computational steps4 (Fig. 3a): first, identification of somatic mutations using whole-genome sequencing (WGS) or whole-exome sequencing (WES) data from paired tumour and normal tissue and reconstruction of mutated peptides; second, genotyping of the patient’s HLA genes from tumour RNA-seq or WES data; and, third, prediction of peptides binding to the patient’s HLA molecules.

Mutated peptides arising from somatic mutations can be predicted by comparing tumour versus normal-tissue NGS data from the same patient. NGS data for neoantigen prediction are generated preferentially from WES, which provides the deepest mutation coverage by restricting the assay only to protein-coding regions of the genome. The computational analysis consists of data pre-processing and quality control, identification of somatic mutations using tools for variant detection, and prediction of the affected proteins and functional impact using public repositories of genomic, transcriptomic and proteomic sequences. For a review and critical discussion of these approaches, we refer to previous literature4,44,45.

State-of-the-art methods for HLA typing from NGS data (Table 1) are mature and widely used, and include OptiType46 and Polysolver47, which showed high accuracy in 4-digit HLA typing, as well as seq2HLA48, which can compute both HLA types and allele-specific expression. More recent methods include Kourami49, HLA*LA50, arcasHLA51, xHLA52, HLA-HD53 and HLAProfiler54, which also perform class II HLA typing. However, unbiased benchmarking of these recent tools is not available and would be extremely useful for characterizing their accuracy in class II typing, for which very limited validation has so far been carried out.

Tools for predicting peptides binding to HLA molecules use machine-learning methods trained on large in vitro peptide–HLA binding data sets. NetMHC55 and its pan-allele version NetMHCpan56 are based on artificial neural networks and are currently the most widely used methods due to their high performance. Both tools predict the binding affinity as the half-maximal inhibitory concentration (IC50) expressed in nanomolar units, as well as the rank of predicted affinity compared with a set of random natural peptides, to account for allele-specific bias. Strong binders are usually selected considering a binding affinity or rank lower than 500 nM or 0.5%, respectively. Recent advancements in the field of deep learning have fostered the development of new machine-learning methods based on deep convolutional neural networks, such as HLA-CNN57 and DeepSeqPan58. In parallel, a pan-allele method called PSSMHCpan has been developed to leverage binding motifs to predict peptide binding affinity also for currently under-represented HLA alleles59. The recently developed pVACtools suite for the prediction and prioritization of putative neoantigens60 includes an updated version of the pVACseq61 pipeline that can compute binding-affinity predictions with different state-of-the-art machine-learning methods, as well as quantify features linked to antigen pre-processing and recognition (see Box 1).

Notably, only ~1–5% of the class I binders predicted in silico using different computational tools have been experimentally validated9. One possible reason for the discrepancy between predicted and experimentally validated neoantigens is the low sensitivity of mass spectrometry (MS)-based methods to directly identify binding peptides. Despite this limitation, MS measurements of eluted HLA-binding peptides can be used to directly interrogate the human immunopeptidome, namely the set of peptides presented on HLA molecules, and to enable reconstruction of antigen profiles presented in vivo that could not be captured from previous in vitro affinity studies62. Novel methods such as MHCflurry 1.2.0 (ref.63), ForestMHC64, MixMHCpred65,66 and EDGE67 as well as the latest version of NetMHC68 were also trained on MS data from HLA-eluted peptides, and the increasing amount of HLA–ligand MS data available in databases like IEDB69, PRIDE70 or SysteMHC Atlas71 can provide rich training data sets for the next-generation predictors. However, MS measurements have two major limitations: first, the requirement for a large amount of starting material (~1 × 10^8 cells62); and, second, the dependence on protein sequence databases for data analysis, which limits the identification of peptides to the annotated human proteome. The latter issue can be overcome with computational approaches for updating reference databases incorporating predicted non-canonical neoantigens, such as those derived from non-exonic regions,
[Figure 3 image. Panel a: prediction of mutated peptides; HLA typing. Panel c: scRNA-seq pipeline (quality control, gene selection, normalization, cell annotation). Panel d: tumour cell (class I HLA, neoantigen) and TCR, with data types and computational analyses: >16,000 class I HLA alleles, WES or RNA-seq, class I HLA typing; ~10^14 possible 8–11mers, WES plus RNA-seq, prediction of neoantigen–HLA binding; ~10^16 αβ TCRs, scRNA-seq, reconstruction of αβ TCR pairs.]
Fig. 3 | Overview of computational tools for interrogating cancer immunity. a | Putative neoantigens arising from the
expression of somatic mutations can be predicted in silico through three main computational steps: prediction of mutated
peptides using whole-exome sequencing (WES) or whole-genome sequencing (WGS) data from paired tumour and
normal samples; HLA typing from tumour sequencing data (preferentially RNA sequencing (RNA-seq)); and prediction
of the binding affinity between HLA types and mutated peptides. b | The analysis of different types of data can reveal
different facets of the tumour immune contexture depending on their pros and cons. Bulk RNA-seq data can be analysed
with deconvolution methods to quantify the fractions of different cell subpopulations, but cannot be used to study the
phenotypes of single cells. By contrast, single-cell RNA-seq (scRNA-seq) is currently not optimal to quantitatively assess
the cellular composition of the tumour, but can be used to portray single-cell types and states. Multiplexed imaging
allows the study of cells in a spatial context, but only reconstructs a restricted, 2D portion of the tumour microenvironment
and is limited in the number of markers that can be phenotyped. c | A basic scRNA-seq analysis pipeline consists of quality
control and removal of low-quality cell profiles; selection of informative genes; normalization of expression profiles;
and annotation of cell types. d | Schematic representation of the interaction between a tumour cell and a cytotoxic T cell:
the T-cell receptor (TCR), composed of an α-chain and β-chain, interacts with the neoantigen bound on the class I HLA
molecule of the tumour cell. In humans, there are more than 16,000 class I HLA alleles and ~10^16 αβ TCRs, whereas all
possible peptides 8–11 amino acids long (mutated or not) amount to ~10^14 8–11mers. Class I HLA typing can be performed
in silico using WES or RNA-seq data, whereas the binding between class I HLA molecules and putative neoantigens can be
predicted by integrating WES (or WGS) and RNA-seq data (details in part a). αβ TCRs of single cells can be reconstructed
from scRNA-seq data, but there are currently no computational methods to predict neoantigen recognition by TCRs. β2M,
β2-microglobulin.
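
As a minimal illustration of the four steps of the basic scRNA-seq pipeline summarized in part c of Fig. 3, the following Python sketch runs on a toy count matrix. It is not the interface of Seurat, Scanpy or any other framework; the function names, the thresholds, the use of the Fano factor (variance divided by mean) to rank genes, the marker panel and all count values are assumptions made purely for illustration.

```python
import math
from statistics import mean, pvariance

# Toy raw counts (cell -> gene -> count); all values are made up for illustration.
counts = {
    "cell1": {"CD8A": 9, "GZMB": 7, "FOXP3": 0, "ACTB": 10},
    "cell2": {"CD8A": 8, "GZMB": 5, "FOXP3": 0, "ACTB": 12},
    "cell3": {"CD8A": 0, "GZMB": 0, "FOXP3": 6, "ACTB": 11},
    "cell4": {"CD8A": 0, "GZMB": 0, "FOXP3": 0, "ACTB": 1},  # low-quality profile
}
# Hypothetical marker panel used only for this example.
markers = {"cytotoxic T cell": ["CD8A", "GZMB"], "regulatory T cell": ["FOXP3"]}


def quality_control(counts, min_genes=2):
    """Step 1: drop cells with fewer than `min_genes` detected genes."""
    return {
        cell: row for cell, row in counts.items()
        if sum(1 for c in row.values() if c > 0) >= min_genes
    }


def select_genes(counts, n_top=3):
    """Step 2: keep the `n_top` genes with the highest Fano factor."""
    genes = sorted({g for row in counts.values() for g in row})

    def fano(gene):
        values = [row.get(gene, 0) for row in counts.values()]
        m = mean(values)
        return pvariance(values) / m if m > 0 else 0.0

    return sorted(genes, key=fano, reverse=True)[:n_top]


def normalize(counts, genes, scale=1e4):
    """Step 3: scale each cell to `scale` total counts, then apply log1p.

    Assumes every cell has a non-zero total, which step 1 guarantees
    whenever `min_genes` is at least 1.
    """
    normalized = {}
    for cell, row in counts.items():
        total = sum(row.values())
        normalized[cell] = {
            g: math.log1p(row.get(g, 0) / total * scale) for g in genes
        }
    return normalized


def annotate(normalized, markers):
    """Step 4: label each cell by the marker set with the highest mean expression."""
    labels = {}
    for cell, row in normalized.items():
        scores = {
            label: mean(row.get(g, 0.0) for g in gene_set)
            for label, gene_set in markers.items()
        }
        labels[cell] = max(scores, key=scores.get)
    return labels


filtered = quality_control(counts)
genes = select_genes(filtered)
labels = annotate(normalize(filtered, genes), markers)
print(genes)
print(labels)
```

In this toy example the nearly empty profile is removed in the first step and the uniformly expressed gene is discarded in the second. Real data sets additionally require doublet detection, batch handling and clustering, which are deliberately left out of the sketch.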
insertions or deletions (indels), gene fusions, alternative splicing or post-translational modifications (see Box 1). However, when supplementing peptide databases with non-canonical peptides, care must be taken to avoid false positives. The potential relevance of non-canonical neoantigens was shown in a recent study on patients with head and neck cancer treated with immune checkpoint inhibitors, demonstrating that gene fusions are a source of immunogenic neoantigens that can mediate responses to immunotherapy in patients with low mutational load and low pretreatment immune infiltration72.

Despite the progress in predicting class I HLA neoantigens, the current tools for predicting class II HLA neoantigens recognized by CD4+ T cells have limited accuracy and are advancing slowly due to a lack of proper training data. Since our latest review4, the landscape of class II predictors has evolved little, with NetMHCII and NetMHCIIpan73 still representing the top performers74 and only one novel method proposed: MixMHC2pred75. MixMHC2pred was trained on MS-based, class II immunopeptidomics data and demonstrated higher accuracy compared with NetMHCIIpan75.

Overall, recent developments in deep learning algorithms and MS-based immunopeptidomics have created fertile ground for the development of next-generation predictors of HLA presentation.
[...] cell types from bulk transcriptomic data | [...] gene expression; eight immune cell types, fibroblasts and endothelial cells; inter-sample comparison | [...] MCPcounter
CIBERSORT | Deconvolution of [...] | SVR-deconvolution of 22 immune [...] | +a | https://cibersort.stanford.edu | 87
[...] MIBIAnalysis
CODEX | CODEX data processing | Processing of raw CODEX data; [...] | + | https://github.com/nolanlab/ | 35
[...] tools) | [...] analysis pipeline | [...] analyse raw data; generate data sets for downstream analysis | [...] SpatialTranscriptomicsResearch
seqFISH+ | seqFISH+ data processing | Processing images and barcode [...] | +++ | https://github.com/ | 39
However, a comprehensive and unbiased evaluation on external data is currently lacking. Unfortunately, results from two recent studies based on public peptide–HLA binding data74 or de novo experimentally validated human papillomavirus (HPV) peptides76 provided little guidance on method selection. These studies identified either MHCflurry or NetMHC as top performers for class I binding prediction, and reported variable accuracy across HLA types and peptide lengths and low agreement between predicted and experimental binding affinities. This hampers a complete characterization of tools in terms of accuracy, positive predictive value and coverage of HLA alleles. In this context, the generation of the optimal validation data set is of paramount importance: it should cover a wide range of HLA alleles, effectively capture the rules of antigen presentation (which is not possible using in vitro assay data) and be based on unseen data not used for the training of the
Data dimensionality
The high dimensionality of single-cell RNA sequencing data is due to the high number of genes measured (20,000–30,000 genes), although for many of those the expression in a certain cell would be zero due to dropouts. Due to the high dimensionality, cells become very similar and difficult to assign to different groups (for example, cell subpopulations). Dimensionality reduction techniques can ameliorate this issue, known as the curse of dimensionality, and decrease the computational time.

Data sparsity
A data set is sparse when it is mainly composed of zeros and the actual information is rare. In single-cell RNA sequencing data sets, data sparsity is mainly due to dropouts.

Dropouts
In single-cell RNA sequencing (scRNA-seq) data, when expressed genes result in null expression values due to the inefficiency of mRNA capture and/or to the stochasticity of mRNA expression. They are the main cause of data sparsity in scRNA-seq data sets.

Doublets
Pairs of cells that are captured and sequenced together in single-cell RNA sequencing experiments. As doublets have hybrid transcriptomes that might be falsely interpreted as intermediate cell phenotypes, they have to be identified and removed before running downstream analysis and data interpretation.

Unsupervised clustering
The objective of clustering is to find different groups within the elements in the data (usually [...]

blockers demonstrated that deconvolution methods can be used both to monitor the immunological effects of targeted agents and to reveal immune cell composition in response to immune checkpoint blockers91.

Benchmarking cell-type quantification methods is difficult due to the differences in the cell types and estimated scores/fractions. A recent comparative benchmarking revealed high accuracy in the quantification of CD8+ T cells across different approaches, but limited performance for heterogeneous cell types such as dendritic cells92. The Tumor Deconvolution Challenge, organized by the Dialogue on Reverse Engineering Assessment and Methods (DREAM) initiative, has the potential to reveal the top performers and provide guidelines for the selection of the best method based on the cell type of interest. Importantly, besides simple enumeration of cell types, novel methods including CIBERSORTx88 and linseed93 can reconstruct cell- and sample-specific transcriptional profiles and, thus, have the potential to elucidate the functional state of cell subpopulations in the TME.

In summary, the selection of the method depends on the questions to be addressed and the type of information expected to be gained. EPIC and quanTIseq are the preferred methods to obtain cell fractions that can be compared both within and between samples, whereas MCP-counter and xCell provide higher signature specificity and lower background noise, respectively92. Specific methods can be also selected considering the cell type of interest (for example, CIBERSORT, xCell and quanTIseq for M1/M2 macrophages; xCell, EPIC and TIMER for epithelial cells).

Cell phenotypes from single-cell data. Compared with bulk approaches, single-cell technologies can provide complementary insights into cancer immunity (Fig. 3b) and have been used to study the TME of different cancer types (for example, refs19,94–99). Notably, scRNA-seq techniques open new avenues to study rare or unknown immune cell types100, and can shed light on the transcriptional programmes that underlie the plasticity and functionality of the immune cells. For example, scRNA-seq from tumour-infiltrating CD8+ T cells can provide valuable information about their activation state and the level of exhaustion101. Besides the investigation of gene expression in single immune cell types, single-cell technologies can uncover the genetic and transcriptomic [...]

[...] computational strategies for scRNA-seq data are far less mature and standardized than those for bulk RNA-seq, well-implemented and documented frameworks tackling the main analytical steps are already available (Table 1) and include pipelines using R (for example, Seurat36,105, Scater106, SINCERA107 and Scran108), Python (for example, Scanpy109) or user-friendly graphical interfaces (for example, Granatum110 and ASAP111). The core pipeline usually consists of four main steps: first, quality control and removal of low-quality cell profiles (for example, stressed cells or doublets); second, selection of informative genes (for example, genes with highly variable expression among cells); third, normalization of expression profiles to allow cell comparison; and, fourth, annotation of cell types based on their transcriptional profiles (Fig. 3c). Seurat is currently the best developed and documented framework and allows single-cell, multi-omics data integration, data harmonization and cell-type identification36,105. For in-depth review of the computational tools, we refer readers elsewhere18,112–114.

Annotation of different cell types is a pivotal step in scRNA-seq data analysis. However, there is currently no consensus on how to systematically identify known and novel cell types (or cell states) based on their expression profiles. One common approach is to use unsupervised clustering to group cells with similar profiles and, assuming that each cluster represents one cell type or cell state, to identify the marker genes that are specific for each cluster. This approach has several limitations. First, clustering approaches may force the partitioning of the data into discrete clusters even when cells cover a continuum of states. Second, results strongly depend on the clustering strategy adopted (that is, computational method and parameter settings). Third, standard clustering methods might not identify small clusters or rare cells, and therefore dedicated approaches such as RaceID3 (ref.115) and GiniClust116 have to be used. Last, once cluster-specific marker genes are identified there is no standard strategy to assign cell identities to clusters. Most of the scRNA-seq studies published so far involve manual cell annotation based on marker genes and prior knowledge, an approach that is labour-intensive and has low reproducibility.

One alternative approach is to project scRNA-seq data onto reference expression profiles of previously annotated cell types. For example, the tool scmap117 maps
samples or cells in makeup of tumour cells, allowing the study of rare cells single-cell profiles onto single cells or clusters of a refer-
transcriptomic data sets), such as circulating tumour cells, cancer stem cells and ence data set. SingleR classifies single-cell transcriptomes
assigning to the same cluster cells committed to epithelial-to-mesenchymal transition, by comparing them with expression profiles of sorted
the elements that are more
similar to each other. This
as well as the detection of cell-specific genetic variants cell types using correlation analysis118. SingleR embeds
process is called unsupervised and estimation of tumour clonality and evolution102. reference data from human and mouse cell populations,
because the real groups are not However, care must be taken when using scRNA-seq including immune cells, but also accepts user-supplied
known a priori. By contrast, techniques for quantifying the cellular composition of references118. The recently developed Garnett tool uses
supervised clustering or
tumours due to the differences in single-cell dissocia- a hierarchy of cell subtypes and relative gene markers
classification is based on
pre-labelled groups of samples,
tion efficiency relative to immune cells, which can bias together with a reference scRNA-seq data set to build a
which are used to classify a new cell-type proportions99. classifier for annotating cells in external scRNA-seq data
sample considering its similarity Analysis of scRNA-seq data shares some analytical sets119. Other tools leverage the integration and harmo-
to the elements of each group. steps with bulk RNA-seq (for example, read mapping) nization of scRNA-seq data sets across studies36,105, or
Cell ontologies
but also poses additional challenges due to the pecu- the mapping of marker genes onto cell ontologies120. The
Structured vocabularies of cell liarities of these data: high data dimensionality, higher recently developed scMatch method is based on correla-
types. noise and absence of biological replicates per se, and tion analysis, but can either use reference transcriptomes
data sparsity due to gene dropouts103,104. Although the or cell ontologies to annotate cells121. When analysing
large scRNA-seq data sets, a combination of manual and automated cell annotation should be used, whereas for small data sets manual annotation can be sufficient114. However, we recommend caution when prior knowledge is integrated, specifically for classifying marker-negative cells as they might be affected by dropouts.

Independent evaluation of the computational tools for scRNA-seq data analysis has not yet been carried out, so the use of consensus approaches using diverse methods to validate the robustness of the results is recommended. In the near future, the availability of carefully designed, gold-standard data sets such as those recently generated using both droplet- and plate-based scRNA-seq122 will finally enable method benchmarking and definition of guidelines and best practices for data analysis.

Lymphocyte receptor repertoires. Interrogation of cancer immunity also requires the search for common clonotypes involved in the response to tumour antigens in order to identify shared BCR and TCR sequences. The specificity of B and T cell responses (that is, which antigens they recognize) depends on the repertoire of receptors they are equipped with. NGS has become a powerful tool to interrogate BCRs and TCRs, and different computational tools now provide simplified access to the analysis of BCR and TCR diversity123. Recently, the analysis of immune repertoires from sequencing data has seen two major advancements: the development of dedicated computational tools for the extraction of immune repertoires from bulk-tumour RNA-seq data, and the possibility to determine pairs of protein chains of individual TCRs and BCRs from single cells (Fig. 3d; Table 1).

Three tools have been recently developed for the analysis of TCR and BCR repertoires from bulk RNA-seq data. Originally developed for targeted sequencing of BCR and TCR repertoires124, MiXCR has been recently adapted to analyse bulk-tumour RNA-seq data with high accuracy and precision125. The TRUST algorithm, initially developed for TCR analysis of bulk-tumour RNA-seq data126, can now also extract BCR repertoires127. As this approach can produce incomplete CDR3 sequences mapping to different clonotypes128, data post-processing is advisable to decrease the number of false positives. Finally, V’DJer is a tool specifically designed to extract BCR repertoires from bulk RNA-seq data (as precomputed files of mapped reads), which can then be quantified in downstream analysis129. Although promising for its applicability to short-read data, the requirement of data pre- and post-processing might restrict the usage of V’DJer to bioinformaticians.

TCRs and BCRs consist of pairs of protein chains that, collectively, determine their antigen specificity. In bulk data sets, the pairing of the two chains is lost and cannot be tracked back by computational means. Single-cell approaches not only retain this information but further allow the joint analysis of transcriptomes and immune repertoires to link the latter to the cell state and functional orientation. Despite their still limited standardization and the lack of unbiased benchmarking, these approaches enable analyses that are inaccessible to bulk approaches. Several computational methods have been developed to extract TCRs (for example, TraCeR 130, TRAPeS 131 and scTCRseq 132) and BCRs (for example, BASIC 133, BraCeR 134 and BALDR135) or both (for example, VDJPuzzle136,137) from full-transcript scRNA-seq data (Table 1). The relevance of this approach for immuno-oncology is increasingly being appreciated and demonstrated in numerous studies. For example, Zhang et al.138 performed full-transcript scRNA-seq from T cells isolated from tumour, normal mucosa and blood of 12 patients with colorectal cancer. Paired analysis of single-cell transcriptomes and TCRs revealed tumour exclusivity of the TCRs of exhausted CD8+ T cells, and association of this subtype with effector but not central memory CD8+ T cells, implicating a TCR-based fate decision of tumour-infiltrating memory CD8+ T cells. In another study using a 10X Genomics platform, which enables the direct and simultaneous characterization of cell phenotypes and immune receptors, Azizi et al.98 profiled T cells isolated from eight patients with breast cancer. Through integrative analysis of expression and TCR diversity it was shown that T cell phenotypes and activation states are shaped by a combination of antigenic TCR stimulation and environmental stimuli.

Spatial cellular phenotyping. The quantification of the immune contexture requires images from tissue slides in order to obtain cellular phenotypes and their spatial distribution. Individual cells are first detected by thresholding and segmenting the raw images, and then their individual phenotypes are identified and classified by detecting signals from the specific markers in the corresponding cellular compartment (such as nucleus, cytoplasm or cytoplasmic membrane) used in staining procedures. Besides commercial software packages such as inForm (Perkin Elmer), Halo (Indica Labs) or StrataQuest (TissueGnostics GmbH), a growing number of open-source and free software tools, including ImageJ139, CellProfiler140 and Ilastik141, are available for this purpose (Table 1). By combining and extending their core functionalities via plug-ins, macros or scripting, custom analysis pipelines have been created and adapted to fit the different multiplex imaging methods (for example, refs 28,32,33,91). As an alternative to the use of fully developed imaging software packages, image analysis pipelines are often implemented using the image-processing routines and libraries from MATLAB (imaging toolbox) or Python (scikit-image and opencv)32,34. This is the case for novel multiplex imaging techniques including IMC, MIBI-TOF, MERFISH, CODEX, seqFISH, Spatial Transcriptomics or Slide-seq, which require specialized pre-processing, image restoration and post-processing tools (Table 1).

These primary analyses of raw images typically result in data sets that provide information about each individual detected cell, including spatial coordinates, expressed markers, staining intensities of the expressed markers, compartments and metastructures (that is, tumour or stroma). Different software packages that are either commercial (for example, TIBCO Spotfire and Phenomap) or freely available (CellProfiler Analyst142 and histoCAT143) implement methods (for example, t-distributed

Clonotypes
Populations of T cells that carry identical T cell receptors.

CDR3 sequences
Complementarity-determining region 3 (CDR3) is the region of the variable chain in B cell receptors and T cell receptors that binds to the cognate antigen, thus accounting for most of the variation of immune repertoires.
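The paired-chain analysis described in this section can be illustrated with a minimal sketch. It assumes that a single-cell tool has already produced, for each cell, a tissue of origin, the CDR3 sequences of both TCR chains and a phenotype label. All records, CDR3 strings and function names below are hypothetical and serve only to show how clonotypes can be grouped, checked for tumour exclusivity and summarized with a diversity index (Shannon entropy is used here as one common choice, not as the measure used in the cited studies).

```python
from collections import defaultdict
from math import log

# Hypothetical per-cell records: (cell id, tissue, TRA CDR3, TRB CDR3, phenotype).
# The CDR3 strings are invented for illustration only.
cells = [
    ("c1", "tumour", "CAVRDSNYQLIW", "CASSLGQGYEQYF", "exhausted"),
    ("c2", "tumour", "CAVRDSNYQLIW", "CASSLGQGYEQYF", "exhausted"),
    ("c3", "tumour", "CAVRDSNYQLIW", "CASSLGQGYEQYF", "effector"),
    ("c4", "blood", "CALSGGSYIPTF", "CASSPTGNTEAFF", "effector"),
    ("c5", "tumour", "CALSGGSYIPTF", "CASSPTGNTEAFF", "effector"),
    ("c6", "blood", "CAASNTGKLIF", "CASSQDRGNEQFF", "naive"),
]


def clonotype_table(records):
    """Group cells that share the same paired (alpha, beta) CDR3 sequences."""
    table = defaultdict(list)
    for cell_id, tissue, tra, trb, phenotype in records:
        table[(tra, trb)].append((cell_id, tissue, phenotype))
    return dict(table)


def tumour_exclusive(table):
    """Return clonotypes whose cells were all sampled from tumour tissue."""
    return [
        clone
        for clone, members in table.items()
        if all(tissue == "tumour" for _, tissue, _ in members)
    ]


def shannon_diversity(table):
    """Shannon entropy (natural log) of the clonotype size distribution."""
    sizes = [len(members) for members in table.values()]
    total = sum(sizes)
    return -sum((n / total) * log(n / total) for n in sizes)


table = clonotype_table(cells)
exclusive = tumour_exclusive(table)
diversity = shannon_diversity(table)
```

Because the chain pairing is kept per cell, each clonotype can be linked directly to the phenotypes of its member cells, which is the information that is lost in bulk data.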
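[Figure 4 image not reproduced. Recoverable labels: panel a axes Genes, Cells, Dimension 1 and Dimension 2; panels b and c show CD8+ T cells (axis t-SNE 2) labelled Naive (CD8+LEF1+), Effector (CD8+CX3CR1+), Exhausted (CD8+LAYN+) and Cytotoxic.]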
Fig. 4 | Single-cell analysis and visualization of tumour-related T cells. a | Analytical steps for visualization of single-cell
data: starting with an expression matrix indicating normalized expression values for each gene (rows) and single cell
(columns), pairwise similarities between the expression profiles of cells are calculated (for example, using Euclidean distance)
and can be represented as a similarity matrix. As many cells are studied, a more simplified representation can be achieved
by (non-linear) dimensionality reduction and the projections of the most informative components are commonly
visualized in a 2D plot, thereby allowing grouping (clustering) of cells with similar expression profiles. Graph-based
approaches are used to infer linear and branched pseudotime trajectories along which the cells can be ordered.
b | Example t-distributed stochastic neighbour embedding (t-SNE) plot and clustering of single tumour-infiltrating T cells
in cancer. Functionally related marker genes can be assigned to clusters. Cell clusters and marker-gene expression can
shed light on novel, uncharacterized immune cell subtypes or subpopulations of cells with specific functional changes
within the tumour microenvironment, which may be prognostic or predictive for immunotherapy (that is, exhausted
versus cytotoxic CD8+ T cells). Indicated are clusters representing naive CD8+ T cells, exhausted CD8+ T cells and
effector CD8+ T cells, characterized by expression of the indicated marker genes. c | Beyond the useful insights from
clustering analysis in part b, the reconstruction of continuous (branched) cell transitions is only possible through the
computational analysis of pseudotime trajectories. An example pseudo-temporal ordering of CD8+ T cells is shown.
The branched trajectories of CD8+ T cells according to pseudo-temporal reconstruction underscore the functional
orientation of CD8+ T cells and their continuous transitions (from naive to cytotoxic cells, and from non-exhausted to
exhausted). Colour codes are according to clusters in the t-SNE plot and respective marker genes defining those types.
Data plotted in panels b and c are from single-cell analysis of non-small-cell lung cancer samples from 14 patients205.
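The graph-based ordering summarized in this caption can be sketched under simple assumptions. The example takes hypothetical two-dimensional coordinates for five cells (standing in for the output of a dimensionality reduction step), connects them with a minimum spanning tree built on Euclidean distances and defines pseudotime as the distance along the tree from a chosen root cell. Cell names, coordinates and function names are illustrative; published tools use more elaborate models than this sketch.

```python
from math import dist  # available from Python 3.8

# Hypothetical low-dimensional coordinates, one point per cell.
cells = {
    "naive_1": (0.0, 0.0),
    "naive_2": (1.0, 0.2),
    "effector_1": (2.0, 0.1),
    "cytotoxic_1": (3.0, 1.0),
    "exhausted_1": (3.0, -1.0),
}


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean-distance graph.

    Returns an adjacency dict: cell -> list of (neighbour, edge length).
    """
    names = list(points)
    in_tree = {names[0]}
    adjacency = {name: [] for name in names}
    while len(in_tree) < len(names):
        # Shortest edge that links a cell in the tree to a cell outside it.
        length, a, b = min(
            (dist(points[a], points[b]), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))
        in_tree.add(b)
    return adjacency


def pseudotime(adjacency, root):
    """Distance along the tree from the root cell to every other cell."""
    times = {root: 0.0}
    stack = [root]
    while stack:
        node = stack.pop()
        for neighbour, length in adjacency[node]:
            if neighbour not in times:
                times[neighbour] = times[node] + length
                stack.append(neighbour)
    return times


tree = minimum_spanning_tree(cells)
times = pseudotime(tree, root="naive_1")
```

In this toy data set the tree branches at `effector_1`, so the cytotoxic and exhausted cells lie on separate branches and both receive later pseudotime values than the naive cells. The choice of root cell is an assumption the analyst has to supply.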
Pseudotime
Single-cell RNA sequencing (scRNA-seq) can capture different cell types and, when the throughput is sufficient, cell transitions from one functional state to another. Algorithms for pseudotime ordering can extract from scRNA-seq data the transcriptional profiles underlying dynamic changes of cells moving throughout subsequent states, thereby reconstructing their overall trajectories in time. This estimated time reference is referred to as pseudotime.

Non-parametric methods including t-SNE have limitations such as loss of large-scale information and intercluster relationships, but these limitations can be circumvented by interpretable dimensionality reduction153, as recently demonstrated by clustering immune cell types in the TME using scRNA-seq data from patients with melanoma19. Similar reproducibility and preservation of global distances can be achieved with uniform manifold approximation and projection (UMAP)154. Notably, recent studies indicate that in some cases the state of a cell represents a continuum rather than being assigned to several discrete states, which ensures the plasticity of the immune system to respond to pathogens or to neoantigens released by the tumour155. The continuous nature of cell states can be represented by pseudo-temporal trajectories (Fig. 4). In this approach it is assumed that cells with similar expression profiles arise from the same lineage, and that cells with more similar expression profiles are more closely related156. Once the data have been analysed (Fig. 4a) and cells have been projected into a low-dimensional space (Fig. 4b), a minimum spanning tree can be used to build a backbone for cell state transitions, for example from naive to cytotoxic CD8+ T cells (Fig. 4c). This 1D ordering is referred to as pseudotime.

One of the first developed tools for this pseudotime alignment is Monocle157. Another pioneering method to robustly reconstruct lineage branching and to measure transitions between cell states is diffusion pseudotime (DPT) analysis158. This method uses a non-linear
approach for recovering the low-dimensional structure underlying high-dimensional observations159. As of today, various tools based on different methods have been developed, reviewed156,160 and benchmarked161. One of the top scoring methods with respect to the analysis of the complexity of the trajectories and overall performance was partition-based graph abstraction (PAGA). PAGA generates graph-like maps of cells that preserve both continuous and disconnected structure in data at multiple resolutions162. A very recent and promising addition to the visualization toolbox is a method based on Markov processes to characterize cell fate probabilities (Palantir)163. Finally, an interesting approach is implemented in the tool velocyto, which uses exonic and intronic reads from scRNA-seq data to model the abundances of pre-mRNAs and mature mRNAs to predict gene expression changes over time (that is, RNA velocity). This information is then used to predict future cell states and to display cell kinetics in the form of a vector field overlaid onto a dimensionality-reduced representation of the cell populations164.

The longitudinal single-cell analysis of samples is an exemplary application of the usefulness of such visualization tools165. Using t-SNE clustering, the major monocyte/macrophage subpopulations that comprise the intratumoural myeloid compartment could be identified, as well as their remodelling upon treatment with immune checkpoint blockers. However, there were limited insights into the origins of the cells that populate the individual clusters, and only computational analyses of the pseudotime-organized sequence of differentiation/activation events with Monocle2 (ref. 166) revealed that neither CX3CR1+ macrophages nor iNOS+ macrophages are present in a tumour-induced early state and that there is a branching point in the fate of intratumoural myeloid cells.

In comparison with scRNA-seq data, visualization of CyTOF data is more advanced as many methods are further developments of visualization methods for conventional flow cytometry data. Numerous cytometry data visualization and clustering tools have been developed, such as viSNE146, PhenoGraph167, SPADE168, X-shift169, ACCENSE170, FlowSOM171 and Citrus172, and these tools have been comprehensively reviewed173 and summarized in a web resource (Bridging Bench, Biology, and Bioinformatics in the Field of Mass Cytometry). An interesting approach for visualization of single-cell CyTOF data is scaffold maps, which are based on force-directed graphs and have been used to reveal immune organization in different tissues174.

In general, the choice of a specific visualization tool for scRNA-seq or CyTOF data depends on the functionality, the programming preferences (for example, R or Python), the size of the data sets to be analysed and the computational requirements, and should be made in the context of the problem addressed.

Emerging methods and future trends
Within the past few years numerous methods and tools for interrogating cancer immunity have matured, and we do not expect substantial improvements for some of them in the near future. Examples of such mature tools are those for HLA typing and the tools for predicting class I HLA binding affinity from NGS data (at least for the common alleles). In other areas, little progress has been achieved for various reasons. Accurate prediction of class II HLA binding affinity is still challenging for both biological and technical reasons. First, the length of the binding peptides is variable (between 13 and 25 amino acids) and the peptide-flanking regions on either side of the binding core affect peptide–HLA binding. Additionally, there is a scarcity of both positive and negative training data sets. Thus, rather than optimizing algorithms to improve their performance by a few per cent (for example, prediction of class I HLA binding affinity), efforts should be directed towards generating training data and developing methods for applications that are advancing slowly.

Another extremely challenging area is the prediction of immunogenicity of neoantigens, that is, which neoantigens will induce a T cell response (Box 1). Understanding the TCR recognition rules for peptide–HLA complexes (pHLA) would tremendously help for designing cancer vaccines and enabling T cell engineering for solid cancers. Recent studies demonstrated that TCR sequences can be assigned to an antigen specificity by sequence analysis alone175,176. Although in these studies only few epitopes from common viruses were used, the findings suggest that the development of a generalizable model of TCR–pHLA recognition might be possible, which would be an important step towards designing TCR sequences with neoantigen specificity and, hence, rationally engineering T cell immunity against tumours.

The computational tools reviewed here are used to analyse single molecular entities such as RNA expression or protein expression. Emerging NGS technologies enable simultaneous measurements of different molecular entities such as scRNA-seq coupled with cell-surface protein expression, as in the two related methods of cellular indexing of transcriptomes and epitopes (CITE-seq)177 and RNA expression and protein sequencing (REAP-seq)178. Similarly, other methods combine spatial and molecular data like Spatial Transcriptomics42 and simultaneous detection of proteins and transcripts using IMC179 (see also Box 3). These and other upcoming assays will require innovative computational methods and tools for integrative analyses of heterogeneous data in both bulk-tissue and single-cell settings. Specifically, integrating information across different modalities associated with single-cell data sets such as transcriptomic, epigenomic, proteomic and spatially resolved single-cell data will be necessary to gain deep biological understanding beyond listing of cell clusters180. Additionally, transferring information from one data set to another will be tremendously helpful for exploratory analysis and biological interpretation. Such original strategies using information integration and transfer learning have been recently developed36,181. For example, a novel approach has been used to transfer scRNA-seq annotations onto chromatin accessibility data (generated using single-cell assay for transposase accessible chromatin (scATAC-seq)), thereby revealing finer distinctions among the cell types36, which was not possible using scATAC-seq data alone.
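The idea of transferring annotations from one data set to another can be made concrete with a deliberately simple sketch. It is not the integration method of ref. 36; it swaps in a plain correlation-to-reference classifier, similar in spirit to the correlation-based annotation described earlier, and assumes that query and reference profiles are measured over the same ordered feature set. The reference values, the query cells, the `min_correlation` threshold and all function names are hypothetical.

```python
from math import sqrt

# Feature order shared by reference and query profiles (an assumption of this sketch).
features = ["CD8A", "LEF1", "CX3CR1", "LAYN", "CD14"]

# Hypothetical reference profiles of annotated cell types (arbitrary units).
reference = {
    "naive CD8+ T cell": [5.0, 6.0, 0.5, 0.2, 0.1],
    "effector CD8+ T cell": [5.5, 0.5, 6.0, 0.5, 0.1],
    "exhausted CD8+ T cell": [5.0, 0.3, 0.5, 6.5, 0.2],
    "monocyte": [0.1, 0.2, 1.0, 0.1, 7.0],
}

# Hypothetical unlabelled query cells.
query = {
    "q1": [4.0, 5.0, 0.0, 0.0, 0.0],  # LEF1-high profile
    "q2": [3.0, 0.0, 0.0, 5.0, 0.0],  # LAYN-high profile
    "q3": [1.0, 1.0, 1.0, 1.0, 1.0],  # flat profile, carries no signal
}


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def transfer_labels(query_cells, reference_profiles, min_correlation=0.5):
    """Assign each query cell the best-correlated reference label.

    Cells whose best correlation is below min_correlation stay unassigned,
    which avoids forcing a label onto uninformative profiles.
    """
    labels = {}
    for cell, profile in query_cells.items():
        scores = {
            label: pearson(profile, ref)
            for label, ref in reference_profiles.items()
        }
        best = max(scores, key=scores.get)
        labels[cell] = best if scores[best] >= min_correlation else "unassigned"
    return labels


labels = transfer_labels(query, reference)
```

The explicit "unassigned" outcome reflects the caution recommended above for marker-negative cells: a sparse or flat profile is left unlabelled instead of being forced into the nearest reference type.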
Box 3 | Multimodal interrogation of cancer immunity
Despite their great potential, it is likely that the information content derived from sequencing-based assays will have to be complemented with additional modalities in order to comprehensively dissect tumour–immune cell interactions and inform therapy for individual patients. For example, longitudinal monitoring and assessment of the immunotherapy response will probably be based on approaches combining radiological imaging, liquid biopsies and computational methods to infer changes in tumour composition and heterogeneity. A promising radiomics approach addressing this issue was recently presented200. Combining contrast-enhanced computed tomography images and RNA sequencing (RNA-seq) data enabled the development of a radiomic signature for tumour-infiltrating CD8+ T cells. Such an approach could pave the way for the non-invasive assessment of immune infiltration in tumours and, hence, enable longitudinal monitoring of the effects of immunotherapy.

Additionally, it will also be necessary to dissect tumour cell signalling for several reasons. First, dysfunctional signalling in tumours arises not only from gene mutations but also from epigenetic modifications and rewiring of signalling networks201. Second, cell signalling determines several processes including cell growth, cell–cell communications, nutrient responses, cell cycle and cell death. Last, as nearly all targeted drugs are directed against signalling molecules, combination immunotherapies with these drugs will also require analyses of the pharmacological signalling rewiring.

Signalling information from analyses of oncogenic signalling in tumours, signalling rewiring induced by drugs and the crosstalk with immune-related pathways has not been used so far in clinical decision-making due to the lack of sensitive and reproducible technologies for protein-based measurements. RNA-seq data as a surrogate for phosphoproteomic assays are suboptimal because the regulation of signalling pathways is predominantly at the post-transcriptional level. However, recent developments of (phospho)proteomics techniques202 that enable deep coverage and quantitative consistency and accuracy provide for the first time the possibility to comprehensively probe signalling pathways and networks. Notably, advances in organoid technologies203 enable the generation of personalized cancer models and thereby also the possibility to study signalling rewiring for individual patients.

Complementary to the data-driven modelling as described in the previous sections, mathematical mechanistic modelling and simulations hold promise to derive novel insights by providing quantitative predictions that can be experimentally validated182. For instance, pioneering attempts were directed towards investigating the dynamics of tumorigenesis during immunosurveillance183 or developing patient-specific models to simulate the effects of combination therapies184. We expect that in the near future the computational toolbox will be enriched with such modelling approaches, often integrated in an experimental–computational cycle.

Conclusions
Comprehensive and quantitative interrogation of cancer immunity requires the use of molecular and cellular tools, as well as sophisticated computational methods to analyse complex and large data sets. Given the maturity and robustness of NGS-based technologies and the availability of the associated computational tools reviewed here, we expect that enormous amounts of data will be generated in the upcoming years. This will pose considerable challenges to, first, make these data available and, second, extract information for immuno-oncology from the data sets. For example, there is currently no centralized database that hosts genomic/immunogenomic data from published clinical studies with immune checkpoint blockers and in many cases researchers have to request access from individual laboratories. Similarly, a centralized database that enables queries across scRNA-seq data sets would be extremely helpful. Both challenges could be solved, but they require community efforts to address and overcome ethical issues (for example, access to controlled data including human sequence information) and technical issues (for example, the size of scRNA-seq data sets). Thus, existing and future computational tools will be instrumental for the interrogation of cancer immunity in individual patients and will ultimately enable precision immuno-oncology.

Published online xx xx xxxx
1. Tang, J. et al. Trial watch: the clinical trial landscape for PD1/PDL1 immune checkpoint inhibitors. Nat. Rev. Drug Discov. 17, 854–855 (2018).
2. Charoentong, P. et al. Pan-cancer immunogenomic analyses reveal genotype–immunophenotype relationships and predictors of response to checkpoint blockade. Cell Rep. 18, 248–262 (2017).
3. Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830.e14 (2018).
4. Hackl, H., Charoentong, P., Finotello, F. & Trajanoski, Z. Computational genomics tools for dissecting tumour–immune cell interactions. Nat. Rev. Genet. 17, 441 (2016).
5. Chen, D. S. & Mellman, I. Oncology meets immunology: the cancer-immunity cycle. Immunity 39, 1–10 (2013).
6. Galluzzi, L., Chan, T. A., Kroemer, G., Wolchok, J. D. & López-Soto, A. The hallmarks of successful anticancer immunotherapy. Sci. Transl Med. 10, eaat7807 (2018).
7. Fridman, W. H., Pagès, F., Sautès-Fridman, C. & Galon, J. The immune contexture in human tumours: impact on clinical outcome. Nat. Rev. Cancer 12, 298–306 (2012).
8. Galluzzi, L., Buqué, A., Kepp, O., Zitvogel, L. & Kroemer, G. Immunological effects of conventional chemotherapy and targeted anticancer agents. Cancer Cell 28, 690–714 (2015).
9. Lee, C.-H., Yelensky, R., Jooss, K. & Chan, T. A. Update on tumor neoantigens and their utility: why it is good to be different. Trends Immunol. 39, 536–548 (2018).
10. Schumacher, T. N., Scheper, W. & Kvistborg, P. Cancer neoantigens. Annu. Rev. Immunol. 37, 173–200 (2018).
11. Havel, J. J., Chowell, D. & Chan, T. A. The evolving landscape of biomarkers for checkpoint inhibitor immunotherapy. Nat. Rev. Cancer 19, 133–150 (2019).
12. Fridman, W. H., Zitvogel, L., Sautès-Fridman, C. & Kroemer, G. The immune contexture in cancer prognosis and treatment. Nat. Rev. Clin. Oncol. 14, 717–734 (2017).
13. Galon, J. et al. Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science 313, 1960–1964 (2006).
14. Galon, J. & Bruni, D. Approaches to treat immune hot, altered and cold tumours with combination immunotherapies. Nat. Rev. Drug Discov. 18, 197–218 (2019).
15. Motzer, R. J. et al. Avelumab plus axitinib versus sunitinib for advanced renal-cell carcinoma. N. Engl. J. Med. 380, 1103–1115 (2019).
16. Panciera, T., Azzolin, L., Cordenonsi, M. & Piccolo, S. Mechanobiology of YAP and TAZ in physiology and disease. Nat. Rev. Mol. Cell Biol. 18, 758–770 (2017).
17. Knight, R. et al. Best practices for analysing microbiomes. Nat. Rev. Microbiol. 16, 410–422 (2018).
18. Giladi, A. & Amit, I. Single-cell genomics: a stepping stone for future immunology discoveries. Cell 172, 14–21 (2018).
19. Finotello, F. & Eduati, F. Multi-omics profiling of the tumor microenvironment: paving the way to precision immuno-oncology. Front. Oncol. 8, 430 (2018).
20. Kolodziejczyk, A. A., Kim, J. K., Svensson, V., Marioni, J. C. & Teichmann, S. A. The technology and biology of single-cell RNA sequencing. Mol. Cell 58, 610–620 (2015).
21. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
22. Ziegenhain, C. et al. Comparative analysis of single-cell RNA sequencing methods. Mol. Cell 65, 631–643.e4 (2017).
23. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
24. Bendall, S. C. & Nolan, G. P. From single cells to deep phenotypes in cancer. Nat. Biotechnol. 30, 639–647 (2012).
25. Mansfield, J. R., Hoyt, C. & Levenson, R. M. Visualization of microscopy-based spectral imaging data from multi-label tissue sections. Curr. Protoc. Mol. Biol. https://doi.org/10.1002/0471142727.mb1419s84 (2008).
26. Mansfield, J. R. Multispectral imaging: a review of its technical aspects and applications in anatomic pathology. Vet. Pathol. 51, 185–210 (2014).
27. Stack, E. C., Wang, C., Roman, K. A. & Hoyt, C. C. Multiplexed immunohistochemistry, imaging, and quantitation: a review, with an assessment of Tyramide signal amplification, multispectral imaging and multiplex analysis. Methods 70, 46–58 (2014).
28. Tsujikawa, T. et al. Quantitative multiplex immunohistochemistry reveals myeloid-inflamed tumor-immune complexity associated with poor prognosis. Cell Rep. 19, 203–217 (2017).
29. Lin, J.-R., Fallahi-Sichani, M. & Sorger, P. K. Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method. Nat. Commun. 6, 8390 (2015).
30. Gerdes, M. J. et al. Highly multiplexed single-cell analysis of formalin-fixed, paraffin-embedded cancer tissue. Proc. Natl Acad. Sci. USA 110, 11982–11987 (2013).
31. Schubert, W. et al. Analyzing proteome topology and function by automated multidimensional fluorescence microscopy. Nat. Biotechnol. 24, 1270–1278 (2006).
32. Giesen, C. et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Methods 11, 417–422 (2014).
33. Angelo, M. et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 20, 436–442 (2014).
34. Keren, L. et al. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell 174, 1373–1387.e19 (2018).
35. Goltsev, Y. et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell 174, 968–981.e15 (2018).
36. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
37. Regev, A. et al. The human cell atlas. eLife 6, e27041 (2017).
38. Lubeck, E., Coskun, A. F., Zhiyentayev, T., Ahmad, M. & Cai, L. Single-cell in situ RNA profiling by sequential hybridization. Nat. Methods 11, 360–361 (2014).
39. Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).
40. Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
41. Lee, J. H. et al. Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues. Nat. Protoc. 10, 442–458 (2015).
42. Ståhl, P. L. et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353, 78–82 (2016).
43. Rodriques, S. G. et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).
44. Ding, L., Wendl, M. C., McMichael, J. F. & Raphael, B. J. Expanding the computational toolbox for mining cancer genomes. Nat. Rev. Genet. 15, 556–570 (2014).
45. Xu, C. A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data. Comput. Struct. Biotechnol. J. 16, 15–24 (2018).
46. Szolek, A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 30, 3310–3316 (2014).
47. Shukla, S. A. et al. Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes. Nat. Biotechnol. 33, 1152–1158 (2015).
48. Boegel, S. et al. HLA typing from RNA-seq sequence reads. Genome Med. 4, 102 (2012).
49. Lee, H. & Kingsford, C. Kourami: graph-guided assembly for novel human leukocyte antigen allele discovery. Genome Biol. 19, 16 (2018).
50. Dilthey, A. T. et al. HLA*LA—HLA typing from linearly projected graph alignments. Bioinformatics https://doi.org/10.1093/bioinformatics/btz235 (2019).
51. Orenbuch, R. et al. arcasHLA: high resolution HLA typing from RNAseq. Bioinformatics https://doi.org/10.1093/bioinformatics/btz474 (2019).
52. Xie, C. et al. Fast and accurate HLA typing from short-read next-generation sequence data with xHLA. Proc. Natl Acad. Sci. USA 114, 8059–8064 (2017).
53. Kawaguchi, S., Higasa, K., Shimizu, M., Yamada, R. & Matsuda, F. HLA-HD: an accurate HLA typing algorithm for next-generation sequencing data. Hum. Mutat. 38, 788–797 (2017).
54. Buchkovich, M. L. et al. HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data. Genome Med. 9, 86 (2017).
55. Nielsen, M. et al. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci. 12, 1007–1017 (2003).
56. Jurtz, V. et al. NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368 (2017).
This study and Nielsen et al. (2003) describe the original and the latest version of the popular tool NetMHCpan that predicts the binding affinity of peptides to class I MHC molecules and provides high-accuracy predictions for both well-annotated and novel alleles.
57. Han, Y. & Kim, D. Deep convolutional neural networks for pan-specific peptide–MHC class I binding prediction. BMC Bioinformatics 18, 585 (2017).
58. Liu, Z. et al. DeepSeqPan, a novel deep convolutional neural network model for pan-specific class I HLA–peptide binding affinity prediction. Sci. Rep. 9, 794 (2019).
59. Liu, G. et al. PSSMHCpan: a novel PSSM-based software for predicting class I peptide–HLA binding affinity. Gigascience 6, 1–11 (2017).
60. Hundal, J. et al. pVACtools: a computational toolkit to identify and visualize cancer neoantigens. Preprint at bioRxiv https://doi.org/10.1101/501817 (2019).
61. Hundal, J. et al. pVAC-Seq: a genome-guided in silico approach to identifying tumor neoantigens. Genome Med. 8, 11 (2016).
This study proposes a method for neoantigen vaccine design based on the prediction of peptide–MHC binding affinity and other features linked to neoantigen immunogenicity. Hundal et al. (2019) presents a suite for neoantigen predictions based on different machine-learning methods.
62. Gfeller, D. & Bassani-Sternberg, M. Predicting antigen presentation—what could we learn from a million peptides? Front. Immunol. 9, 1716 (2018).
63. O’Donnell, T. J. et al. MHCflurry: open-source class I MHC binding affinity prediction. Cell Syst. 7, 129–132.e4 (2018).
64. Boehm, K. M., Bhinder, B., Raja, V. J., Dephoure, N. & Elemento, O. Predicting peptide presentation by major histocompatibility complex class I: an improved machine learning approach to the immunopeptidome. BMC Bioinformatics 20, 7 (2019).
65. Bassani-Sternberg, M. et al. Deciphering HLA-I motifs across HLA peptidomes improves neo-antigen predictions and identifies allostery regulating HLA specificity. PLOS Comput. Biol. 13, e1005725 (2017).
66. Gfeller, D. et al. The length distribution and multiple specificity of naturally presented HLA-I ligands. J. Immunol. 201, 3705–3716 (2018).
67. Bulik-Sullivan, B. et al. Deep learning using tumor HLA peptide mass spectrometry datasets improves neoantigen identification. Nat. Biotechnol. 37, 55–63 (2019).
68. Andreatta, M. & Nielsen, M. Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics 32, 511–517 (2016).
69. Vita, R. et al. The Immune Epitope Database (IEDB) 3.0. Nucleic Acids Res. 43, D405–D412 (2015).
70. Vizcaíno, J. A. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44, 11033 (2016).
71. Shao, W. et al. The SysteMHC Atlas project. Nucleic Acids Res. 46, D1237–D1247 (2017).
72. Yang, W. et al. Immunogenic neoantigens derived from gene fusions stimulate T cell responses. Nat. Med. 25, 767–775 (2019).
73. Jensen, K. K. et al. Improved methods for predicting peptide binding affinity to MHC class II molecules. Immunology 154, 394–406 (2018).
74. Zhao, W. & Sher, X. Systematically benchmarking peptide–MHC binding predictors: from synthetic to naturally processed epitopes. PLOS Comput. Biol. 14, e1006457 (2018).
75. Racle, J., Michaux, J., Rockinger, G. A. & Arnaud, M. Deep motif deconvolution of HLA-II peptidomes for robust class II epitope predictions. Preprint at bioRxiv https://doi.org/10.1101/539338 (2019).
76. Bonsack, M. et al. Performance evaluation of MHC class-I binding prediction tools based on an experimentally validated MHC–peptide binding data set. Cancer Immunol. Res. 7, 719–736 (2019).
77. [No authors listed] The problem with neoantigen prediction. Nat. Biotechnol. 35, 97 (2017).
78. Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222–226 (2017).
79. Ott, P. A. et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217–221 (2017).
80. Camidge, D. R., Doebele, R. C. & Kerr, K. M. Comparing and contrasting predictive biomarkers for immunotherapy and targeted therapy of NSCLC. Nat. Rev. Clin. Oncol. 16, 341–355 (2019).
81. Jiang, P. et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat. Med. 24, 1550–1558 (2018).
82. Auslander, N. et al. Robust prediction of response to immune checkpoint blockade therapy in metastatic melanoma. Nat. Med. 24, 1545–1549 (2018).
83. Aran, D., Hu, Z. & Butte, A. J. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 18, 220 (2017).
84. Tappeiner, E. et al. TIminer: NGS data mining pipeline for cancer immunology and immunotherapy. Bioinformatics 33, 3140–3141 (2017).
85. Becht, E., Giraldo, N. A. & Lacroix, L. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 17, 218 (2016).
86. Finotello, F. & Trajanoski, Z. Quantifying tumor-infiltrating immune cells from transcriptomics data. Cancer Immunol. Immunother. 67, 1031–1040 (2018).
87. Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457 (2015).
This study presents pioneering work on a computational method (CIBERSORT) for building immune cell-specific signatures, cell-type deconvolution from bulk transcriptomics data and extraction of transcriptional profiles.
88. Newman, A. M. et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 37, 773–782 (2019).
89. Li, B. et al. Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biol. 17, 174 (2016).
90. Racle, J., de Jonge, K., Baumgaertner, P., Speiser, D. E. & Gfeller, D. Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. eLife 6, e26476 (2017).
91. Finotello, F. et al. Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data. Genome Med. 11, 34 (2019).
92. Sturm, G. et al. Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology. Bioinformatics 35, i436–i445 (2019).
93. Zaitsev, K., Bambouskova, M., Swain, A. & Artyomov, M. N. Complete deconvolution of cellular mixtures based on linearity of transcriptional signatures. Nat. Commun. 10, 2209 (2019).
94. Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).
95. Puram, S. V. et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell 171, 1611–1624.e24 (2017).
96. Li, H. et al. Reference component analysis of single-cell transcriptomes elucidates cellular heterogeneity in human colorectal tumors. Nat. Genet. 49, 708–718 (2017).
97. Lavin, Y. et al. Innate immune landscape in early lung adenocarcinoma by paired single-cell analyses. Cell 169, 750–765.e17 (2017).
98. Azizi, E. et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. Cell 174, 1293–1308.e36 (2018).
99. Lambrechts, D. et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Nat. Med. 24, 1277–1289 (2018).
100. Villani, A.-C. et al. Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors. Science 356, eaah4573 (2017).
101. Zheng, C. et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell 169, 1342–1356.e16 (2017).
102. Navin, N. E. The first five years of single-cell cancer genomics and beyond. Genome Res. 25, 1499–1507 (2015).
103. Stegle, O., Teichmann, S. A. & Marioni, J. C. Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet. 16, 133–145 (2015).
104. Yuan, G.-C. et al. Challenges and emerging directions in single-cell analysis. Genome Biol. 18, 84 (2017).
105. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411 (2018).
This article presents the R toolkit Seurat 3.0 for the analysis and integration of multimodal single-cell data.
106. McCarthy, D. J., Campbell, K. R., Lun, A. T. L. & Wills, Q. F. Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics 33, 1179–1186 (2017).
107. Guo, M., Wang, H., Potter, S. S., Whitsett, J. A. & Xu, Y. SINCERA: a pipeline for single-cell RNA-seq profiling analysis. PLOS Comput. Biol. 11, e1004575 (2015).
108. Lun, A. T. L., McCarthy, D. J. & Marioni, J. C. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 5, 2122 (2016).
109. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
110. Zhu, X. et al. Granatum: a graphical single-cell RNA-seq analysis pipeline for genomics scientists. Genome Med. 9, 108 (2017).
111. Gardeux, V., David, F. P. A., Shajkofci, A., Schwalie, P. C. & Deplancke, B. ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data. Bioinformatics 33, 3123–3125 (2017).
112. Singer, M. & Anderson, A. C. Revolutionizing cancer immunology: the power of next-generation sequencing technologies. Cancer Immunol. Res. 7, 168–173 (2019).
113. Kiselev, V. Y., Andrews, T. S. & Hemberg, M. Challenges in unsupervised clustering of single-cell RNA-seq data. Nat. Rev. Genet. 20, 273–282 (2019).
114. Luecken, M. D. & Theis, F. J. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol. Syst. Biol. 15, e8746 (2019).
This study presents best-practice recommendations covering the different steps of scRNA-seq analysis, also documented in a bioinformatics workflow.
115. Sagar, Herman, J. S. & Grün, D. FateID infers cell fate bias in multipotent progenitors from single-cell RNA-seq data. Nat. Methods 15, 379–386 (2018).
116. Jiang, L., Chen, H., Pinello, L. & Yuan, G.-C. GiniClust: detecting rare cell types from single-cell gene expression data with Gini index. Genome Biol. 17, 144 (2016).
117. Kiselev, V. Y., Yiu, A. & Hemberg, M. scmap: projection of single-cell RNA-seq data across data sets. Nat. Methods 15, 359–362 (2018).
118. Aran, D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172 (2019).
119. Pliner, H. A., Shendure, J. & Trapnell, C. Supervised classification enables rapid annotation of cell atlases. Preprint at bioRxiv https://doi.org/10.1101/538652 (2019).
120. Aevermann, B. D. et al. Cell type discovery using single-cell transcriptomics: implications for ontological representation. Hum. Mol. Genet. 27, R40–R47 (2018).
121. Hou, R., Denisenko, E. & Forrest, A. R. R. scMatch: a single-cell gene expression profile annotation tool using reference datasets. Bioinformatics https://doi.org/10.1093/bioinformatics/btz292 (2019).
122. Tian, L. et al. Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments. Nat. Methods 16, 479–487 (2019).
123. Heather, J. M., Ismail, M., Oakes, T. & Chain, B. High-throughput sequencing of the T-cell receptor repertoire: pitfalls and opportunities. Brief. Bioinform. 19, 554–565 (2018).
124. Bolotin, D. A. et al. MiXCR: software for comprehensive adaptive immunity profiling. Nat. Methods 12, 380–381 (2015).
125. Bolotin, D. A. et al. Antigen receptor repertoire profiling from RNA-seq data. Nat. Biotechnol. 35, 908–911 (2017).
126. Li, B. et al. Landscape of tumor-infiltrating T cell repertoire of human cancers. Nat. Genet. 48, 725–732 (2016).
127. Hu, X. et al. Landscape of B cell immunity and related immune evasion in human cancers. Nat. Genet. 51, 560–567 (2019).
128. Bolotin, D. A., Poslavsky, S., Davydov, A. N. & Chudakov, D. M. Reply to ‘Evaluation of immune repertoire inference methods from RNA-seq data’. Nat. Biotechnol. 36, 1035–1036 (2018).
129. Mose, L. E. et al. Assembly-based inference of B-cell receptor repertoires from short read RNA sequencing data with V’DJer. Bioinformatics 32, 3729–3734 (2016).
130. Stubbington, M. J. T. et al. T cell fate and clonality inference from single-cell transcriptomes. Nat. Methods 13, 329–332 (2016).
This study presents TraCeR, a computational method for reconstruction of paired TCR chains and inference of clonality and clonotype networks from full-transcript scRNA-seq data.
131. Afik, S. et al. Targeted reconstruction of T cell receptor sequence from single cell RNA-seq links CDR3 length to T cell differentiation state. Nucleic Acids Res. 45, e148 (2017).
132. Redmond, D., Poran, A. & Elemento, O. Single-cell TCRseq: paired recovery of entire T-cell α and β chain transcripts in T-cell receptors from single-cell RNAseq. Genome Med. 8, 80 (2016).
133. Canzar, S., Neu, K. E., Tang, Q., Wilson, P. C. & Khan, A. A. BASIC: BCR assembly from single cells. Bioinformatics 33, 425–427 (2017).
134. Lindeman, I. et al. BraCeR: B-cell-receptor reconstruction and clonality inference from single-cell RNA-seq. Nat. Methods 15, 563–565 (2018).
135. Upadhyay, A. A. et al. BALDR: a computational pipeline for paired heavy and light chain immunoglobulin reconstruction in single-cell RNA-seq data. Genome Med. 10, 20 (2018).
136. Eltahla, A. A. et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol. Cell Biol. 94, 604–611 (2016).
137. Rizzetto, S. et al. B-cell receptor reconstruction from single-cell RNA-seq with VDJPuzzle. Bioinformatics 34, 2846–2847 (2018).
138. Zhang, L. et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature 564, 268–272 (2018).
139. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).
140. Carpenter, A. E. et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 7, R100 (2006).
This original publication describes a free, flexible, user-friendly and continuously maintained software package for developing image analysis and phenotyping pipelines.
141. Sommer, C., Straehle, C., Kothe, U. & Hamprecht, F. A. in 2011 IEEE Int. Symp. on Biomed. Imaging: From Nano to Macro https://doi.org/10.1109/isbi.2011.5872394 (IEEE, 2011).
142. Dao, D. et al. CellProfiler Analyst: interactive data exploration, analysis and classification of large biological image sets. Bioinformatics 32, 3210–3212 (2016).
143. Schapiro, D. et al. histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat. Methods 14, 873–876 (2017).
144. Van Valen, D. A. et al. Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments. PLOS Comput. Biol. 12, e1005177 (2016).
145. Maaten, L. van der & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
This study is a pioneering work for visualizing high-dimensional data using non-linear transformation in two dimensions (t-SNE).
146. Amir, E.-A. D. et al. viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia. Nat. Biotechnol. 31, 545–552 (2013).
147. Van Der Maaten, L. Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 15, 3221–3245 (2014).
148. Linderman, G. C., Rachh, M., Hoskins, J. G., Steinerberger, S. & Kluger, Y. Efficient algorithms for t-distributed stochastic neighborhood embedding. Preprint at arXiv https://arxiv.org/abs/1712.09005 (2017).
149. van Unen, V. et al. Visual analysis of mass cytometry data by hierarchical stochastic neighbour embedding reveals rare cell types. Nat. Commun. 8, 1740 (2017).
150. Wattenberg, M., Viégas, F. & Johnson, I. How to use t-SNE effectively. Distill https://doi.org/10.23915/distill.00002 (2016).
151. Wang, B., Zhu, J., Pierson, E., Ramazzotti, D. & Batzoglou, S. Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat. Methods 14, 414–416 (2017).
152. Pierson, E. & Yau, C. ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol. 16, 241 (2015).
153. Ding, J., Condon, A. & Shah, S. P. Interpretable dimensionality reduction of single cell transcriptome data with deep generative models. Nat. Commun. 9, 2002 (2018).
154. Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. https://doi.org/10.1038/nbt.4314 (2018).
155. Villani, A.-C., Sarkizova, S. & Hacohen, N. Systems immunology: learning the rules of the immune system. Annu. Rev. Immunol. 36, 813–842 (2018).
156. Kester, L. & van Oudenaarden, A. Single-cell transcriptomics meets lineage tracing. Cell Stem Cell 23, 166–179 (2018).
157. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).
158. Haghverdi, L., Büttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
159. Coifman, R. R. et al. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc. Natl Acad. Sci. USA 102, 7426–7431 (2005).
160. Cannoodt, R., Saelens, W. & Saeys, Y. Computational methods for trajectory inference from single-cell transcriptomics. Eur. J. Immunol. 46, 2496–2506 (2016).
161. Saelens, W., Cannoodt, R., Todorov, H. & Saeys, Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37, 547–554 (2019).
This study presents a comprehensive benchmark of many computational tools for single-cell pseudotime trajectory inference.
162. Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59 (2019).
163. Setty, M. et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol. https://doi.org/10.1038/s41587-019-0068-4 (2019).
164. La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
165. Gubin, M. M. et al. High-dimensional analysis delineates myeloid and lymphoid compartment remodeling during successful immune-checkpoint cancer therapy. Cell 175, 1014–1030.e19 (2018).
166. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
167. Levine, J. H. et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197 (2015).
168. Qiu, P. et al. Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nat. Biotechnol. 29, 886–891 (2011).
169. Samusik, N., Good, Z., Spitzer, M. H., Davis, K. L. & Nolan, G. P. Automated mapping of phenotype space with single-cell data. Nat. Methods 13, 493–496 (2016).
170. Shekhar, K., Brodin, P., Davis, M. M. & Chakraborty, A. K. Automatic classification of cellular expression by nonlinear stochastic embedding (ACCENSE). Proc. Natl Acad. Sci. USA 111, 202–207 (2014).
171. Van Gassen, S. et al. FlowSOM: using self-organizing maps for visualization and interpretation of cytometry data. Cytometry A 87, 636–645 (2015).
172. Bruggner, R. V., Bodenmiller, B., Dill, D. L., Tibshirani, R. J. & Nolan, G. P. Automated identification of stratifying signatures in cellular subpopulations. Proc. Natl Acad. Sci. USA 111, E2770–E2777 (2014).
173. Olsen, L. R., Leipold, M. D., Pedersen, C. B. & Maecker, H. T. The anatomy of single cell mass cytometry data. Cytometry A 95, 156–172 (2019).
174. Spitzer, M. H. et al. IMMUNOLOGY. An interactive reference framework for modeling a dynamic immune system. Science 349, 1259425 (2015).
175. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
176. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).
177. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
178. Peterson, V. M. et al. Multiplexed quantification of proteins and transcripts in single cells. Nat. Biotechnol. 35, 936–939 (2017).
179. Schulz, D. et al. Simultaneous multiplexed imaging of mRNA and proteins with subcellular resolution in breast cancer tissue samples by mass cytometry. Cell Syst. 6, 531 (2018).
180. Stuart, T. & Satija, R. Integrative single-cell analysis. Nat. Rev. Genet. 20, 257–272 (2019).
181. Stein-O’Brien, G. L. et al. Decomposing cell identity for transfer learning across cellular measurements, platforms, tissues, and species. Cell Syst. 8, 395–411.e8 (2019).
182. Altrock, P. M., Liu, L. L. & Michor, F. The mathematics of cancer: integrating quantitative models. Nat. Rev. Cancer 15, 730–745 (2015).
183. Iwami, S., Haeno, H. & Michor, F. A race between tumor immunoescape and genome maintenance selects for optimum levels of (epi)genetic instability. PLOS Comput. Biol. 8, e1002370 (2012).
184. Kather, J. N. et al. High-throughput screening of combinatorial immunotherapies with patient-specific in silico models of metastatic colorectal cancer. Cancer Res. 78, 5155–5163 (2018).
185. Saini, S. K., Rekers, N. & Hadrup, S. R. Novel tools to assist neoepitope targeting in personalized cancer immunotherapy. Ann. Oncol. 28, xii3–xii10 (2017).
186. Jørgensen, K. W., Rasmussen, M. & Buus, S. NetMHCstab—predicting stability of peptide–MHC-I complexes; impacts for cytotoxic T lymphocyte epitope discovery. Immunology 141, 18–26 (2014).
187. Rasmussen, M. et al. Pan-specific prediction of peptide–MHC class I complex stability, a correlate of T cell immunogenicity. J. Immunol. 197, 1517–1524 (2016).
188. Shugay, M. et al. VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res. 46, D419–D427 (2018).
189. Blank, C. U., Haanen, J. B., Ribas, A. & Schumacher, T. N. The ‘cancer immunogram’. Science 352, 658–660 (2016).
190. Łuksza, M. et al. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature 551, 517–520 (2017).
191. Balkwill, F. Cancer and the chemokine network. Nat. Rev. Cancer 4, 540–550 (2004).
192. Rieckmann, J. C. et al. Social network architecture of human immune cells unveiled by quantitative proteomics. Nat. Immunol. 18, 583–593 (2017).
193. Kveler, K. et al. Immune-centric network of cytokines and cells in disease context identified by computational mining of PubMed. Nat. Biotechnol. 36, 651–659 (2018).
194. Vento-Tormo, R. et al. Single-cell reconstruction of the early maternal–fetal interface in humans. Nature 563, 347–353 (2018).
195. Orchard, S. et al. Protein interaction data curation: the International Molecular Exchange (IMEx) consortium. Nat. Methods 9, 345–350 (2012).
196. Thurley, K., Gerecht, D., Friedmann, E. & Höfer, T. Three-dimensional gradients of cytokine signaling between T cells. PLOS Comput. Biol. 11, e1004206 (2015).
197. Altan-Bonnet, G. & Mukherjee, R. Cytokine-mediated communication: a quantitative appraisal of immune complexity. Nat. Rev. Immunol. https://doi.org/10.1038/s41577-019-0131-x (2019).
198. Choi, H. et al. Transcriptome analysis of individual stromal cell populations identifies stroma–tumor crosstalk in mouse lung cancer model. Cell Rep. 10, 1187–1201 (2015).
199. Yeung, T.-L. et al. Systematic identification of druggable epithelial–stromal crosstalk signaling networks in ovarian cancer. J. Natl Cancer Inst. 111, 272–282 (2019).
200. Sun, R. et al. A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: an imaging biomarker, retrospective multicohort study. Lancet Oncol. 19, 1180–1191 (2018).
201. Yaffe, M. B. Why geneticists stole cancer research even though cancer is primarily a signaling disease. Sci. Signal. 12, eaaw3483 (2019).
202. Aebersold, R. & Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 537, 347–355 (2016).
203. Drost, J. & Clevers, H. Organoids in cancer research. Nat. Rev. Cancer 18, 407–418 (2018).
204. Kobayashi, H. et al. Cancer-associated fibroblasts in gastrointestinal cancer. Nat. Rev. Gastroenterol. Hepatol. 16, 282–295 (2019).
205. Guo, X. et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat. Med. 24, 978–985 (2018).
206. Sato, K., Tsuyuzaki, K., Shimizu, K. & Nikaido, I. CellFishing.jl: an ultrafast and scalable cell search method for single-cell RNA sequencing. Genome Biol. 20, 31 (2019).
207. Srivastava, D., Iyer, A., Kumar, V. & Sengupta, D. CellAtlasSearch: a scalable search engine for single cells. Nucleic Acids Res. 46, W141–W147 (2018).
208. Navarro, J. F., Sjöstrand, J., Salmén, F., Lundeberg, J. & Ståhl, P. L. ST Pipeline: an automated pipeline for spatial mapping of unique transcripts. Bioinformatics 33, 2591–2593 (2017).
This study presents a comprehensive analysis pipeline and software tools for spatial transcriptomics.
209. Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).

Acknowledgements
The authors thank S. Boegel for fruitful discussions on state-of-the-art computational methods. This work was supported by the European Research Council (grant agreement No. 786295 to Z.T.), the Austrian Cancer Aid/Tyrol (project No. 17003 to F.F.), the Austrian Science Fund (FWF) (project No. T 974-B30 to F.F. and projects I3291 and I3978 to Z.T.) and the Vienna Science and Technology Fund (Project LS16–025 to Z.T.). Z.T. is a member of the German Research Foundation (DFG) project TRR 241(INF).

Author contributions
All authors contributed to all aspects of the article.

Competing interests
The authors declare no competing interests.

Peer review information
Nature Reviews Genetics thanks T. Chan, A. Gentles and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links
Bridging Bench, Biology, and Bioinformatics in the Field of Mass Cytometry: http://cytof.biosurf.org
Tumor Deconvolution Challenge: https://www.synapse.org/#!Synapse:syn15589870/wiki/582446