
Clinical Metagenomics: Microbial Genomics

This document discusses the potential for clinical metagenomic next-generation sequencing (mNGS) to transform infectious disease diagnosis and treatment. mNGS can analyze all microbial and host genetic material in a sample at once, enabling detection of all potential pathogens and analysis of antimicrobial resistance and host responses. While promising, implementing mNGS in clinical settings faces challenges, including distinguishing colonizers from pathogens and a lack of reference standards.


MICROBIAL GENOMICS

Clinical metagenomics
Charles Y. Chiu1,2* and Steven A. Miller1
Abstract | Clinical metagenomic next-generation sequencing (mNGS), the comprehensive
analysis of microbial and host genetic material (DNA and RNA) in samples from patients,
is rapidly moving from research to clinical laboratories. This emerging approach is changing
how physicians diagnose and treat infectious disease, with applications spanning a wide range
of areas, including antimicrobial resistance, the microbiome, human host gene expression
(transcriptomics) and oncology. Here, we focus on the challenges of implementing mNGS in
the clinical laboratory and address potential solutions for maximizing its impact on patient
care and public health.

1 Department of Laboratory Medicine, University of California, San Francisco, CA, USA.
2 Department of Medicine, Division of Infectious Diseases, University of California, San Francisco, CA, USA.
*e-mail: charles.chiu@ucsf.edu
https://doi.org/10.1038/s41576-019-0113-7

The field of clinical microbiology comprises both diagnostic microbiology, the identification of pathogens from clinical samples to guide management and treatment strategies for patients with infection, and public health microbiology, the surveillance and monitoring of infectious disease outbreaks in the community. Traditional diagnostic techniques in the microbiology laboratory include growth and isolation of microorganisms in culture, detection of pathogen-specific antibodies (serology) or antigens and molecular identification of microbial nucleic acids (DNA or RNA), most commonly via PCR. While most molecular assays target only a limited number of pathogens using specific primers or probes, metagenomic approaches characterize all DNA or RNA present in a sample, enabling analysis of the entire microbiome as well as the human host genome or transcriptome in patient samples. Metagenomic approaches have been applied for decades to characterize various niches, ranging from marine environments1 to toxic soils2 to arthropod disease vectors3,4 to the human microbiome5,6. These tools have also been used to identify infections in ancient remains7, discover novel viral pathogens8 and characterize the human virome in both healthy and diseased states9–11 and for forensic applications12.

The capacity to detect all potential pathogens (bacteria, viruses, fungi and parasites) in a sample and simultaneously interrogate host responses has great potential utility in the diagnosis of infectious disease. Metagenomics for clinical applications derives its roots from the use of microarrays in the early 2000s13,14. Some early successes using this technology include the discovery of the SARS coronavirus15, gene profiling of mutations in cancer16 and in-depth microbiome analysis of different sites in the human body17. However, it was the advent of next-generation sequencing (NGS) technologies in 2005 that jump-started the metagenomics field18. For the first time, millions to billions of reads could be generated in a single run, permitting analysis of the entire genetic content of a clinical or environmental sample. The proliferation of available sequencing instruments and exponential decreases in sequencing costs over the ensuing decade drove the rapid adoption of NGS technology.

To date, several studies have provided a glimpse into the promise of NGS in clinical and public health settings. For example, NGS was used for the clinical diagnosis of neuroleptospirosis in a 14-year-old critically ill boy with meningoencephalitis19; this case was the first to demonstrate the utility of metagenomic NGS (mNGS) in providing clinically actionable information, as successful diagnosis prompted appropriate targeted antibiotic treatment and eventual recovery of the patient. Examples in public health microbiology include the use of NGS, in combination with transmission network analysis20, to investigate outbreaks of the Escherichia coli strain O104:H4 (ref.21) and for surveillance of antimicrobial resistance in the food supply by bacterial whole-genome sequencing22. Increasingly, big data provided by mNGS is being leveraged for clinical purposes, including characterization of antibiotic resistance directly from clinical samples23 and analysis of human host response (transcriptomic) data to predict causes of infection and evaluate disease risk24,25. Thus, mNGS can be a key driver for precision diagnosis of infectious diseases, advancing precision medicine efforts to personalize patient care in this field.

Despite the potential and recent successes of metagenomics, clinical diagnostic applications have lagged behind research advances owing to a number of factors. A complex interplay of microbial and host factors influences human health, as exemplified by the role of the microbiome in modulating host immune responses26, and it is often unclear whether a detected microorganism is a contaminant, colonizer or bona fide pathogen. Additionally, universal reference standards

Microbiome: The entirety of organisms that colonize individual sites in the human body.
Microarrays: Commonly referred to as ‘chips’, these platforms consist of spots of DNA fragments, antibodies or proteins printed onto surfaces, enabling massive multiplexing of hundreds to thousands of targets.
Reads: In DNA sequencing, reads are inferred sequences of base pairs corresponding to part or all of a single DNA fragment.
Metagenomic NGS (mNGS): A shotgun sequencing approach in which all genomic content (DNA and/or RNA) of a clinical or environmental sample is sequenced.

Nature Reviews | Genetics | volume 20 | June 2019 | 341


REVIEWS

[Figure 1. Panels: A, Infectious disease diagnostics (Aa, microorganism identification, with Acinetobacter baumannii and Ebola virus Zaire shown as examples; Ab, antibiotic resistance prediction, showing resistance genes, mobility elements and resistance regulatory elements; Ac, detection of virulence determinants, showing endotoxin, exotoxin and cell wall; Ad, antiviral resistance prediction for HIV-1, with a coverage map of coverage and pairwise identity (%) against genomic position (kb) and a Z-score bar plot for NRTI, NNRTI and PI drugs classed as sensitive, intermediate or resistant). B, Microbiome analyses (healthy individual, patient, harvest, probiotic development, synthetic stool). C, Transcriptomics (Ca, host metric with threshold, infection versus no infection; Cb, differentially expressed genes A to L across patients). D, Oncology applications (Da and Db, Merkel cell polyomavirus infection and transformation, full-length and truncated LT, host DNA, mutations 1 and 2).]


Fig. 1 | Clinical applications of metagenomic sequencing. A | Applications in infectious disease diagnostics include direct identification of microorganisms from primary clinical samples (part Aa); antimicrobial resistance prediction by characterization of resistance genes (part Ab); detection of species-level or strain-level virulence determinants, such as secretion of specific endotoxins or exotoxins (part Ac); and antiviral resistance prediction (part Ad). As shown for HIV-1, recovery of the complete viral genome from a patient sample by metagenomic next-generation sequencing (mNGS) (part Ad, graph) facilitates sequence analysis to predict susceptibility or resistance to antiretroviral drugs (part Ad, bar plot); the susceptibility profile for the analysed strain (black bars) predicts resistance to the non-nucleoside reverse transcriptase inhibitor (NNRTI) class of drugs (denoted by an asterisk), as opposed to nucleoside reverse transcriptase inhibitors (NRTIs) or protease inhibitors (PIs). B | Microbiome analyses can inform disease prognosis in acute and chronic disease states and underlie the development of probiotic therapies. Coloured bars represent individual microbiota species. A reduction in species diversity is seen in dysbiosis (an unhealthy state), such as present in patients with Clostridium difficile-associated disease. Stool from healthy individuals can be harvested to treat patients with C. difficile infection by faecal stool transplantation or as orally administered encapsulated faecal pills. Alternatively, synthetic stool generated from microbiota species observed in healthy individuals can be used as probiotics to treat patients. In addition to C. difficile infection, chronic diseases such as obesity, inflammatory bowel disease and diabetes mellitus are potential targets for probiotic therapy. C | RNA-sequencing-based transcriptomics can improve the diagnosis of infectious and non-infectious conditions on the basis of the human host response. Host transcriptomic profiling by NGS can enable the construction of a classifier metric to discriminate between patients with infection (red bars) from uninfected patients (blue bars) with high accuracy (part Ca). Metric scores above the dotted line indicate infection, whereas scores below the dotted line indicate absence of infection; the overall accuracy of the classifier metric shown is 83%. Cluster heat map analysis identifies individual, differentially expressed host genes associated with infection (genes A–F) versus those associated with no infection (genes G–L) (part Cb). D | Sequencing of viral tumours or liquid biopsy analyses in oncology can be used for simultaneous pathogen detection and characterization of host genetic mutations. mNGS can be used to detect Merkel cell polyomavirus, the virus associated with the development of Merkel cell carcinoma. Simultaneous sequencing of host DNA can identify mutations that arise from integration of the viral genome containing the full-length large T antigen (LT) followed by subsequent truncation of the LT antigen (part Da) or truncation of the LT antigen before viral genome integration (part Db). Both of these two mutations lead to cellular transformation that drives tumour proliferation. Although promising, many of these sequencing-based applications have yet to be incorporated into routine clinical practice. Part C is adapted from ref.25, CC BY-NC-ND 4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/). Part D is adapted from ref.134, CC BY 3.0 (https://creativecommons.org/licenses/by/3.0/).

and proven approaches to demonstrate test validation, reproducibility and quality assurance for clinical metagenomic assays are lacking. Considerations of cost, reimbursement, turnaround time, regulatory considerations and, perhaps most importantly, clinical utility also remain major hurdles for the routine implementation of clinical mNGS in patient care settings27.

We review here the various applications of mNGS currently being exploited in clinical and public health settings. We discuss the challenges involved in the adoption of mNGS in the clinical laboratory, including validation and regulatory considerations that extend beyond its initial development in research laboratories, and propose steps to overcome these challenges. Finally, we envisage future directions for the field of clinical metagenomics and anticipate what will be achievable in the next 5 years.

Applications of clinical metagenomics
To date, applications of clinical metagenomics have included infectious disease diagnostics for a variety of syndromes and sample types, microbiome analyses in both diseased and healthy states, characterization of the human host response to infection by transcriptomics and the identification of tumour-associated viruses and their genomic integration sites (Fig. 1; Table 1). Aside from infectious disease diagnostics, adoption of mNGS in clinical laboratories has been slow, and most applications have yet to be incorporated into routine clinical practice. Nonetheless, the breadth and potential clinical utility of these applications are likely to transform the field of diagnostic microbiology in the near future.

Infectious disease diagnosis
The traditional clinical paradigm for diagnosis of infectious disease in patients, applied for more than a century, involves a physician formulating a differential diagnosis and then ordering a series of tests (generally ‘one bug, one test’) in an attempt to identify the causative agent. The spectrum of conventional testing for pathogens in clinical samples ranges from the identification of microorganisms growing in culture (for example, by biochemical phenotype testing or matrix-assisted laser desorption/ionization (MALDI) time-of-flight mass spectrometry), the detection of organism-specific biomarkers (such as antigen testing by latex agglutination or antibody testing by enzyme-linked immunosorbent assay (ELISA)) or nucleic acid testing by PCR for single agents to multiplexed PCR testing using syndromic panels. These panels generally include the most common pathogens associated with a defined clinical syndrome, such as meningitis and encephalitis, acute respiratory infection, sepsis or diarrhoeal disease28–31.

Molecular diagnostic assays provide a fairly cost-effective and rapid (generally <2 hours of turnaround time) means to diagnose the most common infections. However, nearly all conventional microbiological tests in current use detect only one or a limited panel of pathogens at a time or require that a microorganism be successfully cultured from a clinical sample. By contrast, while NGS assays in current use cannot compare with conventional tests with respect to speed (the sequencing run alone on a standard Illumina instrument takes >18 hours), mNGS enables a broad range of pathogens (viruses, bacteria, fungi and/or parasites) to be identified from culture or directly from clinical samples on the basis of uniquely identifiable DNA and/or RNA sequences32. Another key advantage of NGS approaches is that the sequencing data can potentially be leveraged for additional analyses beyond the mere identification of a causative pathogen, such as microbiome characterization and parallel analyses of human host responses through transcriptome profiling by RNA sequencing (RNA-seq). Thus, the clinical utility of NGS in diagnosis may be in the most difficult-to-diagnose cases or for immunocompromised patients, in whom the spectrum of potential pathogens is greater. Eventually, mNGS may become cost competitive with multiplexed assays or used as an upfront ‘rule out’ assay to exclude infectious aetiologies. Of course, detection of nucleic acids, either by multiplex PCR panels or NGS, does not by itself prove that an identified microorganism is the cause of the illness, and findings have to be interpreted in the clinical context. In particular, discovery of an atypical or novel infectious agent in clinical samples should be

Transmission network analysis: The integration of epidemiological, laboratory and genomic data to track patterns of transmission and to infer origin and dates of infection during an outbreak.
Precision medicine: An approach to medical care by which disease treatment and prevention take into account genetic information obtained by genomic or molecular profiling of clinical samples.
Reference standards: In laboratory test development, well-characterized, standardized and validated reference materials or databases that enable measurement of performance characteristics of an assay, including sensitivity, specificity and accuracy.
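
The glossary entry on reference standards names sensitivity, specificity and accuracy as the performance characteristics of an assay, and the Fig. 1 caption describes a host classifier metric that calls infection when a score lies above a threshold line. A minimal Python sketch of how such calls might be scored against reference labels is given here. The function name `evaluate_metric`, the threshold of 0 and the six example scores are illustrative assumptions, not values or methods taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Performance:
    sensitivity: float
    specificity: float
    accuracy: float


def evaluate_metric(scores, infected, threshold=0.0):
    """Call a sample 'infected' when its score exceeds the threshold,
    then compare the calls with reference-standard labels."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, infected):
        called = score > threshold
        if called and truth:
            tp += 1          # infected, correctly called
        elif called and not truth:
            fp += 1          # uninfected, wrongly called infected
        elif not called and truth:
            fn += 1          # infected, missed
        else:
            tn += 1          # uninfected, correctly called
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one infected and one uninfected sample")
    return Performance(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / (tp + fp + tn + fn),
    )


# Hypothetical metric scores for six patients (assumed, not from the article).
scores = [12.0, 7.5, -3.0, -8.0, 2.0, -1.0]
infected = [True, True, True, False, False, False]
perf = evaluate_metric(scores, infected)
print(perf)
```

With these assumed inputs, one infected patient is missed and one uninfected patient is called infected, so all three measures come out at two thirds. A validated assay would need far larger, well-characterized reference sets than this toy example.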


Table 1 | Clinical microbiology approaches using next-generation sequencing

Sequencing method | Clinical sample type | Potential clinical indications | Clinical test available? | Refs

Infectious disease diagnosis: targeted analyses
Amplicon sequencing (universal bacterial, fungal or parasitic rRNA sequencing) | Multiple body fluids and tissues | Multiplexed pathogen detection | Yes (a) | 39
Amplicon sequencing (multiplexed primer panels) | Multiple body fluids and tissues | Multiplexed pathogen detection | No | 135
Capture probe enrichment | Multiple body fluids and tissues | Viral genome recovery for infection control, epidemiology and public health | No | 43,44,46,47
Capture probe enrichment | Multiple body fluids and tissues | Multiplexed pathogen detection | No | 49–52
Capture probe enrichment | Multiple body fluids and tissues | Antibiotic resistance characterization | No | 23,136

Infectious disease diagnosis: untargeted analyses
Metagenomic sequencing | Blood (plasma) | Culture-negative sepsis, endocarditis, febrile neutropenia, fever of unknown origin or monitoring of immunocompromised patients | Yes (b) | 33,57
Metagenomic sequencing | Respiratory secretions | Culture-negative and/or PCR-negative pneumonia | Yes (c) | 25,37,58,137,138
Metagenomic sequencing | Cerebrospinal fluid | Undiagnosed meningitis, encephalitis or myelitis | Yes (d) | 36,37
Metagenomic sequencing | Stool | Severe diarrhoea | No | 139
Metagenomic sequencing | Infected tissue or other body fluid | Culture-negative infection | No | 118,140

Microbiome analyses
Metagenomic sequencing | Stool | Consumer-based microbiome testing (e) | No (e) | No reference
Metagenomic sequencing | Stool | Guiding management and treatment of Clostridium difficile infection | No | 141
Metagenomic sequencing | Stool | Chronic illnesses | No | 64
Metagenomic sequencing | Respiratory secretions | Aiding in diagnosis of acute respiratory infection | No | 137

Human host response analyses
RNA sequencing | Multiple sample types; whole blood or PBMC most common | Aiding diagnosis or characterization of infections such as bacterial sepsis or pneumonia; disease prognosis | No | 24,25,68

Oncological analyses
Whole-genome tumour sequencing | Tumour | Identification of viruses associated with cancer | No | 142
Liquid biopsy sequencing | Cell-free body fluids | Simultaneous cancer and infectious disease testing | No | 57,143

PBMC, peripheral blood mononuclear cell; rRNA, ribosomal RNA. (a) University of Washington39, Fry Laboratories. (b) Karius33. (c) IDbyDNA37. (d) University of California, San Francisco36. (e) uBiome; testing is not for diagnosis or treatment of disease.
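
The targeted methods in Table 1 (amplicon sequencing and capture probe enrichment) are intended to raise the number and proportion of pathogen reads in the sequence data. One way to put a number on that effect is to normalize pathogen read counts to reads per million (RPM) and compare a targeted library with an untargeted one. The Python sketch below does this; RPM normalization is a common convention rather than something the article prescribes, and the function names and read counts are hypothetical.

```python
def reads_per_million(pathogen_reads: int, total_reads: int) -> float:
    """Normalize a pathogen read count to reads per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if pathogen_reads < 0 or pathogen_reads > total_reads:
        raise ValueError("pathogen_reads must lie between 0 and total_reads")
    return pathogen_reads * 1_000_000 / total_reads


def fold_enrichment(rpm_targeted: float, rpm_untargeted: float) -> float:
    """Ratio of normalized pathogen signal after versus before enrichment."""
    if rpm_untargeted <= 0:
        raise ValueError("untargeted RPM must be positive to form a ratio")
    return rpm_targeted / rpm_untargeted


# Hypothetical libraries of 10 million reads each (assumed numbers).
untargeted_rpm = reads_per_million(pathogen_reads=50, total_reads=10_000_000)
targeted_rpm = reads_per_million(pathogen_reads=5_000, total_reads=10_000_000)

print(f"untargeted: {untargeted_rpm:.1f} RPM")
print(f"targeted:   {targeted_rpm:.1f} RPM")
print(f"fold enrichment: {fold_enrichment(targeted_rpm, untargeted_rpm):.0f}x")
```

The ratio only describes the targeted organisms; a targeted library says nothing about pathogens outside the primer or probe set, which is the breadth limitation the article attributes to these methods.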

Latex agglutination: A clinical laboratory test for detection of a specific antibody in which the corresponding antigen is adsorbed on spherical polystyrene latex particles that undergo agglutination in the presence of the antibody.
Seroconversion: The development of detectable antibodies in the blood that are directed against an infectious agent, such as HIV-1, after which the infectious disease can be detected by serological testing for the antibody.

followed up with confirmatory investigations such as orthogonal testing of tissue biopsy samples and demonstration of seroconversion or via the use of cell culture or animal models, as appropriate8, to ascertain its true pathogenic potential.

NGS of clinical samples as performed in either research or clinical laboratories involves a number of steps, including nucleic acid extraction, enrichment for DNA and/or RNA, library preparation, PCR amplification (if needed), sequencing and bioinformatics analysis (Fig. 2). Any body fluid or tissue yielding sufficient nucleic acid is amenable to NGS analysis, which can either be targeted, that is, enriching individual genes or genomic regions, or untargeted, as is the case for metagenomic ‘shotgun’ approaches (Fig. 2). The details for the specific steps vary by laboratory and are described extensively elsewhere33–37.

Targeted NGS analyses. Targeted approaches have the benefit of increasing the number and proportion of pathogen reads in the sequence data. This step can increase the detection sensitivity for microorganisms being targeted, although it limits the breadth of potential pathogens that can be identified. An example of a targeted approach is the use of highly conserved primers for universal PCR amplification and detection of all microorganisms corresponding to a specific type from clinical samples, such as 16S ribosomal RNA (rRNA) gene amplification for bacteria38,39 and 18S rRNA and internal transcribed spacer (ITS) gene amplification for


fungi40 (Fig. 2). Previously, such approaches were followed by Sanger sequencing of the resulting PCR amplicon to identify the pathogen and make a diagnosis; now, this step is commonly accomplished using NGS. Universal PCR for detection of bacteria and fungi has now been adopted in many hospital laboratories and has increased the number and proportion of infectious diagnoses39,41, although the technique is limited by the breadth of detection (that is, bacteria or fungi only or even a more limited range of targets, such as mycobacteria only, depending on the primer sets used) and by concerns regarding sensitivity42.

Another example of a targeted NGS approach is the design of primers tiled across the genome to facilitate PCR amplification and amplicon NGS for recovery of viral genomes directly from clinical samples43. This method has been used to track the evolution and spread of Zika virus (ZIKV) in the Americas44–46 and of Ebola virus in West Africa47, with some demonstrations of real-time monitoring having an impact on public health interventions.

Another targeted approach is capture probe enrichment, whereby metagenomic libraries are subjected to hybridization using capture ‘bait’ probes48. These probes are generally 30–120 bp in length, and the number of probes can vary from less than 50 to more than 2 million49–52. Although this enrichment method has been shown to increase the sensitivity of metagenomic detection in research settings, especially for viruses, it has yet to be used routinely for clinical diagnosis. A promising application of this approach may be the enrichment of clinical samples for characterization of antibiotic resistance23, a considerable problem in hospitals and the primary focus of the US National Action Plan for Combating Antibiotic-Resistant Bacteria53. However, drawbacks of capture probe enrichment, compared with untargeted approaches for infectious disease diagnosis, include a bias towards targeted microorganisms, added steps, increased costs and long hybridization times (24–48 hours) as a result of the additional processing needed for maximal efficiency.

Untargeted metagenomic NGS analyses. Untargeted shotgun mNGS analyses forego the use of specific primers or probes54. Instead, the entirety of the DNA and/or RNA (after reverse transcription to cDNA) is sequenced. With pure cultures of bacteria or fungi, mNGS reads can be assembled into partial or complete genomes. These genome sequences are then used for subtyping and/or monitoring hospital outbreaks in support of infection control and/or public health surveillance efforts. For example, a seminal study described the use of whole-genome sequencing of multidrug-resistant, carbapenemase-producing Klebsiella pneumoniae to track the origin and evolution of a hospital outbreak55. This study demonstrated for the first time the high-resolution mapping of likely transmission events in a hospital, some of which were unexpected on the basis of initial epidemiological data, and also identified putative resistance mutations in emerging resistant strains. The integration of genomic and epidemiological data yielded actionable insights that would have been useful for curbing transmission.

Untargeted mNGS of clinical samples is perhaps the most promising approach for the comprehensive diagnosis of infections. In principle, nearly all pathogens, including viruses, bacteria, fungi and parasites, can be identified in a single assay56. mNGS is a needle-in-a-haystack endeavour, as only a small proportion (typically <1%) of reads are non-human, of which only a subset may correspond to potential pathogens. A limitation of mNGS is that the sensitivity of the approach is critically dependent on the level of background. Tissues, for example, have increased human host background relative to cell-free body fluids, resulting in a reduced number and proportion of microbial reads and hence a decrease in mNGS sensitivity33,36,37. Moreover, defining specific microbial profiles that are diagnostic or predictive of disease development can be difficult, especially from nonsterile sites that harbour a complex microbiome, such as respiratory secretions or stool6. Nevertheless, several groups have successfully validated mNGS in Clinical Laboratory Improvement Amendments (CLIA)-certified clinical laboratories for the diagnosis of infections, including meningitis or encephalitis36,37, sepsis33,57 and pneumonia58, and these assays are now available for clinical reference testing of patients.

Clinical microbiome analyses
Many researchers now use mNGS instead of targeted sequencing of the 16S rRNA gene for in-depth characterization of the microbiome59. There is growing public awareness of the microbiome and its likely involvement in both acute and chronic disease states60. However, no microbiome-based tests have been clinically validated for the diagnosis or treatment of disease, in part owing to an incomplete understanding of the complexity of the microbiome and its role in disease pathogenesis.

One future clinical application of microbiome analysis may be in the management and treatment of Clostridium difficile-associated disease. C. difficile is an opportunistic bacterium that can infect the gut, resulting in the production of toxins that can cause diarrhoea, dehydration, sepsis and death. C. difficile infection occurs only in the setting of a microbiome that is altered by factors such as exposure to broad-spectrum antibiotics or recent gastrointestinal surgery61. The importance of the microbiome in C. difficile infection is underscored by the 80–90% effectiveness of faecal stool transplantation in treating and potentially curing the disease62,63. The use of mNGS to characterize the microbiome in multiple studies has facilitated the development of bacterial probiotic mixtures that can be administered as pills for prophylaxis or treatment of C. difficile-associated disease (Fig. 1B).

Another potential application of the microbiome is in the analysis of bacterial diversity, which can provide clues as to whether a patient’s illness is infectious or non-infectious. For example, a study of mNGS for the identification of respiratory pathogens in patients with pneumonia found that individuals with culture-proven infection had significantly less diversity in their respiratory microbiome25. Alterations of the microbiome, known as dysbiosis, have also been shown to be related

Library: In DNA sequencing, a collection of DNA fragments with known adapter sequences at one or both ends that is derived from a single clinical or environmental sample.
Sanger sequencing: A classical method of DNA sequencing based on selective incorporation of chain-terminating dideoxynucleotides developed by Frederick Sanger and colleagues in 1977; now largely supplanted by next-generation sequencing.
Subtyping: In microbiology, refers to the identification of a specific genetic variant or strain of a microorganism (for example, virus, bacterium or fungus), usually by sequencing all or part of the genome.
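
The text reports that patients with culture-proven infection had less diversity in their respiratory microbiome, without naming the measure used. As a hedged illustration, the Python sketch below computes the Shannon diversity index, H = -Σ p_i ln p_i, from per-taxon read counts. Choosing Shannon's index is an assumption made here for illustration, and the two read-count profiles are invented.

```python
import math


def shannon_diversity(taxon_read_counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero reads."""
    total = sum(taxon_read_counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    h = 0.0
    for count in taxon_read_counts:
        if count > 0:
            p = count / total      # relative abundance of this taxon
            h -= p * math.log(p)
    return h


# Hypothetical read counts per taxon (assumed, not data from the article).
even_community = [250, 250, 250, 250]   # four taxa, equally abundant
dominated_community = [970, 10, 10, 10] # one taxon dominates

print(f"even:      H = {shannon_diversity(even_community):.3f}")
print(f"dominated: H = {shannon_diversity(dominated_community):.3f}")
```

For four equally abundant taxa the index equals ln 4, its maximum for that number of taxa, and it falls as one taxon comes to dominate. Whether a low value points to infection would still have to be judged against validated thresholds and the clinical context.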


[Figure 2. Workflow diagram. Patient sample types: microbial colonies grown on agar, urine samples, stool sample, tissue biopsy, bodily fluid, nasal or skin swab. Left, amplicon sequencing: DNA extraction; universal PCR of 16S or 23S rRNA (bacteria) and of 18S, 28S or ITS (fungi and parasites), or multiplexed amplicon PCR; amplification of the target region; library preparation; sequencing of amplicons. Right, metagenomic sequencing: total nucleic acid extraction; reverse transcription of RNA to cDNA; library preparation of cDNA and DNA; untargeted mNGS, or targeted mNGS by capture probe enrichment using a biotinylated pathogen-specific RNA or DNA bait library and bead capture; sequencing of all nucleic acids. Outputs: pathogen identification or microbiome analyses (kingdom, genus, species, strain) and host transcriptome profiling (differentially expressed genes across samples).]


to obesity, diabetes mellitus and inflammatory bowel disease⁶⁴, and manipulation of the microbiome may be a pathway to treating these pathological conditions.

Fig. 2 | Targeted versus untargeted shotgun metagenomic next-generation sequencing approaches. A variety of patient samples, as well as cultured microbial colonies, can be analysed using targeted or untargeted metagenomic next-generation sequencing (mNGS) methods for pathogen identification, microbiome analyses and/or host transcriptome profiling. Universal PCR (left) is a targeted mNGS approach that uses primers designed from conserved regions such as the ribosomal RNA (rRNA) genes that are universally conserved among bacteria (16S or 23S rRNA) or fungi and parasites (18S rRNA, 28S rRNA or internal transcribed spacer (ITS)). Other sets of primers can be designed to target a defined set of pathogens and/or genes and used for multiplex reverse transcription PCR or PCR (multiplexed amplicon PCR). NGS library preparation and sequencing of the resultant amplicons enable pathogen identification down to the genus or species level. Metagenomic sequencing (right) entails unbiased shotgun sequencing of all microbial and host nucleic acids present in a clinical sample. Separate DNA and RNA libraries are constructed; the DNA library is used for identification of bacteria, fungi, DNA viruses and parasites, whereas the RNA library is used for identification of RNA viruses and RNA sequencing-based human host transcriptome profiling (heat map, bottom right). As no primers or probes are used in unbiased mNGS, the vast majority of reads corresponds to the human host and, thus, detection of pathogens from metagenomic libraries is a 'needle-in-a-haystack' endeavour. An optional capture probe enrichment step using magnetic beads enables targeted mNGS of pathogens and/or genes from metagenomic libraries. All these methods are compatible with sequencing on traditional benchtop instruments such as the Illumina HiSeq and portable nanopore sequencers such as the Oxford Nanopore Technologies MinION.

Human host response analyses
Clinical mNGS typically focuses on microbial reads; however, there is a complementary role for the analysis of gene expression in studying human host responses to infection⁶⁵ (Fig. 1c). mNGS of RNA libraries used for the detection of pathogens such as RNA viruses in clinical samples incidentally produces host gene expression data for transcriptome (RNA-seq) analyses⁶⁶. Although RNA-seq analyses are commonly performed on whole blood or peripheral blood mononuclear cell (PBMC) samples, any body fluid or tissue type is potentially amenable to these analyses. Classification of genes by expression profiling using RNA-seq has been used to characterize several infections, including staphylococcal bacteraemia⁶⁷, Lyme disease⁶⁸, candidiasis⁶⁹, tuberculosis (discriminating between latent and active disease risk)⁷⁰⁻⁷² and influenza⁷³⁻⁷⁵. Machine-learning-based analyses of RNA-seq data have been used for cancer classification⁷⁶, and translation of these approaches may be promising for infectious diseases. Panels containing a limited number of host biomarkers are being developed as diagnostic assays for influenza⁷⁷, tuberculosis⁷⁰ and bacterial sepsis⁷⁸.

Although no RNA-seq-based assay has been clinically validated to date for use in patients, the potential clinical impact of RNA-seq analyses is high. Interrogation of RNA reads from microorganisms corresponding to active microbial gene expression might enable the discrimination between infection versus colonization²⁵ and live (viable) versus dead organisms⁷⁹. Moreover, RNA-seq analyses of the human host can be used to identify novel or underappreciated host–microbial interactions directly from clinical samples, as previously shown for patients with Lyme disease⁶⁸, dengue⁸⁰ or malaria⁸¹. RNA-seq may be particularly useful in clinical cases in which the causative pathogen is only transiently present (such as early Lyme disease⁸² or arboviral infections, including West Nile virus⁸³ or ZIKV⁸⁴); analogous to serologic testing, indirect diagnosis of infections may be possible on the basis of a pathogen-specific human host response. Analysis of pathogen-specific host responses may also be useful in discriminating the bona fide causative pathogen or pathogens in a complex clinical metagenomic sample, such as a polymicrobial abscess or respiratory fluid²⁵. Yet another promising application of RNA-seq is in discriminating infectious versus non-infectious causes of acute illness²⁵. If an illness is judged more likely to be non-infectious (for example, an autoimmune disease) on the basis of the host response, for example, clinicians may be more willing to discontinue antibiotics and treat the patient aggressively with steroids and other immunosuppressive medications. As large-scale sequencing data continue to be generated, perhaps driven by routine clinical mNGS testing, secondary mining of human reads might improve the accuracy of clinical diagnoses by incorporating both microbial and host gene expression data.

Applications in oncology
In oncology, whole-genome or directed NGS approaches to identify mutated genes can be used to simultaneously uncover viruses associated with cancer (that is, herpesviruses, papillomaviruses and polyomaviruses) and/or to gather data on virus–host interactions⁸⁵. For example, mNGS was critical in the discovery of Merkel cell polyomavirus (Fig. 1d), now believed to be the cause of Merkel cell carcinoma, a rare skin cancer seen most commonly in elderly patients⁸⁶. To date, the US Food and Drug Administration (FDA) has approved the clinical use of two NGS panels testing for actionable genomic aberrations in tumour samples⁸⁷. Detection of reads corresponding to both integrated and exogenous viruses in these samples would be possible with the addition of specific viral probes to the panel or accomplished incidentally while sequencing the whole tumour genome or exome.

Additional knowledge of integrated or active viral infections in cancers and their involvement in signalling pathways may inform preventive and therapeutic interventions with targeted antiviral and/or chemotherapeutic drugs⁸⁸, as evidenced by the decreased risk of hepatitis C virus-associated hepatocellular carcinoma after treatment with direct-acting antiviral agents⁸⁹. In the future, mNGS of cell-free DNA from liquid biopsy samples (for example, plasma) might be leveraged for the simultaneous identification of early cancer and diagnosis of infection in immunocompromised patients (Box 1).

Liquid biopsy
The detection of molecular biomarkers from minimally invasive sampling of clinical body fluids, such as DNA sequences in blood, for the purpose of diagnosing disease.

Clinical implementation of metagenomic NGS
Implementation of mNGS in the clinical laboratory is a complex endeavour that requires customization of research protocols using a quality management approach consistent with regulatory standards⁹⁰. Library preparation reagents, sequencing instrumentation and bioinformatics tools are constantly changing in the research environment. However, in the clinical laboratory, assays need to be implemented following standardized (locked-down) protocols. Changes made to any component of the
assay need to be validated and shown to have acceptable performance before testing in patients. Periodic updates and repeat validation studies are performed as deemed necessary to incorporate interim technological advances in NGS reagents, protocols and instrumentation.

Metagenomic methods for pathogen detection present a particularly challenging scenario for clinical validation (Fig. 3), as it is not practical to test an essentially unlimited number of different organisms for the assay to be considered validated. Although the FDA has provided general guidelines for clinical validation of NGS infectious disease testing⁹¹, there are no definitive recommendations for the clinical implementation of mNGS testing, nor is there mention of specific requirements. However, a best-practice approach can be taken that includes failure-mode analysis and evaluations of performance characteristics using representative organisms with ongoing assay monitoring and independent confirmation of unexpected results.

Box 1 | Where is the signal: cellular or cell-free DNA?
Metagenomic sequencing for clinical diagnostic purposes typically uses a shotgun approach by sequencing all of the DNA and/or RNA in a clinical sample. Clinical samples can vary significantly in their cellularity, ranging from cell-free fluids (that is, plasma, bronchoalveolar lavage fluid or centrifuged cerebrospinal fluid) to tissues.
In the next-generation sequencing (NGS) field, there is great interest in the use of liquid biopsies from cell-free DNA (cfDNA) extracted from body fluids, such as plasma, to identify chromosomal or other genetic mutations and thus diagnose malignancies in the presymptomatic phase¹²³. Similarly, cfDNA analysis has been useful for non-invasive prenatal testing applications, such as for the identification of trisomy 21 (ref. 124).
One study has described the potential utility of cfDNA analysis in diagnosing invasive fungal infection in cases where biopsy is not possible⁵⁷. Another advantage to cfDNA analysis is the higher sensitivity of metagenomic sequencing owing to less cellular background from the human host. However, limitations of cfDNA analysis may include decreased sensitivity for detection of predominantly intracellular pathogens, such as human T cell lymphotropic virus, Rickettsia spp. and Pneumocystis jirovecii, and loss of the ability to interrogate cellular human host responses with RNA sequencing.

Sensitivity and enrichment or depletion methods
A key limitation of mNGS is its decreased sensitivity with high background, either predominantly from the human host (for example, in tissue biopsies) or the microbiome (for example, in stool). The background can be clinically relevant as the pathogen load in infections, such as Shigella flexneri in stool from patients with diarrhoea⁹² or ZIKV in plasma from patients with vector-borne febrile illness⁹³, can be very low (<10³ copies per ml).

Host depletion methods for RNA libraries have been developed and shown to be effective, including DNase I treatment after extraction to remove residual human background DNA⁹⁴; the use of RNA probes followed by RNase H treatment⁹⁵; antibodies against human and mitochondrial rRNA (the most abundant host RNA types in clinical samples)⁹⁶; and/or CRISPR–Cas9-based approaches, such as depletion of abundant sequences by hybridization⁹⁷.

Unfortunately, there are no comparably effective parallel methods for DNA libraries. Limited enrichment in the 3–5 times range can be achieved with the use of antibodies against methylated human host DNA⁹⁸, which enriches microbial reads owing to the lack of methylated DNA in most pathogen genomes. Differential lysis of human cells followed by degradation of background DNA with DNase I, which retains and enriches for nucleic acid from organisms with cell walls (including some bacteria and fungi), has been shown to provide substantial microbial enrichment of up to 1,000 times⁹⁴,⁹⁹,¹⁰⁰. However, the performance of differential lysis methods can be limited by a number of factors. These limitations include potential decreased sensitivity for microorganisms without cell walls, such as Mycoplasma spp. or parasites; a possible paradoxical increase in exogenous background contamination by use of additional reagents¹⁰¹; and the inability to detect free nucleic acid from dead organisms that are lysed in vivo by human host immune cells or antibiotic treatment. The importance of retaining the ability for cell-free DNA detection from culture-negative samples from dead organisms is also why incorporation of a propidium monoazide treatment step to select for DNA from live organisms may not be clinically useful as an enrichment method for mNGS¹⁰². In general, both the differential lysis and propidium monoazide approaches would also be cumbersome to implement in a highly reproducible fashion, which is needed for clinical laboratory validation.

To some extent, the human host background limitation may be overcome with brute force, made possible by the increasing capacities of available sequencers. For instance, an astrovirus was detected in a child with encephalitis by ultradeep sequencing of brain tissue, yielding only 1,612 reads out of ~134 million (0.0012%) sequences¹⁰³. Yet another approach to improve sensitivity is to leverage a hybrid method for enrichment, such as metagenomic sequencing with spiked primers⁴⁶. Combining targeted with untargeted sequencing, the method uses variably sized panels (100–10,000) of short primers that are added ('spiked') into reaction mixtures to enrich for specific target organisms while retaining the breadth of metagenomic sequencing for off-target organisms. When spiked at the reverse transcription step, a panel of ZIKV-specific primers was found to increase the number of ZIKV reads by more than tenfold without appreciably decreasing broad metagenomic sensitivity for other pathogens, enabling whole-genome viral sequencing to characterize ZIKV spread from Brazil into Central America and Mexico⁴⁶.

Laboratory workflow considerations
The complexity of mNGS analysis requires highly trained personnel and extreme care in sample handling to avoid errors and cross-contamination. Even minuscule amounts of exogenous DNA or RNA introduced during sample collection, aliquoting, nucleic acid extraction, library preparation or pooling can yield a detectable signal from contaminating reads. In addition, laboratory surfaces, consumables and reagents are not DNA free. A database of background microorganisms commonly detected in mNGS data and arising from normal flora or laboratory contamination¹⁰¹,¹⁰⁴ typically needs to be maintained for accurate mNGS analyses. Microorganisms on this list are either not reported or will require higher thresholds for reporting if they are clinically significant organisms.
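As a worked check on the brute-force sequencing example in the sensitivity discussion, the short Python sketch below recomputes the reported astrovirus read fraction (1,612 of ~134 million reads) and expresses it as reads per million. The function names and the 10-million-read comparison depth are illustrative assumptions, not part of the cited study, and the projection assumes the pathogen fraction stays constant across depths.

```python
def read_fraction(pathogen_reads: int, total_reads: int) -> float:
    """Fraction of all sequenced reads assigned to the pathogen."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return pathogen_reads / total_reads


def reads_per_million(pathogen_reads: int, total_reads: int) -> float:
    """Pathogen reads normalized to one million total reads."""
    return read_fraction(pathogen_reads, total_reads) * 1_000_000


def expected_reads(fraction: float, depth: int) -> float:
    """Expected pathogen reads at a given depth (assumes a constant fraction)."""
    return fraction * depth


# Figures quoted in the text; the total of 134 million is approximate.
ASTROVIRUS_READS = 1_612
TOTAL_READS = 134_000_000

fraction = read_fraction(ASTROVIRUS_READS, TOTAL_READS)
rpm = reads_per_million(ASTROVIRUS_READS, TOTAL_READS)

# Hypothetical shallower run of 10 million reads, for comparison only.
shallow = expected_reads(fraction, 10_000_000)

print(f"fraction of reads: {fraction:.4%}")
print(f"reads per million: {rpm:.2f}")
print(f"expected reads at 10 million depth: {shallow:.0f}")
```

Under these assumptions, a run of one-tenth the depth would be expected to recover roughly one-tenth of the pathogen reads, which is why very low-titre infections may fall below reporting thresholds unless depth or enrichment is increased.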


[Fig. 3 schematic: workflow steps and the challenges associated with each]

Patient with illness
• Access to mNGS testing
• Unclear clinical indications
• Upfront or second-line testing

Sample collection
• Human host background
• Sample stability and transport
• Contamination

Nucleic acid extraction
• Host depletion and microbial enrichment
• Standardized clinical laboratory protocols
• Universally accepted reference standards (positive and negative control materials)

Depicted nucleic acid sources: RNA viruses, bacteria, fungi, DNA viruses, parasites, human.

Library preparation
• Contamination
• Quality control metrics
• Workflow complexity (manual versus automated)

Sequencing
• Cost
• Turnaround times
• Sequence quality

Bioinformatic analysis
• Computational power
• Misaligned sequences
• Database misannotations and biases in representation of organisms
• User-friendly software
• Bioinformatics software validation

Reporting (patient chart)
• Patient privacy
• Data confidentiality
• EMR integration
• Regulatory approval
• Medical reimbursement

Diagnosis and treatment (clinical microbiology sequencing board)
• Clinical interpretation
• Clinical utility
• Available treatments
• Clinical indications

Fig. 3 | Challenges to routine deployment of metagenomic sequencing in the clinical setting. At each step in the process, multiple factors (bullet points) must be taken into account when implementing a clinical metagenomic pipeline for diagnosis of infections to maximize accuracy and clinical relevance. In particular, it is often useful to interpret and discuss the results of metagenomic next-generation sequencing (mNGS) testing in a clinical context as part of a clinical microbial sequencing board, akin to a tumour board in oncology. EMR, electronic medical record.

Clinical laboratory operations are characterized by a defined workflow with scheduled staffing levels and are less amenable to on-demand testing than those of research laboratories. As samples are typically handled in batches, the frequency of batch analysis is a major determinant of overall turnaround time. Unless fully automated sample-handling systems are readily available, wet lab manipulations for mNGS require considerable hands-on time to perform, as well as clinical staff who are highly trained in molecular biology procedures. There are ergonomic concerns with repetitive tasks such as pipetting, as well as potential for inadvertent sample mix-up or omission of critical steps in the workflow. Maintaining high quality during complex mNGS
procedures can be stressful to staff, as slight deviations in sample handling can lead to major changes in the results generated. Separating the assay workflow into multiple discrete steps to be performed by rotating shifts can be helpful to avoid laboratory errors.

Reference standards
Well-characterized reference standards and controls are needed to ensure mNGS assay quality and stability over time. Most available metagenomic reference materials are highly customized to specific applications (for example, ZymoBIOMICS Microbial Community Standard for microbiome analyses and bacterial and fungal metagenomics¹⁰⁵) and/or focused on a more limited spectrum of organisms (for example, the National Institute of Standards and Technology (NIST) reference materials for mixed microbial DNA detection, which contain only bacteria¹⁰⁶). Thus, these materials may not be applicable to untargeted mNGS analyses.

Custom mixtures consisting of a pool of microorganisms (mock microbial communities) or their nucleic acids can be developed as external controls to establish limits of detection for mNGS testing. Internal spike-in control standards are available for other NGS applications such as transcriptome analysis by RNA-seq, with External RNA Controls Consortium (ERCC) RNA standards composed of synthetic RNA oligonucleotides spanning a range of nucleotide lengths and concentrations¹⁰⁷. The complete set or a portion of the ERCC RNA standards (or their DNA equivalents) can be used as spike-in internal controls to control for assay inhibition and to quantify titres of detected pathogens by standard curve analysis¹⁰⁸. Nonetheless, the lack of universally accepted reference standards for mNGS makes it difficult to compare assay performances between different laboratories. There is a critical need for standardized reference organisms and genomic materials to facilitate such comparisons and to define optimal analysis methods.

Spike-in
In laboratory test development, refers to the use of a nucleic acid fragment or positive control microorganism that is added to a negative sample matrix (for example, plasma from blood donors) or clinical samples and that serves as an internal control for the assay.

Bioinformatics challenges
User-friendly bioinformatics software for analysis of mNGS data is not currently available. Thus, customized bioinformatics pipelines for analysis of clinical mNGS data⁵⁶,¹⁰⁹⁻¹¹¹ still require highly trained programming staff to develop, validate and maintain the pipeline for clinical use. The laboratory can either host computational servers locally or move the bioinformatics analysis and data storage to cloud platforms. In either case, hardware and software setups can be complex, and adequate measures must be in place to protect confidential patient sequence data and information, especially in the cloud environment. Storage requirements for sequencing data can quickly become quite large, and the clinical laboratory must decide on the quantity, location and duration of data storage.

Bioinformatics pipelines for mNGS analysis use a number of different algorithms, usually developed for the research setting and constantly updated by software developers. As for wet lab procedures, it is usually necessary to make custom modifications to the pipeline software and then lock down both the software and reference databases for the purposes of clinical validation¹¹². A typical bioinformatics pipeline consists of a series of analysis steps from raw input FASTQ files including quality and low-complexity filtering, adaptor trimming, human host subtraction, microorganism identification by alignment to reference databases, optional sequence assembly and taxonomic classification of individual reads and/or contiguous sequences (contigs) at levels such as family, genus and species (Fig. 4). Each step in the pipeline must be carefully assessed for accuracy and completeness of data processing, with consideration for propagation of errors. Sensitivity analyses should be performed with the inclusion of both in silico data and data generated from clinical samples. Customized data sets can be prepared to mimic input sequence data and expand the range of microorganisms detected through in silico analysis³⁷. The use of standardized reference materials and NGS data sets is also helpful in comparative evaluation of different bioinformatics pipelines¹⁰⁵.

Additionally, public databases for microbial reference genomes are being continuously updated, and laboratories need to keep track of the exact versions used in addition to dealing with potential misannotations and other database errors. Larger and more complete databases containing publicly deposited sequences such as the National Center for Biotechnology Information (NCBI) Nucleotide database are more comprehensive but also contain more errors than curated, more limited databases such as FDA-ARGOS⁹¹,¹¹³ or the FDA Reference Viral Database (RVDB)¹¹⁴. A combined approach that incorporates annotated sequences from multiple databases may enable greater confidence in the sensitivity and specificity of microorganism identification.

Performance validation and verification for bioinformatics analysis constitute a time-consuming endeavour and include analysis of control and patient data sets and comparisons, with orthogonal clinical testing to determine the accuracy of the final result³⁶. Establishing thresholds enables separation of true-positive matches from the background, and these thresholds can incorporate metrics such as the number of sequence reads aligning to the detected microorganism, normalized to reads per million, external no-template control samples or internal spike-in material; the number of nonoverlapping genomic regions covered; and the read abundance in clinical samples relative to negative control samples (to avoid reporting of contaminant organisms). Receiver operating characteristic (ROC) curve analysis is a useful tool to determine optimal threshold values for a training set of clinical samples with known results, with verification of pre-established thresholds using an independent validation set³⁶.

No-template control
In PCR or sequencing reactions, a negative control sample in which the DNA or cDNA is left out, thus monitoring for contamination that could produce false-positive results.

As in the wet lab workflow, analysis software and reference databases should ideally be locked down before validation and clinical use. Many laboratories maintain both production and up-to-date development versions of the clinical reference database (for example, the NCBI nucleotide database is updated every 2 weeks), with the production database being updated at regular, prespecified intervals. Standardized data sets should be used to verify the database after any update and to ensure that assay results are accurate and reproducible, as errors can be introduced from newly deposited sequences and clinical metadata.
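The reporting thresholds discussed in this section can be sketched in a few lines of Python. The example below normalizes read counts to reads per million (RPM), compares a sample against a no-template control and selects a cutoff from labelled training samples by maximizing Youden's J (sensitivity + specificity − 1) along the ROC curve. All function names, the RPM floor and the toy training values are illustrative assumptions; Youden's J is one common criterion for an "optimal" ROC threshold and is not stated in the source.

```python
from typing import List, Tuple


def rpm(reads: int, total_reads: int) -> float:
    """Reads per million total reads."""
    return reads * 1_000_000 / total_reads


def rpm_ratio(sample_rpm: float, ntc_rpm: float, floor: float = 1.0) -> float:
    """Sample RPM relative to the no-template control (NTC).

    The floor (an assumed value) avoids division by zero when the
    organism is absent from the NTC.
    """
    return sample_rpm / max(ntc_rpm, floor)


def roc_points(scores: List[float], labels: List[bool]) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) for each candidate cutoff.

    A sample is called positive when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both positive and negative training samples")
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
        points.append((threshold, tp / positives, tn / negatives))
    return points


def best_threshold(scores: List[float], labels: List[bool]) -> float:
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)[0]


# Toy training set: RPM ratios with known results (True = confirmed infection).
train_scores = [0.5, 1.2, 2.0, 8.0, 15.0, 40.0, 3.0, 0.9]
train_labels = [False, False, False, True, True, True, False, False]

cutoff = best_threshold(train_scores, train_labels)

# Hypothetical new sample: 250 organism reads out of 20 million;
# the NTC shows 2 reads of the same organism out of 10 million.
ratio = rpm_ratio(rpm(250, 20_000_000), rpm(2, 10_000_000))
print(f"cutoff={cutoff}, ratio={ratio:.1f}, reportable={ratio >= cutoff}")
```

In keeping with the text, a cutoff chosen this way on a training set would still need to be verified on an independent validation set before clinical use.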


[Fig. 4 schematic. Bioinformatics pipeline: NGS data set (FASTQ) → preprocessing (low-quality read filtering, low-complexity read filtering, adapter trimming) → mapping to host genome and/or transcriptome (for example, human genome GRCh38), with computationally subtracted human reads set aside for genome or transcriptome analysis → unmapped reads → alignment to database reference sequences, either directly or after de novo contig assembly → outputs: pairwise identity plot (aligned reads), coverage map (aligned reads), Krona plot, taxonomic binning (microbiome; microbial species or OTUs by sample), phylogenetic analysis and heat map (read counts; species hits by sample, low to high abundance).]

Fig. 4 | A typical metagenomic next-generation sequencing bioinformatics pipeline. A next-generation sequencing (NGS) data set, generally in FASTQ or sequence alignment map (SAM) format, is analysed on a computational server, portable laptop or desktop computer or on the cloud. An initial preprocessing step consists of low-quality filtering, low-complexity filtering and adaptor trimming. Computational host subtraction is performed by mapping reads to the host (for example, human) genome and setting aside host reads for subsequent transcriptome (RNA) or genome (DNA) analysis. The remaining unmapped reads are directly aligned to large reference databases, such as the National Center for Biotechnology Information (NCBI) GenBank database or microbial reference sequence or genome collections, or are first assembled de novo into longer contiguous sequences (contigs) followed by alignment to reference databases. After taxonomic classification, in which individual reads or contigs are assigned into specific taxa (for example, species, genus and family), the data can be analysed and visualized in a number of different formats. These include coverage map and pairwise identity plots to determine how much of the microbial genome has been recovered and its similarity to reference genomes in the database; Krona plots to visualize taxonomic diversity in the metagenomic library; phylogenetic analysis to compare assembled genes, gene regions or genomes to reference sequences; and heat maps to show microorganisms that were detected in the clinical samples. OTU, operational taxonomic unit.
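The preprocessing stage named in the caption (low-quality filtering, low-complexity filtering and adaptor trimming) can be illustrated with a minimal, dependency-free Python sketch. It assumes Phred+33 quality encoding, a mean-quality cutoff, Shannon entropy of base composition as a crude low-complexity measure and exact-match 3' adaptor trimming. The thresholds, the adaptor string and all function names are hypothetical choices for illustration; clinical pipelines would rely on dedicated, validated tools, and host subtraction and alignment are not shown.

```python
import math
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (identifier, sequence, quality string)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield reads from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip().lstrip("@"), seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score, where Q = -10 * log10(error probability)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def base_entropy(seq: str) -> float:
    """Shannon entropy (bits) of base composition; low values flag homopolymers."""
    counts = Counter(seq)
    return -sum(n / len(seq) * math.log2(n / len(seq)) for n in counts.values())


def trim_adaptor(seq: str, qual: str, adaptor: str) -> Tuple[str, str]:
    """Remove the adaptor and everything after it (exact match only)."""
    pos = seq.find(adaptor)
    return (seq, qual) if pos == -1 else (seq[:pos], qual[:pos])


def preprocess(reads: Iterable[Read], adaptor: str, min_q: float = 20.0,
               min_entropy: float = 1.0, min_len: int = 10) -> Iterator[Read]:
    for name, seq, qual in reads:
        seq, qual = trim_adaptor(seq, qual, adaptor)
        if len(seq) < min_len:
            continue  # too short after trimming
        if mean_phred(qual) < min_q:
            continue  # low-quality read
        if base_entropy(seq) < min_entropy:
            continue  # low-complexity read
        yield name, seq, qual


def fastq_record(name: str, seq: str, qual_char: str) -> List[str]:
    """Build one toy FASTQ record with a uniform quality character."""
    return [f"@{name}", seq, "+", qual_char * len(seq)]


ADAPTOR = "AGATCGG"  # hypothetical adaptor sequence for this toy example
lines = (
    fastq_record("good", "ACGTTGCAGTCAGGATCCAA" + ADAPTOR, "I")
    + fastq_record("lowqual", "ACGTTGCAGTCAGGATCCAA", "#")
    + fastq_record("lowcomplexity", "A" * 20, "I")
)

kept = list(preprocess(parse_fastq(lines), adaptor=ADAPTOR))
print(kept)
```

Only the first toy read survives: the second fails the mean-quality cutoff and the third is a homopolymer with zero entropy. Reads that pass would then proceed to host subtraction and alignment as described in the caption.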

Cost considerations
Although there have been substantial cost reductions in the generation of sequence data, the overall per-sample reagent cost for sequencing remains fairly high. Most laboratories lack the robotic equipment and established automated protocols to multiplex large numbers of patient samples in a single run. Thus, the majority of library preparation methods for mNGS are performed manually and hence incur considerable staff time. The additional resources needed to run and maintain a
bioinformatics analysis pipeline are also considerable, and steps taken to ensure regulatory oversight can add notably to costs as well. This leads to an overall cost of several hundreds to thousands of dollars per sample analysed, which is higher than that for many other clinical tests.

Technical improvements in hardware are needed for mNGS sample processing to increase throughput and to reduce costs. As NGS procedures become more standardized, there has been a drive towards increasing automation with the use of liquid-handling biorobots¹¹⁵. Typically, two biorobots are needed for clinical mNGS for both the pre-amplification and post-amplification steps to avoid PCR amplicon cross-contamination. Increased multiplexing is also possible with the greatly enhanced output from the latest generation of sequencers, such as the Illumina NovaSeq instruments. However, a potential limitation with running larger numbers of samples per run is longer overall turnaround times for clinical use owing to the requirement for batch processing as well as sample workflow and computational analysis considerations. Additionally, high-throughput processing of clinical samples for NGS may only be possible in reference laboratories. The development of microfluidic devices for NGS sample library preparation, such as VolTRAX¹¹⁶, could eventually enable clinicians to use mNGS more widely in hospital laboratories or point-of-care settings.

Biorobots
The automated instrumentation in the clinical laboratory that enables parallel processing of many samples at a time.

Point-of-care
Refers to diagnostic testing or other medical procedures that are done near the time and place of patient care (for example, at the bedside, in an emergency department or in a developing-world field laboratory).

Regulatory considerations
Clinical laboratories are highly regulated, and general laboratory and testing requirements apply to all molecular diagnostic assays reported for patient care⁹⁰. Quality control is paramount, and methods must be developed to ensure analytic accuracy throughout the assay workflow. Important quality control steps can include initial sample quality checks, library parameters (concentration and size distribution), sequence data generation (cluster density and Q-score), recovery of internal controls and performance of external controls. Validation data generated during assay development and implementation should be recorded and made available to laboratory inspectors (for laboratory-developed tests) or submitted to regulatory agencies, such as the FDA in the USA or the European Medicines Agency (EMA) in Europe, for approval.

Cluster density
On Illumina sequencing systems, a quality control metric that refers to the density of the clonal clusters that are produced, with each cluster corresponding to a single read. An optimal cluster density is needed to maximize the number and accuracy of reads generated from a sequencing run.

Ongoing monitoring is particularly important for mNGS assays to verify acceptable performance over time and to investigate atypical findings³⁶. Monitoring is accomplished using sample internal controls, intra-run control samples, swipe tests for contamination and periodic proficiency testing. Unexpected or unusual results are further investigated by reviewing patients' clinical charts or by confirmatory laboratory testing using orthogonal methods. Identification of microorganisms that have not been identified before in the laboratory should be independently confirmed, usually through clinical reference or public health laboratory testing. Atypical or novel organisms should be assessed for their clinical significance, and these findings should be reported and discussed with health-care providers, with consideration for their potential pathogenicity and for further testing and treatment options. Clinical microbial sequencing boards, modelled after tumour boards in oncology, can be convened via real-time teleconferencing to discuss mNGS results with treatment providers in clinical context (Fig. 3). Detection of microorganisms with public health implications such as Sin Nombre hantavirus or Ebola virus should be reported, as appropriate, to the relevant public health agencies.

Box 2 | Nanopore sequencing
Nanopore sequencing is an emerging next-generation sequencing (NGS) technology that enables real-time analysis of sequencing data¹²⁵. As such, it is particularly applicable to metagenomic NGS (mNGS) approaches because time is of the essence when treating patients with acute infectious diseases. To date, the only commercially available instruments for nanopore sequencing are from Oxford Nanopore Technologies and include the MinION (1 flow cell), GridION (5 flow cell capacity) and PromethION (48 flow cell capacity). In a published research study¹²⁶, mNGS-based detection of Ebola and chikungunya virus infections on a nanopore sequencer was possible in <10 minutes of sequencing time and in <6 hours of sample-to-answer turnaround time overall. Research studies have also demonstrated the clinical potential of nanopore sequencing in targeted universal 16S ribosomal RNA (rRNA) bacterial detection¹²⁷, microbiome analyses¹²⁸, whole-genome sequencing of bacteria¹²⁹ and outbreak viruses⁴⁴,⁴⁵,⁴⁷, RNA sequencing (RNA-seq) using standardized controls¹³⁰ and diagnosis of prosthetic joint¹³¹ and lower respiratory infections⁹⁹. Untargeted approaches such as mNGS or whole-transcriptome RNA-seq, however, may be limited by the lower throughput of nanopore sequencing relative to short-read sequencing such as with an Illumina instrument.
Currently, no NGS-based clinical test for pathogens has been validated on a nanopore sequencing platform. The clinical adoption of these devices has been limited by the rapid pace of improvements to the platform, which can hinder clinical validation efforts requiring standardized instruments and locked-down protocols, and by ongoing issues regarding sequencing quality and yield. Nonetheless, there is enormous potential for nanopore sequencing in point-of-care clinical sequencing applications, such as mNGS testing done at a patient's bedside or in an emergency room, local clinic or in the field¹³². Importantly, selective sequencing of pathogen reads has been demonstrated on the nanopore platform by early termination of the sequencing of the human reads as they are identified in real time¹³³. Although attractive for purposes of protecting patient privacy and confidentiality, as human reads are depleted as part of the sequencing run, this approach is not currently scalable owing to the limited throughput of the nanopore sequencer to date (up to 10 million mNGS reads per run on the MinION nanopore sequencer as of 2019) and the need to computationally match reads to reference sequences in real time.

Conclusions and future perspectives
Technological advancements in library preparation methods, sequence generation and computational bioinformatics are enabling quicker and more comprehensive metagenomic analyses at lower cost. Sequencing technologies and their applications continue to evolve. Real-time sequencing in particular may be a game-changing technology for point-of-care applications in clinical medicine and public health, as laboratories have begun to apply these tools to diagnose atypical infections and track pathogen outbreaks, as demonstrated by the recent deployment of real-time nanopore sequencing for remote epidemiological surveillance of Ebola⁴⁴ and ZIKV⁴⁴,⁴⁵, and even for use aboard the International Space Station¹¹⁷ (Box 2).

Nonetheless, formidable challenges remain when implementing mNGS for routine patient care. In particular, sensitivity for pathogen detection is decreased in clinical samples with a high nucleic acid background or with exceedingly low pathogen titres; this concern is

352 | JUNE 2019 | volume 20 www.nature.com/nrg


MICROBIAL GENOMICS

Q-​score
only partially mitigated by increasing sequencing depth potential assay and patient selection bias and compare
A quality control metric per sample as costs continue to drop. As a comprehen­ relevant health outcomes using data sets generated from
for DNA sequencing that is sive direct detection method, mNGS may eventually large patient cohorts119,120.
logarithmically related to the replace culture, antigen detection and PCR methods in We predict that, over the next 5 years, prospective
base calling error probabilities
and serves as a measurement
clinical microbiology, but indirect approaches such as clinical trial data evaluating the clinical utility and cost-​
of read accuracy. viral serological testing will continue to play a key part in effectiveness of mNGS will become available; overall
the diagnostic work-​up for infections27, and functional costs and turnaround time for mNGS will continue to
Proficiency testing assays such as culture and phenotypic susceptibility test­ drop; other aspects of mNGS beyond mere identifica­
A method for evaluating the
ing will likely always be useful for research studies. In tion, such as incorporation of human host response and
performance of individual
laboratories for specific
summary, while current limitations suggest that mNGS microbiome data, will prove clinically useful; robotic
laboratory tests using a is unlikely to replace conventional diagnostics in the sample handling and microfluidic devices will be devel­
standard set of unknown short term, it can be a complementary, and perhaps oped for push-​button operation; computational analysis
samples that permits essential, test in certain clinical situations. platforms will be more widely available, both locally and
interlaboratory comparisons.
Although the use of mNGS for informing clinical on the cloud, obviating the need for dedicated bioinfor­
Nanopore sequencing care has been demonstrated in multiple case reports and matics expertise; and at least a few mNGS-​based diag­
A sequencing method in which small case series118, nearly all studies have been retro­ nostic assays for infectious diseases will attain regulatory
DNA or RNA molecules are spective, and clinical utility has yet to be established in a approval with clinical reimbursement. We will witness
transported through miniature
large-​scale prospective clinical trial. Prospective clinical the widespread democratization of mNGS as genomic
pores by electrophoresis.
Sequencing reads are
studies will be critical to understand when to perform analyses become widely accessible not only to physicians
generated by measurement mNGS and how the diagnostic yield compares with that and researchers but also to patients and the public via
of transient changes in ionic of other methods. For example, the mNGS transcrip­ crowdsourcing initiatives121,122. Furthermore, in a world
current as the molecule passes tomic approach might enable effective treatment triage, with constantly emerging pathogens, we envisage that
through the pore.
whereby antimicrobials are only needed for patients mNGS-​based testing will have a pivotal role in monitor­
showing an ‘infectious profile’ of gene expression and ing and tracking new disease outbreaks. As surveillance
those with a ‘non-​infectious profile’ can be treated for networks and rapid diagnostic platforms such as nano­
other causes. In particular, prospective clinical trial and pore sequencing are deployed globally, it will be possi­
economic data showing the cost-​effectiveness of these ble to detect and contain infectious outbreaks at a much
relatively expensive tests in improving patient outcomes earlier stage, saving lives and lowering costs. In the near
are needed to justify their use. These data will also sup­ future, mNGS will not be a luxury but a necessity in the
port a pathway towards regulatory approval and clini­ clinician’s armamentarium as we engage in the perpetual
cal reimbursement. High-​quality evidence that clinical fight against infectious diseases.
metagenomic assays are effective in guiding patient
management will require protocols that minimize Published online 27 March 2019
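
The glossary describes the Q-score only as "logarithmically related" to the base calling error probability. A minimal sketch of that relationship follows, assuming the widely used Phred convention Q = -10 log10(P), which the article itself does not spell out. The function names are hypothetical and chosen for illustration only.

```python
import math


def error_prob_to_q(p_error: float) -> float:
    """Convert a base-calling error probability to a Phred-scaled Q-score.

    Assumes the Phred convention Q = -10 * log10(P_error).
    """
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def q_to_error_prob(q: float) -> float:
    """Invert the Phred relationship: P_error = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def mean_read_q(q_scores):
    """Summarise a read by averaging error probabilities, not Q values.

    Averaging is done in probability space and converted back to Q,
    because Q is a logarithmic quantity.
    """
    if not q_scores:
        raise ValueError("q_scores must not be empty")
    probs = [q_to_error_prob(q) for q in q_scores]
    return error_prob_to_q(sum(probs) / len(probs))
```

Under this convention a Q-score of 20 corresponds to a 1-in-100 chance of an incorrect base call and a Q-score of 30 to a 1-in-1,000 chance. Averaging in probability space means that a few low-quality bases pull the read-level score down sharply, which suits quality control better than a plain arithmetic mean of Q values.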

1. Zhao, F. & Bajic, V. B. The value and significance of metagenomics of marine environments. Genomics Proteomics Bioinformatics 13, 271–274 (2015).
2. Ufarte, L., Laville, E., Duquesne, S. & Potocki-Veronese, G. Metagenomics for the discovery of pollutant degrading enzymes. Biotechnol. Adv. 33, 1845–1854 (2015).
3. Greay, T. L. et al. Recent insights into the tick microbiome gained through next-generation sequencing. Parasit. Vectors 11, 12 (2018).
4. Guegan, M. et al. The mosquito holobiont: fresh insight into mosquito-microbiota interactions. Microbiome 6, 49 (2018).
5. Lloyd-Price, J., Abu-Ali, G. & Huttenhower, C. The healthy human microbiome. Genome Med. 8, 51 (2016).
6. Pallen, M. J. Diagnostic metagenomics: potential applications to bacterial, viral and parasitic infections. Parasitology 141, 1856–1862 (2014).
7. Chan, J. Z. et al. Metagenomic analysis of tuberculosis in a mummy. N. Engl. J. Med. 369, 289–290 (2013).
8. Chiu, C. Y. Viral pathogen discovery. Curr. Opin. Microbiol. 16, 468–478 (2013).
This review covers one of the earliest applications of metagenomic sequencing for use in the detection and discovery of novel viral pathogens.
9. Moustafa, A. et al. The blood DNA virome in 8,000 humans. PLOS Pathog. 13, e1006292 (2017).
10. Rascovan, N., Duraisamy, R. & Desnues, C. Metagenomics and the human virome in asymptomatic individuals. Annu. Rev. Microbiol. 70, 125–141 (2016).
11. Somasekar, S. et al. Viral surveillance in serum samples from patients with acute liver failure by metagenomic next-generation sequencing. Clin. Infect. Dis. 65, 1477–1485 (2017).
12. Hampton-Marcell, J. T., Lopez, J. V. & Gilbert, J. A. The human microbiome: an emerging tool in forensics. Microb. Biotechnol. 10, 228–230 (2017).
13. Miller, M. B. & Tang, Y. W. Basic concepts of microarrays and potential applications in clinical microbiology. Clin. Microbiol. Rev. 22, 611–633 (2009).
14. Streit, W. R. & Schmitz, R. A. Metagenomics—the key to the uncultured microbes. Curr. Opin. Microbiol. 7, 492–498 (2004).
15. Rota, P. A. et al. Characterization of a novel coronavirus associated with severe acute respiratory syndrome. Science 300, 1394–1399 (2003).
16. Sotiriou, C. & Pusztai, L. Gene-expression signatures in breast cancer. N. Engl. J. Med. 360, 790–800 (2009).
17. Palmer, C. et al. Rapid quantitative profiling of complex microbial populations. Nucleic Acids Res. 34, e5 (2006).
18. Voelkerding, K. V., Dames, S. A. & Durtschi, J. D. Next-generation sequencing: from basic research to diagnostics. Clin. Chem. 55, 641–658 (2009).
19. Wilson, M. R. et al. Actionable diagnosis of neuroleptospirosis by next-generation sequencing. N. Engl. J. Med. 370, 2408–2417 (2014).
This case report describes the first use of clinical metagenomics for actionable diagnosis and treatment in a critically ill patient with a mysterious neurological infection.
20. Nutman, A. & Marchaim, D. ‘How to do it’-molecular investigation of a hospital outbreak. Clin. Microbiol. Infect. https://doi.org/10.1016/j.cmi.2018.09.017 (2018).
21. Loman, N. J. et al. A culture-independent sequence-based metagenomics approach to the investigation of an outbreak of Shiga-toxigenic Escherichia coli O104:H4. JAMA 309, 1502–1510 (2013).
This study describes the use of metagenomic sequencing and comparative bacterial genome analysis to investigate a global public health outbreak.
22. Oniciuc, E. A. et al. The present and future of whole genome sequencing (WGS) and whole metagenome sequencing (WMS) for surveillance of antimicrobial resistant microorganisms and antimicrobial resistance genes across the food chain. Genes (Basel) 9, E268 (2018).
23. Stefan, C., Koehler, J. & Minogue, T. Targeted next-generation sequencing for the detection of ciprofloxacin resistance markers using molecular inversion probes. Sci. Rep. 6, 25904 (2016).
24. Gliddon, H. D., Herberg, J. A., Levin, M. & Kaforou, M. Genome-wide host RNA signatures of infectious diseases: discovery and clinical translation. Immunology 153, 171–178 (2018).
25. Langelier, C. et al. Integrating host response and unbiased microbe detection for lower respiratory tract infection diagnosis in critically ill adults. Proc. Natl Acad. Sci. USA 115, E12353–E12362 (2018).
This study integrates microbial metagenomic and host response NGS data to improve accuracy in diagnosing lower respiratory tract infections.
26. Lin, L. & Zhang, J. Role of intestinal microbiota and metabolites on gut homeostasis and human diseases. BMC Immunol. 18, 2 (2017).
27. Greninger, A. The challenge of diagnostic metagenomics. Expert Rev. Mol. Diagn. 18, 605–615 (2018).
28. Khare, R. et al. Comparative evaluation of two commercial multiplex panels for detection of gastrointestinal pathogens by use of clinical stool specimens. J. Clin. Microbiol. 52, 3667–3673 (2014).
29. Leber, A. L. et al. Multicenter evaluation of BioFire FilmArray meningitis/encephalitis panel for detection of bacteria, viruses, and yeast in cerebrospinal fluid specimens. J. Clin. Microbiol. 54, 2251–2261 (2016).
30. Ruggiero, P., McMillen, T., Tang, Y. W. & Babady, N. E. Evaluation of the BioFire FilmArray respiratory panel and the GenMark eSensor respiratory viral panel on lower respiratory tract specimens. J. Clin. Microbiol. 52, 288–290 (2014).

31. Tang, Y. W. et al. Clinical evaluation of the Luminex NxTAG respiratory pathogen panel. J. Clin. Microbiol. 54, 1912–1914 (2016).
32. Lefterova, M. I., Suarez, C. J., Banaei, N. & Pinsky, B. A. Next-generation sequencing for infectious disease diagnosis and management: a report of the association for molecular pathology. J. Mol. Diagn. 17, 623–634 (2015).
33. Blauwkamp, T. A. et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat. Microbiol. https://doi.org/10.1038/s41564-018-0349-6 (2019).
This paper describes the analytical and clinical validation of an mNGS assay for sepsis.
34. Deurenberg, R. H. et al. Application of next generation sequencing in clinical microbiology and infection prevention. J. Biotechnol. 243, 16–24 (2017).
35. Gargis, A. S., Kalman, L. & Lubin, I. M. Assuring the quality of next-generation sequencing in clinical microbiology and public health laboratories. J. Clin. Microbiol. 54, 2857–2865 (2016).
36. Miller, S. et al. Laboratory validation of a clinical metagenomic sequencing assay for pathogen detection in cerebrospinal fluid. Preprint at bioRxiv https://doi.org/10.1101/330381 (2019).
This paper describes the clinical validation of an mNGS assay for diagnosis of meningitis and encephalitis from cerebrospinal fluid.
37. Schlaberg, R. et al. Validation of metagenomic next-generation sequencing tests for universal pathogen detection. Arch. Pathol. Lab Med. 141, 776–786 (2017).
This paper summarizes the clinical validation of two mNGS assays for neurological infections and pneumonia.
38. Rampini, S. K. et al. Broad-range 16S rRNA gene polymerase chain reaction for diagnosis of culture-negative bacterial infections. Clin. Infect. Dis. 53, 1245–1251 (2011).
39. Salipante, S. J. et al. Rapid 16S rRNA next-generation sequencing of polymicrobial clinical samples for diagnosis of complex bacterial infections. PLOS ONE 8, e65226 (2013).
This paper describes the use of targeted 16S rRNA NGS for diagnosis of polymicrobial bacterial infections.
40. Wagner, K., Springer, B., Pires, V. P. & Keller, P. M. Molecular detection of fungal pathogens in clinical specimens by 18S rDNA high-throughput screening in comparison to ITS PCR and culture. Sci. Rep. 8, 6964 (2018).
41. Basein, T. et al. Clinical utility of universal PCR and its real-world impact on patient management. Open Forum Infect. Dis. 4, S627 (2017).
42. Corless, C. E. et al. Contamination and sensitivity issues with a real-time universal 16S rRNA PCR. J. Clin. Microbiol. 38, 1747–1752 (2000).
43. Quick, J. et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat. Protoc. 12, 1261–1276 (2017).
44. Faria, N. R. et al. Establishment and cryptic transmission of Zika virus in Brazil and the Americas. Nature 546, 406–410 (2017).
45. Grubaugh, N. et al. Genomic epidemiology reveals multiple introductions of Zika virus into the United States. Nature 546, 401–405 (2017).
46. Thézé, J. et al. Genomic epidemiology reconstructs the introduction and spread of Zika virus in central America and Mexico. Cell Host Microbe 23, 855–864 (2018).
This study introduces the use of the metagenomic sequencing with spiked primer enrichment technique for simultaneous targeted and untargeted pathogen detection and genome assembly.
47. Quick, J. et al. Real-time, portable genome sequencing for Ebola surveillance. Nature 530, 228–232 (2016).
This study describes deployment of a portable nanopore sequencer for real-time actionable sequencing of clinical samples during the Ebola outbreak in West Africa.
48. Garcia-Garcia, G. et al. Assessment of the latest NGS enrichment capture methods in clinical context. Sci. Rep. 6, 20948 (2016).
49. Briese, T. et al. Virome capture sequencing enables sensitive viral diagnosis and comprehensive virome analysis. mBio 6, e01491-15 (2015).
50. Metsky, H. C. et al. Capturing sequence diversity in metagenomes with comprehensive and scalable probe design. Nat. Biotechnol. 37, 160–168 (2019).
51. Naccache, S. et al. Distinct Zika virus lineage in Salvador, Bahia, Brazil. Emerg. Infect. Dis. 22, 1788–1792 (2016).
52. Wylie, T. N., Wylie, K. M., Herter, B. N. & Storch, G. A. Enhanced virome sequencing using targeted sequence capture. Genome Res. 25, 1910–1920 (2015).
53. Presidential Council. National action plan for combating antibiotic-resistant bacteria (The White House, Washington, 2015).
54. Quince, C., Walker, A., Simpson, J., Loman, N. & Segata, N. Shotgun metagenomics, from sampling to analysis. Nat. Biotechnol. 35, 833–844 (2017).
55. Snitkin, E. et al. Tracking a hospital outbreak of carbapenem-resistant Klebsiella pneumoniae with whole-genome sequencing. Sci. Transl Med. 4, 148ra116 (2012).
This study is the first to demonstrate the potential of whole-genome bacterial sequencing using NGS to track transmission of a hospital outbreak of carbapenem-resistant K. pneumoniae.
56. Naccache, S. et al. A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. Genome Res. 24, 1180–1192 (2014).
This paper describes the sequence-based ultrarapid pathogen identification metagenomic analysis pipeline for use in infectious disease diagnostics.
57. Hong, D. et al. Liquid biopsy for infectious diseases: sequencing of cell-free plasma to detect pathogen DNA in patients with invasive fungal disease. Diagn. Microbiol. Infect. Dis. 92, 210–213 (2018).
58. Schlaberg, R. et al. Viral pathogen detection by metagenomics and pan-viral group polymerase chain reaction in children with pneumonia lacking identifiable etiology. J. Infect. Dis. 215, 1407–1415 (2017).
59. Jovel, J. et al. Characterization of the gut microbiome using 16S or shotgun metagenomics. Front. Microbiol. 7, 459 (2016).
60. Young, V. The role of the microbiome in human health and disease: an introduction for clinicians. BMJ 356, j831 (2017).
61. Samarkos, M., Mastrogianni, E. & Kampouropoulou, O. The role of gut microbiota in Clostridium difficile infection. Eur. J. Intern. Med. 50, 28–32 (2018).
62. Shogbesan, O. et al. A Systematic review of the efficacy and safety of fecal microbiota transplant for Clostridium difficile infection in immunocompromised patients. Can. J. Gastroenterol. Hepatol. 2018, 1394379 (2018).
63. van Nood, E. et al. Duodenal infusion of donor feces for recurrent Clostridium difficile. N. Engl. J. Med. 368, 407–415 (2013).
This paper demonstrates the therapeutic potential of manipulating the microbiome with donor faecal transplantation to treat refractory C. difficile disease.
64. Boulangé, C., Neves, A., Chilloux, J., Nicholson, J. & Dumas, M. Impact of the gut microbiota on inflammation, obesity, and metabolic disease. Genome Med. 8, 42 (2016).
65. Kukurba, K. & Montgomery, S. RNA sequencing and analysis. Cold Spring Harb. Protoc. 2015, 951–969 (2015).
66. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57–63 (2009).
This Review provides an overview of RNA-seq for transcriptomics and its applications.
67. Ahn, S. et al. Gene expression-based classifiers identify Staphylococcus aureus infection in mice and humans. PLOS ONE 8, e48979 (2013).
68. Bouquet, J. et al. Longitudinal transcriptome analysis reveals a sustained differential gene expression signature in patients treated for acute Lyme disease. mBio 7, e00100–00116 (2016).
69. Zaas, A., Aziz, H., Lucas, J., Perfect, J. & Ginsburg, G. Blood gene expression signatures predict invasive candidiasis. Sci. Transl Med. 2, 21ra17 (2010).
70. Anderson, S. et al. Diagnosis of childhood tuberculosis and host RNA expression in Africa. N. Engl. J. Med. 370, 1712–1723 (2014).
71. Singhania, A. et al. A modular transcriptional signature identifies phenotypic heterogeneity of human tuberculosis infection. Nat. Commun. 9, 2308 (2018).
72. Zak, D. E. et al. A blood RNA signature for tuberculosis disease risk: a prospective cohort study. Lancet 387, 2312–2322 (2016).
73. HIPC-CHI Signatures Project Team & HIPC-I Consortium. Multicohort analysis reveals baseline transcriptional predictors of influenza vaccination responses. Sci. Immunol. 2, eaal4656 (2017).
74. Woods, C. et al. A host transcriptional signature for presymptomatic detection of infection in humans exposed to influenza H1N1 or H3N2. PLOS ONE 8, e52198 (2013).
75. Zaas, A. et al. Gene expression signatures diagnose influenza and other symptomatic respiratory viral infections in humans. Cell Host Microbe 6, 207–217 (2009).
This paper is one of the earliest to demonstrate the potential use of host gene expression signatures to diagnose infections.
76. Zhang, Y. et al. Identifying and analyzing different cancer subtypes using RNA-seq data of blood platelets. Oncotarget 8, 87494–87511 (2017).
77. McClain, M. et al. A Genomic signature of influenza infection shows potential for presymptomatic detection, guiding early therapy, and monitoring clinical responses. Open Forum Infect. Dis. 3, ofw007 (2016).
78. Sweeney, T., Wong, H. & Khatri, P. Robust classification of bacterial and viral infections via integrated host gene expression diagnostics. Sci. Transl Med. 8, 346ra391 (2016).
79. Emerson, J. B. et al. Schrodinger’s microbes: tools for distinguishing the living from the dead in microbial ecosystems. Microbiome 5, 86 (2017).
80. Banerjee, A. et al. RNA-seq analysis of peripheral blood mononuclear cells reveals unique transcriptional signatures associated with disease progression in dengue patients. Transl Res. 186, 62–78 (2017).
81. Lee, H. J. et al. Integrated pathogen load and dual transcriptome analysis of systemic host-pathogen interactions in severe malaria. Sci. Transl Med. 10, eaar3619 (2018).
82. Marques, A. Laboratory diagnosis of Lyme disease: advances and challenges. Infect. Dis. Clin. North Am. 29, 295–307 (2015).
83. Debiasi, R. & Tyler, K. Molecular methods for diagnosis of viral encephalitis. Clin. Microbiol. Rev. 17, 903–925 (2004).
84. Landry, M. & St George, K. Laboratory diagnosis of Zika virus infection. Arch. Pathol. Lab Med. 141, 60–67 (2017).
85. Nakagawa, H. & Fujita, M. Whole genome sequencing analysis for cancer genomics and precision medicine. Cancer Sci. 109, 513–522 (2018).
86. Feng, H., Shuda, M., Chang, Y. & Moore, P. Clonal integration of a polyomavirus in human Merkel cell carcinoma. Science 319, 1096–1100 (2008).
This paper describes the discovery of a novel polyomavirus associated with a rare skin cancer using NGS.
87. Allegretti, M. et al. Tearing down the walls: FDA approves next generation sequencing (NGS) assays for actionable cancer genomic aberrations. J. Exp. Clin. Cancer Res. 37, 47 (2018).
88. Saha, A., Kaul, R., Murakami, M. & Robertson, E. S. Tumor viruses and cancer biology: modulating signaling pathways for therapeutic intervention. Cancer Biol. Ther. 10, 961–978 (2010).
89. Kanwal, F. et al. Risk of hepatocellular cancer in HCV patients treated with direct-acting antiviral agents. Gastroenterology 153, 996–1005 (2017).
90. Burd, E. Validation of laboratory-developed molecular assays for infectious diseases. Clin. Microbiol. Rev. 23, 550–576 (2010).
This paper summarizes the essential requirements for validation of infectious disease assays in a clinical laboratory.
91. Food and Drug Administration. Infectious disease next generation sequencing based diagnostic devices: microbial identification and detection of antimicrobial resistance and virulence markers (FDA, 2016).
This draft guidance from the FDA covers considerations for validation and approval of sequencing-based diagnostic devices for infectious diseases.
92. DuPont, H. L., Levine, M. M., Hornick, R. B. & Formal, S. B. Inoculum size in shigellosis and implications for expected mode of transmission. J. Infect. Dis. 159, 1126–1128 (1989).
93. Corman, V. M. et al. Assay optimization for molecular detection of Zika virus. Bull. World Health Organ. 94, 880–892 (2016).
94. Hasan, M. et al. Depletion of human DNA in spiked clinical specimens for improvement of sensitivity of pathogen detection by next-generation sequencing. J. Clin. Microbiol. 54, 919–927 (2016).
95. Matranga, C. et al. Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses from clinical and biological samples. Genome Biol. 15, 519 (2014).

96. O’Neil, D., Glowatz, H. & Schlumpberger, M. Ribosomal RNA depletion for efficient use of RNA-seq capacity. Curr. Protoc. Mol. Biol. 103, 4.19.1–4.19.8 (2013).
97. Gu, W. et al. Depletion of abundant sequences by hybridization (DASH): using Cas9 to remove unwanted high-abundance species in sequencing libraries and molecular counting applications. Genome Biol. 17, 41 (2016).
98. Feehery, G. et al. A method for selectively enriching microbial DNA from contaminating vertebrate host DNA. PLOS ONE 8, e76096 (2013).
99. Charalampous, T. et al. Rapid diagnosis of lower respiratory infection using nanopore-based clinical metagenomics. Preprint at bioRxiv https://doi.org/10.1101/387548 (2018).
100. Thoendel, M. et al. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing. J. Microbiol. Methods 127, 141–145 (2016).
101. Salter, S. et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol. 12, 87 (2014).
102. Li, R. et al. Comparison of DNA-, PMA-, and RNA-based 16S rRNA Illumina sequencing for detection of live bacteria in water. Sci. Rep. 7, 5752 (2017).
103. Naccache, S. et al. Diagnosis of neuroinvasive astrovirus infection in an immunocompromised adult with encephalitis by unbiased next-generation sequencing. Clin. Infect. Dis. 60, 919–923 (2015).
104. Strong, M. et al. Microbial contamination in next generation sequencing: implications for sequence-based analysis of clinical samples. PLOS Pathog. 10, e1004437 (2014).
105. McIntyre, A. et al. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers. Genome Biol. 18, 182 (2017).
106. Jackson, S. A., Kralj, J. G. & Lin, N. J. Report on the NIST/DHS/FDA workshop: standards for pathogen detection for biosurveillance and clinical applications (National Institute for Standards and Technology, 2018).
107. Pine, P. et al. Evaluation of the External RNA Controls Consortium (ERCC) reference material using a modified Latin square design. BMC Biotechnol. 16, 54 (2016).
108. Avraham, R. et al. A highly multiplexed and sensitive RNA-seq protocol for simultaneous analysis of host and pathogen transcriptomes. Nat. Protoc. 11, 1477–1491 (2016).
109. Flygare, S. et al. Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling. Genome Biol. 17, 111 (2016).
110. Kim, D., Song, L., Breitwieser, F. & Salzberg, S. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 26, 1721–1729 (2016).
111. Wood, D. & Salzberg, S. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15, R46 (2014).
112. Roy, S. et al. Standards and guidelines for validating next-generation sequencing bioinformatics pipelines: a joint recommendation of the Association for Molecular Pathology and the College of American Pathologists. J. Mol. Diagn. 20, 4–27 (2018).
This draft guidance from the Association for Molecular Pathology and College of American Pathologists reviews standards and guidelines for validation of NGS bioinformatics pipelines.
113. Goldberg, B., Sichtig, H., Geyer, C., Ledeboer, N. & Weinstock, G. Making the leap from research laboratory to clinic: challenges and opportunities for next-generation sequencing in infectious disease diagnostics. mBio 6, e01888-15 (2015).
114. Goodacre, N., Aljanahi, A., Nandakumar, S., Mikailov, M. & Khan, A. S. A reference viral database (RVDB) to enhance bioinformatics analysis of high-throughput sequencing for novel virus detection. mSphere 3, e00069-18 (2018).
115. May, M. Automated sample preparation. NIST Special Publication 1222, 1–17 (2016).
116. Levy, S. E. & Myers, R. M. Advancements in next-generation sequencing. Annu. Rev. Genomics Hum. Genet. 17, 95–115 (2016).
117. Castro-Wallace, S. L. et al. Nanopore DNA sequencing and genome assembly on the International Space Station. Sci. Rep. 7, 18022 (2017).
118. Simner, P. J., Miller, S. & Carroll, K. C. Understanding the promises and hurdles of metagenomic next-generation sequencing as a diagnostic tool for infectious diseases. Clin. Infect. Dis. 66, 778–788 (2018).
This is a concise yet comprehensive review of some of the clinical applications of mNGS for diagnosis of infectious diseases.
119. Afshinnekoo, E., Ahsanuddin, S. & Mason, C. E. Globalizing and crowdsourcing biomedical research. Br. Med. Bull. 120, 27–33 (2016).
120. Brooks, J. P. et al. The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC Microbiol. 15, 66 (2015).
121. Boja, E. et al. Right data for right patient-a precisionFDA NCI-CPTAC multi-omics mislabeling challenge. Nat. Med. 24, 1301–1302 (2018).
122. McDonald, D. et al. American gut: an open platform for citizen science microbiome research. mSystems 3, e00031-18 (2018).
123. Babayan, A. & Pantel, K. Advances in liquid biopsy approaches for early detection and monitoring of cancer. Genome Med. 10, 21 (2018).
124. Norton, M. E. et al. Cell-free DNA analysis for noninvasive examination of trisomy. N. Engl. J. Med. 372, 1589–1597 (2015).
125. Jain, M., Olsen, H., Paten, B. & Akeson, M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 239 (2016).
126. Greninger, A. et al. Rapid metagenomic identification of viral pathogens in clinical samples by real-time nanopore sequencing analysis. Genome Med. 7, 99 (2015).
127. Mitsuhashi, S. et al. A portable system for rapid bacterial composition analysis using a nanopore-based sequencer and laptop computer. Sci. Rep. 7, 5657 (2017).
128. Kerkhof, L., Dillon, K., Häggblom, M. & McGuinness, L. Profiling bacterial communities by MinION sequencing of ribosomal operons. Microbiome 5, 116 (2017).
129. Tyler, A. et al. Evaluation of Oxford Nanopore’s MinION sequencing device for microbial whole genome sequencing applications. Sci. Rep. 8, 10931 (2018).
130. Oikonomopoulos, S., Wang, Y., Djambazian, H., Badescu, D. & Ragoussis, J. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations. Sci. Rep. 6, 31602 (2016).
131. Street, T. et al. Molecular diagnosis of orthopedic-device-related infection directly from sonication fluid by metagenomic sequencing. J. Clin. Microbiol. 55, 2334–2347 (2017).
132. Gardy, J. & Loman, N. Towards a genomics-informed, real-time, global pathogen surveillance system. Nat. Rev. Genet. 19, 9–20 (2018).
This Review describes efforts to deploy genomics globally for real-time, global pathogen surveillance.
133. Loose, M., Malla, S. & Stout, M. Real-time selective sequencing using nanopore technology. Nat. Methods 13, 751–754 (2016).
134. Stakaityte, G. et al. Merkel cell polyomavirus: molecular insights into the most recently discovered human tumour virus. Cancers (Basel) 6, 1267–1297 (2014).
135. Brinkmann, A. et al. Development and preliminary evaluation of a multiplexed amplification and next generation sequencing method for viral hemorrhagic fever diagnostics. PLOS Negl. Trop. Dis. 11, e0006075 (2017).
136. Quan, J. et al. FLASH: a next-generation CRISPR diagnostic for multiplexed detection of antimicrobial resistance sequences. Preprint at bioRxiv https://doi.org/10.1101/426338 (2018).
137. Langelier, C. et al. Metagenomic sequencing detects respiratory pathogens in hematopoietic cellular transplant patients. Am. J. Respir. Crit. Care Med. 197, 524–528 (2018).
138. Zinter, M. S. et al. Pulmonary metagenomic sequencing suggests missed infections in immunocompromised children. Clin. Infect. Dis. https://doi.org/10.1093/cid/ciy802 (2018).
139. Zhou, Y. et al. Metagenomic approach for identification of the pathogens associated with diarrhea in stool specimens. J. Clin. Microbiol. 54, 368–375 (2016).
140. Ivy, M. I. et al. Direct detection and identification of prosthetic joint infection pathogens in synovial fluid by metagenomic shotgun sequencing. J. Clin. Microbiol. 56, e00402-18 (2018).
141. Milani, C. et al. Gut microbiota composition and Clostridium difficile infection in hospitalized elderly individuals: a metagenomic study. Sci. Rep. 6, 25945 (2016).
142. Tang, K. W. & Larsson, E. Tumour virology in the era of high-throughput genomics. Philos. Trans. R. Soc. Lond. B Biol. Sci. 372, 20160265 (2017).
143. Aravanis, A. M., Lee, M. & Klausner, R. D. Next-generation sequencing of circulating tumor DNA for early cancer detection. Cell 168, 571–574 (2017).

Author contributions
The authors contributed equally to all aspects of the article.

Competing interests
C.Y.C. is the director of the UCSF–Abbott Viral Diagnostics and Discovery Center (VDDC) and receives research support from Abbott Laboratories. C.Y.C. and S.A.M. are inventors on a patent application on algorithms related to SURPI+ software titled ‘Pathogen Detection using Next-Generation Sequencing’ (PCT/US/16/52912).

Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Reviewer information
Nature Reviews Genetics thanks J. C. Lagier, A. Nitsche and J. Dekker for their contribution to the peer review of this work.

Related links
External RNA Controls Consortium (ERCC): http://jimb.stanford.edu/ercc/
FDA-ARGOS: https://www.ncbi.nlm.nih.gov/bioproject/231221
FDA Reference Viral Database (RVDB): https://hive.biochemistry.gwu.edu/rvdb
National Center for Biotechnology Information (NCBI) Nucleotide database: https://www.ncbi.nlm.nih.gov/nucleotide/
