
DNA Sequencing

Uploaded by

qanitakhan2002
DNA Sequencing

LECTURE 7
DR NIDA BAIG
OBJECTIVES
 Compare and contrast the chemical (Maxam/Gilbert) and the
chain termination (Sanger) sequencing methods.
 List the components and the molecular reactions that occur in
chain termination sequencing.
 Discuss the advantages of dye primer and dye terminator
sequencing.
 Derive a text DNA sequence from raw sequencing data.
 Describe examples of alternative sequencing methods, such as
bisulfite sequencing and pyrosequencing.
 Define bioinformatics and describe electronic systems for the
communication and application of sequence information.
 Recount the events of the Human Genome Project.
OUTLINE
DIRECT SEQUENCING
Manual Sequencing
Automated Fluorescent Sequencing
PYROSEQUENCING
BISULFITE DNA SEQUENCING
BIOINFORMATICS
THE HUMAN GENOME PROJECT
DNA Sequencing
In the clinical laboratory, DNA sequence information (the
order of nucleotides in the DNA molecule) is used
routinely for a variety of purposes:
detecting mutations,
typing microorganisms,
identifying human haplotypes, and designating
polymorphisms.
Ultimately, targeted therapies will be directed at abnormal
DNA sequences detected by these techniques
Direct Sequencing
The importance of knowing the order, or sequence, of
nucleotides on the DNA chain was appreciated in the
earliest days of molecular analysis.
Elegant genetic experiments with microorganisms
detected molecular changes indirectly at the nucleotide
level.
Direct determination of the nucleotide sequence, or DNA
sequencing, is the most definitive molecular method to
identify genetic lesions
Manual Sequencing
Direct determination of the order, or sequence, of
nucleotides in a DNA polymer is the most specific and
direct method for identifying genetic lesions (mutations)
or polymorphisms, especially when looking for changes
affecting only one or two nucleotides.
Two types of sequencing methods have been used most
extensively: the Maxam-Gilbert method and the Sanger
method.
Chemical (Maxam-Gilbert)
Sequencing
The Maxam-Gilbert chemical sequencing method was
developed in the late 1970s by Allan M. Maxam and
Walter Gilbert.
Maxam-Gilbert sequencing requires a double- or single-
stranded version of the DNA region to be sequenced, with
one end radioactively labeled.
For sequencing, the labeled fragment, or template, is
aliquoted into four tubes.
Each aliquot is treated with a different chemical with or
without high salt
Chemical (Maxam-Gilbert)
Sequencing
 Upon addition of a strong base, such as 10% piperidine, the
chemically modified single-stranded DNA will break at specific
nucleotides.
 After the reactions, the piperidine is evaporated, and the
contents of each tube are dried and resuspended in
formamide for gel loading. The fragments are then
separated by size on a denaturing polyacrylamide gel
 The denaturing conditions (formamide, urea, and heat)
prevent the single strands of DNA from hydrogen bonding
with one another or folding up so that they migrate
through the gel strictly according to their size
Chemical (Maxam-Gilbert)
Sequencing
 The migration speed is important because single-base
resolution is required to interpret the sequence properly.
 After electrophoresis, the gel apparatus is disassembled;
the gel is removed to a sheet of filter paper, and it is dried
on a gel dryer.
 The dried gel is exposed to light sensitive film.
 Alternatively, wet gels can be exposed directly.
 The sequence is inferred from the bands on the film. The
smallest (fastest-migrating) band represents the base
closest to the labeled end of the fragment.
Chemical (Maxam-Gilbert) Sequencing
 The lane in which that band appears identifies the nucleotide.
 Bands in the purine (G +A) or pyrimidine (C + T) lane are called
based on whether they are also present in the G- or C-only lanes.
 Note how the sequence is read from the bottom (5′ end of the
DNA molecule) to the top (3′ end of the molecule) of the gel
 Although Maxam-Gilbert sequencing is a relatively efficient way
to determine short runs of sequence data, the method is not
practical for high-throughput sequencing of long fragments.
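The band-calling rule described above can be sketched in Python. This is only an illustration, not part of the lecture: the function name read_maxam_gilbert, the lane labels, and the example band positions are hypothetical. Each band is represented by the length of the fragment that produced it; a band in the G+A lane that also appears in the G lane is called G, otherwise A, and the same logic applies to the C+T and C lanes.

```python
def read_maxam_gilbert(lanes):
    """Infer a sequence from Maxam-Gilbert band positions.

    `lanes` maps the lane names "G", "G+A", "C" and "C+T" to sets of
    fragment lengths. A smaller length means a faster-migrating band,
    that is, a base closer to the labeled end.
    """
    g_only, purines = lanes["G"], lanes["G+A"]
    c_only, pyrimidines = lanes["C"], lanes["C+T"]
    sequence = []
    # Read from the bottom of the gel (shortest fragment) to the top.
    for length in sorted(purines | pyrimidines):
        if length in purines:
            sequence.append("G" if length in g_only else "A")
        else:
            sequence.append("C" if length in c_only else "T")
    return "".join(sequence)


# Hypothetical gel with five bands.
example_lanes = {
    "G": {1, 5},
    "G+A": {1, 3, 5},
    "C": {4},
    "C+T": {2, 4},
}
print(read_maxam_gilbert(example_lanes))  # GTACG
```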
Chemical (Maxam-Gilbert) Sequencing
In addition, the hazardous chemicals hydrazine and
piperidine require more elaborate precautions for use and
storage.
This method has therefore been replaced by the dideoxy
chain termination sequencing method for most sequencing
applications
Dideoxy (Sanger) Sequencing
The original dideoxy chain termination sequencing
methods required a single-stranded template.
Templates up to a few thousand bases long could be
produced using M13 bacteriophage, a bacterial virus with
a single-stranded DNA genome.
This virus replicates by infecting Escherichia coli, in
which the viral single-stranded circular genome is
converted to a double-stranded plasmid, called the
replicative form (RF).
Dideoxy (Sanger) Sequencing
The plasmid codes for viral gene products that use the
bacterial transcription and translation machinery to make
new single-stranded genomes and viral proteins.
To use M13 for template preparation, the RF is isolated
from infected bacteria, cut with restriction enzymes, and
the fragment to be sequenced is ligated into the RF.
When the recombined RF is reintroduced into the host
bacteria, M13 continues its life cycle producing new
phages, some of which carry the inserted fragment
Dideoxy (Sanger) Sequencing
 When the phages are spread on a lawn of host bacteria, plaques
(clear spaces) of lysed bacteria formed by phage replication
contain pure populations of recombinant phage.
 The single-stranded DNA can then be isolated from the phage
by picking plugs of agar from the plaques and boiling them to
isolate the single-stranded phage DNA.
 Dideoxy chain termination (Sanger) sequencing is a
modification of the DNA replication process.
 A short, synthetic single-stranded DNA fragment (primer)
complementary to sequences just 5′ to the region of DNA to be
sequenced is used for priming dideoxy sequencing reactions
Dideoxy (Sanger) Sequencing
 For detection of the products of the sequencing reaction, the primer
may be attached covalently at the 5′ end to a 32P-labeled nucleotide or
a fluorescent dye-labeled nucleotide.
 An alternative detection strategy is to incorporate 32P- or 35S-labeled
deoxynucleotides in the nucleotide sequencing reaction mix.
 The latter is called internal labeling.
 Just as in the in vivo DNA replication reaction, an in vitro DNA
synthesis reaction would result in polymerization of deoxynucleotides
to make full-length copies of the DNA template
 For sequencing, modified dideoxynucleotide (ddNTP) derivatives are
added to the reaction mixture.
 Dideoxynucleotides lack the hydroxyl group found on the 3′ ribose
carbon of the deoxynucleotides (dNTPs)
Dideoxy (Sanger) Sequencing
 DNA synthesis will stop upon incorporation of a ddNTP into the
growing DNA chain (chain termination) because without the
hydroxyl group at the 3′ sugar carbon, the 5′-3′ phosphodiester
bond cannot be established to incorporate a subsequent nucleotide.
 The newly synthesized chain will terminate, therefore, with the
ddNTP.
 To perform a manual dideoxy sequencing reaction, a 1:1 mixture
of template and primer is placed into four separate reaction tubes
in sequencing buffer
Sequencing buffer is usually provided with the sequencing
enzyme and contains ingredients necessary for the polymerase
activity.
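As a rough sketch of what the four tubes produce, the Python below lists the fragment lengths expected in each tube for a short newly synthesized strand. The function name termination_products and the example sequence are hypothetical. The sketch assumes every possible termination product is present and counts length from the first base added after the primer.

```python
def termination_products(new_strand):
    """Map each ddNTP tube to the fragment lengths it can produce.

    `new_strand` is the newly synthesized strand, 5' to 3', excluding
    the primer. A fragment of length n ends in the ddNTP matching the
    nth base, so it can only arise in that base's tube.
    """
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        tubes[base].append(position)
    return tubes


products = termination_products("AGCTTAG")
for dd_base, lengths in products.items():
    print(f"dd{dd_base}TP tube: fragment lengths {lengths}")
```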
Dideoxy (Sanger) Sequencing
 Mixtures of all four dNTPs and one of the four ddNTPs are then added to
each tube, with a different ddNTP in each of the four tubes.
 The ratio of ddNTPs:dNTPs is critical for generation of a readable
sequence. If the concentration of ddNTPs is too high, polymerization will
terminate too frequently early along the template.
 If the ddNTP concentration is too low, infrequent or no termination will
occur. In the beginning days of sequencing, optimal ddNTP:dNTP ratios
were determined empirically (by experimenting with various ratios).
 Modern sequencing reagent mixes have preoptimized nucleotide mixes
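Why the ddNTP:dNTP ratio matters can be illustrated with a simple model. This is a sketch under a loud assumption: the polymerase is taken to incorporate a ddNTP or the matching dNTP with equal efficiency, so the chance of termination at each eligible site is [ddNTP] / ([ddNTP] + [dNTP]). Real enzymes discriminate between the two, so the numbers are only illustrative. The function name termination_profile is hypothetical.

```python
def termination_profile(dd_fraction, n_sites):
    """Return (profile, unterminated) for one sequencing tube.

    dd_fraction: assumed chance that a ddNTP rather than the matching
        dNTP is incorporated at a site.
    n_sites: number of template positions calling for this tube's base.
    profile[k] is the fraction of chains that stop at site k + 1.
    """
    if not 0.0 <= dd_fraction <= 1.0:
        raise ValueError("dd_fraction must be between 0 and 1")
    surviving = 1.0
    profile = []
    for _ in range(n_sites):
        profile.append(surviving * dd_fraction)
        surviving *= 1.0 - dd_fraction
    return profile, surviving


for dd_fraction in (0.5, 0.01, 0.0001):
    profile, unterminated = termination_profile(dd_fraction, n_sites=100)
    early = sum(profile[:10])
    print(f"ddNTP fraction {dd_fraction}: "
          f"{early:.1%} stop within 10 sites, "
          f"{unterminated:.1%} never terminate")
```

With a high ddNTP fraction almost every chain stops within the first few sites, and with a very low fraction almost none terminate, which mirrors the two failure cases described above.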
Dideoxy (Sanger) Sequencing
 With the addition of DNA polymerase enzyme to the four
tubes, the reaction begins.
 After about 20 minutes, the reactions are terminated by
addition of a stop buffer.
 The stop buffer consists of 20 mM EDTA to chelate cations
and stop enzyme activity, formamide to denature the
products of the synthesis reaction, and gel loading dyes
(bromophenol blue and/or xylene cyanol).
 It is important that all four reactions be carried out for equal
time.
 Maintaining equal reaction times will provide consistent
band intensities in all four lanes of the gel sequence, which
facilitates final reading of the sequence
Dideoxy (Sanger) Sequencing
 The sets of synthesized fragments are then loaded onto a denaturing
polyacrylamide gel
 The products of each of the sequencing reactions are loaded into
four adjacent lanes, labeled A, C, G, or T, corresponding to the
ddNTP in the four reaction tubes.
 Once the gel is dried and exposed to x-ray film, the fragment
patterns can be visualized from the signal on the 32P-labeled
primer or nucleotide.
 All fragments from a given tube will end in the same ddNTP; for
example, all the fragments synthesized in the ddCTP tube end in C.
 The four-lane gel electrophoresis pattern of the products of the four
sequencing reactions is called a sequencing ladder
Dideoxy (Sanger) Sequencing
 The ladder is read to deduce the DNA sequence.
 From the bottom of the gel, the smallest (fastest migrating)
fragment is the one in which synthesis terminated closest to the
primer.
 The identity of the ddNTP at a particular position is determined by
the lane in which the band appears.
 If the smallest band is in the ddATP lane, then the first base is an A.
The next larger fragment is the one that was terminated at the next
position on the template.
 The lane that has the next larger band identifies the next nucleotide
in the sequence. In the figure, the next largest band is found in the
ddGTP lane, so the next base is a G.
 The sequence is thus read from the bottom (smallest, 5′-most) to the
top (largest, 3′-most) fragments across or within lanes to
determine the identity and order of nucleotides in the sequence.
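Reading a ladder can be sketched in a few lines of Python. The function name read_ladder and the example band positions are hypothetical. Each lane is described by the fragment lengths of its bands, and the sequence is read by walking from the smallest fragment to the largest and recording the lane of each band.

```python
def read_ladder(lanes):
    """Read a four-lane sequencing ladder from bottom to top.

    `lanes` maps "A", "C", "G" and "T" (the ddNTP used in each lane)
    to the fragment lengths of the bands seen in that lane.
    """
    bands = [(length, base)
             for base, lengths in lanes.items()
             for length in lengths]
    bands.sort()  # smallest (fastest-migrating) fragment first
    return "".join(base for _, base in bands)


# Smallest band in the ddATP lane, next in the ddGTP lane, and so on.
ladder = {"A": [1, 6], "C": [3], "G": [2, 7], "T": [4, 5]}
print(read_ladder(ladder))  # AGCTTAG
```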
Dideoxy (Sanger) Sequencing
 Depending on the reagents and gel used, the number of
bases per sequence read averages 300–400.
 Advances in enzyme and gel technology have increased
this capability to over 500 bases per read.
 Sequencing reads can also be lengthened by loading the
same ladders in intervals of 2–6 hours so that the larger
bands are resolved with longer (e.g., 8-hour) migrations,
whereas smaller bands will be resolved simultaneously in
a 1–2-hour migration that was loaded 6–7 hours later.
 Sequencing technology has been improved significantly
from the first routine manual sequencing procedures.
Dideoxy (Sanger) Sequencing
 Recombinant polymerase enzymes, such as Sequenase, and the heat
stable enzymes Thermosequenase and Therminator are now
available; in vitro removal of the exonuclease activity of these
enzymes makes them faster and more processive (i.e., they stay
with the template longer, producing longer sequencing ladders).
 In addition, these engineered enzymes more efficiently incorporate
ddNTPs and nucleotide analogs such as dITP (deoxyinosine
triphosphate) or 7-deaza-dGTP, which are used to deter secondary
structure (internal folding and hybridization) in the template and
sequencing products.
 Furthermore, most sequencing methods in current use are
performed with double-stranded templates, eliminating the tedious
preparation of single-stranded versions of the DNA to be sequenced.
Dideoxy (Sanger) Sequencing
 Using the heat-stable enzymes such as Therminator and
Thermosequenase, the sequencing reaction can be performed in
a thermal cycler (cycle sequencing).
 With cycle sequencing, timed manual starting and stopping of
the sequencing reactions are not necessary.
 The labor savings in this regard increase the number of
reactions that can be performed simultaneously; for example, a
single operator can set up 96 sequencing reactions (i.e.,
sequence 24 fragments) in a 96-well plate.
 Finally, improvements in fluorescent dye technology have led
to the automation of the sequencing process and, more
importantly, sequence determination.
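The 96-reaction figure follows from four ddNTP reactions per fragment (24 x 4 = 96). A minimal Python sketch of one possible plate layout is shown here. The function name plate_layout, the fragment names, and the row-by-row well ordering are assumptions made for illustration; laboratories arrange plates in different ways.

```python
from string import ascii_uppercase


def plate_layout(fragment_names):
    """Assign four wells (A, C, G, T reactions) per fragment.

    Wells are filled row by row on an 8 x 12 plate (A1..A12, B1..).
    """
    if len(fragment_names) > 24:
        raise ValueError("a 96-well plate holds at most 24 four-tube sets")
    layout = {}
    well_index = 0
    for name in fragment_names:
        for dd_base in "ACGT":
            row = ascii_uppercase[well_index // 12]
            column = well_index % 12 + 1
            layout[f"{row}{column}"] = (name, f"dd{dd_base}TP")
            well_index += 1
    return layout


layout = plate_layout([f"fragment_{i}" for i in range(1, 25)])
print(len(layout), layout["A1"], layout["H12"])
```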
Approaches to Automated Sequencing
 There are two approaches to automated fluorescent sequencing: dye
primer and dye terminator sequencing
 The goal of both approaches is the same: to label the fragments
synthesized during the sequencing reaction according to their
terminal ddNTP.
 Thus, fragments ending in ddATP, read as A in the sequence, will be
labeled with a “green” dye; fragments ending in ddCTP, read as C in
the sequence, will be labeled with a “blue” dye; fragments ending in
ddGTP, read as G in the sequence, will be labeled with a “black” or
“yellow” dye; and fragments ending in ddTTP, read as T in the
sequence, will be labeled with a “red” dye.
 This facilitates reading of the sequence by the automated sequencer.
Approaches to Automated Sequencing
 In dye primer sequencing, the four different fluorescent dyes are
attached to four separate aliquots of the primer.
 The dye molecules are attached covalently to the 5′ end of the primer
during chemical synthesis, resulting in four versions of the same
primer with different dye labels.
 The primer labeled with each “color” is added to four separate
reaction tubes, one each with ddATP, ddCTP, ddGTP, or ddTTP.
 After addition of the rest of the components of the sequencing
reaction (see the section above on manual sequencing) and of a
heat-stable polymerase, the reaction is subjected to cycle sequencing in a
thermal cycler.
 The products of the sequencing reaction are then labeled at the 5′ end,
with the dye color corresponding to the ddNTP at the 3′ end of the fragment.
Approaches to Automated Sequencing
 Dye terminator sequencing is performed with one of the
four fluorescent dyes attached to each of the ddNTPs
instead of to the primer.
 The primer is unlabeled.
 A major advantage of this approach is that all four
sequencing reactions are performed in the same tube (or
well of a plate) instead of in four separate tubes.
 After addition of the rest of the reaction components and
cycle sequencing, the product fragments are labeled at the
3′ end.
 As with dye primer sequencing, the “color” of the dye
corresponds to the ddNTP that terminated the strand
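Because the dye color identifies the terminating ddNTP, base calling reduces to ordering the fragments by size and translating colors to bases. The Python below is a sketch only: the color assignments follow the scheme described for automated sequencing in this lecture, while the function name call_bases and the example peaks are hypothetical. Real instruments also have to handle overlapping dye spectra and uneven peak spacing, which this ignores.

```python
# Color scheme as described in the lecture; G may appear black or yellow.
DYE_TO_BASE = {
    "green": "A",
    "blue": "C",
    "black": "G",
    "yellow": "G",
    "red": "T",
}


def call_bases(peaks):
    """Translate (fragment_length, dye_color) peaks into a sequence.

    Fragments are read from the shortest to the longest, as on a gel
    or capillary where smaller fragments reach the detector first.
    """
    ordered = sorted(peaks)
    return "".join(DYE_TO_BASE[color] for _, color in ordered)


peaks = [(3, "blue"), (1, "green"), (2, "black"), (4, "red"), (5, "red")]
print(call_bases(peaks))  # AGCTT
```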
Pyrosequencing
 Chain termination sequencing is the most widely used method
to determine DNA sequence.
 Other methods have been developed that yield the same
information but not with the throughput capacity of the chain
termination method.
 Pyrosequencing is an example of a method designed to
determine a DNA sequence without having to make a
sequencing ladder.
 This procedure relies on the generation of light (luminescence)
when nucleotides are added to a growing strand of DNA.
 With this system, there are no gels, fluorescent dyes, or
ddNTPs.
Pyrosequencing
 The pyrosequencing reaction mix consists of a single stranded DNA
template, sequencing primer, DNA polymerase, sulfurylase, and luciferase, plus the
two substrates adenosine 5′ phosphosulfate (APS) and luciferin.
 Sequentially, one of the four dNTPs is added to the reaction. If the
nucleotide is complementary to the base in the template strand next
to the 3′ end of the primer, DNA polymerase extends the primer.
 Pyrophosphate (PPi) is released with the formation of the
phosphodiester bond between the dNTP and the primer.
 The PPi is converted to ATP by sulfurylase in the presence of APS.
The ATP is used to generate a luminescent signal by luciferase-
catalyzed conversion of luciferin to oxyluciferin
Pyrosequencing
 The process is repeated with each of the four nucleotides again added
sequentially to the reaction.
 The generation of a signal indicates which nucleotide is the next correct
base in the sequence.
 Results from a pyrosequencing reaction consist of single peaks of
luminescence associated with the addition of the complementary
nucleotide.
 If a sequence contains a repeated nucleotide, for instance, GTTAC, the
results would be: dG peak, dT peak (double the height of the dG peak),
dA peak, dC peak.
 Pyrosequencing is most useful for short- to moderate-length sequence analysis.
 It is therefore used mostly for mutation or single nucleotide polymorphism
(SNP) detection and typing rather than for generating new sequences.
 It has been used for applications in infectious disease typing and HLA
typing
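The pattern of peaks described above, including the double-height peak for a repeated nucleotide, can be reproduced with a small simulation. This is a sketch under stated assumptions: the function name pyrogram, the dispensation order, and the max_cycles guard are hypothetical, the input is the strand being synthesized (the complement of the template), and peak height is assumed to be exactly proportional to the number of nucleotides incorporated.

```python
def pyrogram(new_strand, dispensation_order="GTAC", max_cycles=50):
    """Simulate pyrosequencing peaks as (nucleotide, height) pairs.

    Each dNTP is dispensed in turn. If it matches the next base or
    bases of the strand being synthesized, the peak height equals the
    number incorporated; otherwise the height is 0 (no light).
    """
    peaks = []
    position = 0
    cycles = 0
    while position < len(new_strand) and cycles < max_cycles:
        for nucleotide in dispensation_order:
            run = 0
            while (position < len(new_strand)
                   and new_strand[position] == nucleotide):
                run += 1
                position += 1
            peaks.append((f"d{nucleotide}", run))
            if position == len(new_strand):
                break
        cycles += 1
    return peaks


print(pyrogram("GTTAC"))
# [('dG', 1), ('dT', 2), ('dA', 1), ('dC', 1)]
```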
Bisulfite DNA Sequencing
 Bisulfite DNA sequencing, or methylation-specific sequencing,
is a modification of chain termination sequencing designed to
detect methylated nucleotides.
 Methylation of cytosine residues in DNA is an important part of
regulation of gene expression and chromatin structure.
 Methylated DNA is also involved in cell differentiation and is
implicated in a number of diseases, including several types of
cancer.
 For bisulfite sequencing, 2–4 μg of genomic DNA is cut with
restriction enzymes to facilitate denaturation.
 The enzymes should not cut within the region to be sequenced.
 The restriction digestion products are resolved on an agarose gel,
and the fragments of the size of interest are purified from the gel
Bisulfite DNA Sequencing
 The purified fragments are denatured with heat (97°C for 5
minutes) and exposed to bisulfite solution (sodium bisulfite,
NaOH and hydroquinone) for 16–20 hours.
 During this incubation, the cytosines in the reaction are
deaminated, converting them to uracils, whereas the 5-methyl
cytosines are unchanged.
 After the reaction, the treated template is cleaned, precipitated,
and resuspended for use as a template for PCR amplification.
 The PCR amplicons are then sequenced in a standard chain
termination method.
 Methylation is detected by comparing the treated sequence with an
untreated sequence and noting where in the treated sequence C/G
base pairs are not changed to U/G; that is, the sequence will be
altered relative to controls at the unmethylated C residues
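The comparison of treated and untreated sequence can be sketched in Python. The function names bisulfite_convert and call_methylation and the example sequence are hypothetical. One step is added that the slide does not spell out: during PCR, uracil is copied as thymine, so unmethylated cytosines appear as T in the final read.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Convert unmethylated C to U; methylated C is left unchanged.

    `methylated_positions` holds 0-based indices of 5-methylcytosines.
    """
    return "".join(
        "U" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(sequence)
    )


def call_methylation(untreated, treated_read):
    """Classify each C of the untreated sequence by the treated read.

    A C that is still read as C was protected (methylated); a C that
    now reads as T was converted (unmethylated).
    """
    methylated, unmethylated = [], []
    for index, (ref, obs) in enumerate(zip(untreated, treated_read)):
        if ref == "C":
            (methylated if obs == "C" else unmethylated).append(index)
    return methylated, unmethylated


untreated = "ACGTCCGA"
converted = bisulfite_convert(untreated, methylated_positions={1})
read = converted.replace("U", "T")  # PCR copies U as T
print(converted, read, call_methylation(untreated, read))
```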
Bisulfite DNA Sequencing
 Nonsequencing detection methods have also been devised to
detect DNA methylation, such as using restriction enzymes to
detect restriction sites generated or destroyed by the C>U changes.
 Other methods use PCR primers that will bind only to the
converted or nonconverted sequences so that the presence or
absence of PCR product indicates the methylation status.
 These methods, however, are not always applicable to detection of
methylation in unexplored sequences.
 As the role of methylation and epigenetics in human disease is
increasingly recognized, bisulfite sequencing has become a
popular method in the research laboratory.
 To date, however, this method has had limited use in clinical
analysis
Bioinformatics
 Information technology has had to encompass the vast amount
of data arising from the growing numbers of sequence discovery
methods, especially direct sequencing and array technology.
 This deluge of information requires careful storage,
organization, and indexing of large amounts of data.
 Bioinformatics is the merger of biology with information
technology.
 Part of the practice in this field is biological analysis in silico;
that is, by computer rather than in the laboratory.
 Bioinformatics dedicated specifically to handling sequence
information is sometimes termed computational biology
The Human Genome Project
 The first complete genome sequence of a clinically important
organism was that of Epstein-Barr virus published in 1984.
 The 170,000–base pair sequence was determined using the M13
template preparation/chain termination manual sequencing method.
 In 1985 and 1986 the possibility of mapping or sequencing the
human genome was discussed at meetings at the University of
California, Santa Cruz; Cold Spring Harbor, New York; and at the
Department of Energy in Santa Fe, New Mexico.
 The idea was controversial because the two to five billion dollar cost
of the project might not justify the information gained, most of
which would be sequences of “junk,” or non–gene-coding DNA.
 Furthermore, there was no available technology up to the massive
task
