Automated sequencing methods refer to techniques that streamline and accelerate the process of
determining the order of nucleotides (A, T, C, G) in a DNA (or RNA) molecule. These methods
have revolutionised genetics and molecular biology by enabling rapid and high-throughput
sequencing of genomes, leading to significant advancements in research, diagnostics, and
biotechnology.
Principles of Automated Sequencing
The foundational principle for many automated sequencing methods, particularly early ones, is
based on the Sanger dideoxy chain termination method, but with key modifications for
automation:
• Fluorescent Dye-Labelled Dideoxynucleotides (ddNTPs): Instead of radioactive labels,
automated Sanger sequencing uses fluorescent dyes to tag the dideoxynucleotides. Each of
the four ddNTPs (ddATP, ddTTP, ddCTP, ddGTP) is labelled with a different coloured
fluorescent dye. This allows all four termination reactions to be performed in a single tube.
• Chain Termination: As in traditional Sanger sequencing, ddNTPs lack a 3'-hydroxyl group, so
DNA polymerase cannot extend the strand any further once a ddNTP is incorporated. This leads to
the generation of DNA fragments of varying lengths, each terminated by a specific
fluorescently labelled ddNTP.
• Capillary Electrophoresis: The resulting DNA fragments are separated by size using high-
resolution capillary electrophoresis. As the fragments pass through a laser beam at the end of
the capillary, the fluorescent dye on each fragment is excited and emits light of a specific
wavelength.
• Automated Detection and Analysis: A detector reads the colours of the emitted light as the
fragments pass, identifying the terminal base of each fragment. This information is then sent
to a computer, which reconstructs the DNA sequence based on the order of the colours.
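The four steps above can be sketched as a toy simulation. This is a minimal illustration, not how any instrument's software works: the dye colours, the ddNTP fraction, the molecule count, and the function names are all assumptions made for the example, and the input is the newly synthesised strand rather than the template.

```python
import random

# Hypothetical dye assignment; real instruments use vendor-specific dye sets.
DYE_FOR_BASE = {"A": "green", "T": "red", "C": "blue", "G": "yellow"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def simulate_termination(new_strand, n_molecules=2000, ddntp_fraction=0.2, seed=0):
    """Return (length, dye) pairs for fragments ended by a labelled ddNTP."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            # At each step the polymerase occasionally picks a ddNTP instead of a dNTP.
            if rng.random() < ddntp_fraction:
                fragments.append((position, DYE_FOR_BASE[base]))
                break  # no 3'-hydroxyl group, so extension stops here
    return fragments


def call_bases(fragments):
    """Mimic electrophoresis plus detection: shortest fragments reach the laser first."""
    dye_at_length = {}
    for length, dye in fragments:
        dye_at_length[length] = dye
    # Reading the dye colours in order of fragment length gives the base order.
    return "".join(BASE_FOR_DYE[dye_at_length[n]] for n in sorted(dye_at_length))


synthesised = "ATGCCGTAAGCT"
reads = simulate_termination(synthesised)
print(call_bases(reads))
```

Because every fragment of a given length ends in the same base, sorting by length and reading the colours is enough to recover the sequence. Real traces also need peak detection and quality scoring, which this sketch leaves out.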
Types of Automated Sequencing Technologies
While automated Sanger sequencing was the first major step in automation, subsequent "Next-
Generation Sequencing" (NGS) and "Third-Generation Sequencing" (TGS) technologies have
brought about even higher throughput and new approaches:
1. First-Generation (Automated Sanger Sequencing):
• Principle: Chain termination using fluorescently labelled ddNTPs, followed by capillary
electrophoresis and laser detection.
• Characteristics: Produces relatively long reads (up to 800-1000 base pairs). Still considered
a "gold standard" for validating specific sequences, but slower and more expensive than NGS
for large-scale projects.
• Applications: SNP identification, gene cloning verification, and sequencing small DNA
fragments.
2. Second-Generation (Next-Generation Sequencing - NGS / Massively Parallel Sequencing):
• Principle: These technologies enable the simultaneous sequencing of millions to billions of
DNA fragments in parallel. They often involve:
◦ Library preparation: Fragmenting DNA and ligating adaptors.
◦ Clonal amplification: Creating millions of identical copies of each fragment on a
solid surface (e.g., flow cell).
◦ Sequencing by synthesis: Detecting the incorporation of nucleotides as a new strand
is synthesised. This can involve detecting light signals (e.g., Illumina) or pH changes
(e.g., Ion Torrent).
• Characteristics: High-throughput, significantly lower cost per base than Sanger, and much
faster. Produces shorter reads compared to Sanger.
• Examples:
◦ Illumina Sequencing: The most widely used NGS technology. Relies on reversible
terminator chemistry and fluorescently labelled nucleotides.
◦ Ion Torrent Sequencing: Detects the pH change caused by hydrogen ions released during
nucleotide incorporation.
◦ Pyrosequencing: Detects the release of pyrophosphate upon nucleotide
incorporation through an enzyme cascade that produces light.
• Applications: Whole-genome sequencing (WGS), whole-exome sequencing (WES), RNA
sequencing (RNA-Seq), ChIP-seq, metagenomics, population genetics, and clinical
diagnostics.
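For the flow-based chemistries in this group (Ion Torrent and pyrosequencing), one nucleotide type is supplied at a time, and the strength of the signal (pH change or light) grows with the number of bases incorporated in that flow. The sketch below illustrates that idea only. The cyclic flow order "TACG", the idealised integer signals, and the function names are assumptions for the example. Illumina's reversible terminator chemistry works differently, adding one base per cycle.

```python
def simulate_flows(new_strand, flow_order="TACG", max_flows=400):
    """Flow one nucleotide type at a time; record how many bases are incorporated."""
    signals = []
    position = 0
    for flow in range(max_flows):
        if position >= len(new_strand):
            break
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A homopolymer run is extended within a single flow, giving a stronger signal.
        while position < len(new_strand) and new_strand[position] == nucleotide:
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


def decode_flows(signals):
    """Rebuild the sequence from (nucleotide, signal strength) pairs."""
    return "".join(nucleotide * count for nucleotide, count in signals)


flowgram = simulate_flows("TTACGGGA")
print(flowgram)                # a flow with signal 0 means nothing was incorporated
print(decode_flows(flowgram))
```

In practice the signals are noisy real numbers rather than clean integers, so telling a run of, say, six identical bases from seven is harder than this idealised version suggests.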
3. Third-Generation Sequencing (TGS / Long-Read Sequencing):
• Principle: These methods aim to sequence single DNA molecules without the need for
extensive amplification, allowing for much longer reads.
• Characteristics: Very long reads (tens to hundreds of kilobases), ability to detect DNA
modifications (like methylation) directly, can resolve complex genomic regions (e.g.,
repetitive sequences). Higher error rates than NGS, but rapidly improving.
• Examples:
◦ PacBio SMRT (Single Molecule Real-Time) Sequencing: Uses zero-mode
waveguides (ZMWs) to detect fluorescently tagged nucleotides as a DNA
polymerase incorporates them.
◦ Oxford Nanopore Sequencing: Measures changes in electrical current as a single
DNA strand passes through a protein nanopore.
• Applications: Resolving structural variations, de novo genome assembly, direct RNA
sequencing, and epigenetic analysis.
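One way to see why long reads help with repetitive regions is to ask how many possible read placements cover an entire repeat together with some unique sequence on both sides. The sketch below is a simple counting exercise under stated assumptions: the genome length, repeat coordinates, read lengths, and function names are hypothetical, reads are treated as error-free, and every start position is assumed equally likely.

```python
def spans_repeat(read_start, read_length, repeat_start, repeat_end, flank=1):
    """True if the read covers the whole repeat plus `flank` unique bases on each side."""
    read_end = read_start + read_length  # half-open interval [read_start, read_end)
    return read_start <= repeat_start - flank and read_end >= repeat_end + flank


def fraction_spanning(genome_length, read_length, repeat_start, repeat_end, flank=1):
    """Fraction of all possible read placements that span the repeat."""
    placements = genome_length - read_length + 1
    if placements <= 0:
        return 0.0
    spanning = sum(
        spans_repeat(start, read_length, repeat_start, repeat_end, flank)
        for start in range(placements)
    )
    return spanning / placements


# Hypothetical 100 kb region containing a 5 kb repeat at positions 40,000-45,000.
for length in (150, 1_000, 20_000):
    share = fraction_spanning(100_000, length, 40_000, 45_000)
    print(f"read length {length:>6}: {share:.4f} of placements span the repeat")
```

A read shorter than the repeat can never anchor it to unique sequence on both sides, so the fraction is zero no matter how many such reads are produced. Reads longer than the repeat can do so, which is what makes the region resolvable.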
Advantages of Automated Sequencing
• Increased Throughput: Can process a vast number of samples and generate massive
amounts of sequence data in a short time.
• Speed: Significantly faster than manual methods, allowing for quicker research and
diagnostic turnaround.
• Accuracy and Reliability: Automation reduces human error and provides more consistent
and reproducible results.
• Reduced Labour: Minimises hands-on time for researchers, freeing them for other tasks.
• Cost-Effectiveness (per base): While the initial investment in automated sequencers can be
high, the cost per base sequenced is dramatically lower compared to manual methods,
especially for large-scale projects.
• Safety: Eliminates the need for hazardous radioactive isotopes used in early manual
sequencing.
• Data Management: Integrated computer systems automatically collect, process, and store
sequencing data, facilitating analysis.
Disadvantages of Automated Sequencing
• High Initial Cost: The purchase and setup of automated sequencing equipment can be very
expensive.
• Technical Expertise: While user-friendly interfaces exist, operating and troubleshooting
these complex machines, as well as analysing the large datasets they produce, still require
specialised training and bioinformatics skills.
• Reagent Costs: Consumables and reagents for automated sequencing can be costly.
• Data Storage and Analysis: The immense volume of data generated by NGS and TGS
requires significant storage capacity and powerful computational resources for analysis.
• Bioinformatics Challenges: Interpreting and extracting meaningful biological insights from
large sequencing datasets can be complex and requires specialised bioinformatics pipelines
and expertise.
• Dependence on Technology: Labs become reliant on the specific instruments and their
associated protocols.
Overall, automated sequencing methods have been pivotal in advancing our understanding of
biology, health, and disease, making large-scale genomic projects feasible and expanding the
applications of DNA sequencing across various scientific and medical fields.