Jumping Library
Chromosome jumping (or chromosome hopping) was first described in 1984 by Collins and Weissman.[1]
At the time, cloning techniques allowed for generation of clones of limited size (up to 240 kb), and
cytogenetic techniques allowed for mapping such clones to a small region of a particular chromosome to a
resolution of around 5–10 Mb. Therefore, a major gap remained in resolution between available
technologies, and no methods were available for mapping larger areas of the genome.[1]
The original technique of chromosome jumping was developed in the laboratories of Collins and Weissman
at Yale University in New Haven, U.S.[1] and the laboratories of Poustka and Lehrach at the European
Molecular Biology Laboratory in Heidelberg, Germany.[2]
Collins and Weissman's method[1] encountered some early limitations. The main concern
was with avoiding non-circularized fragments. Two solutions were suggested: either screening junction
fragments with a given probe or adding a second size-selection step after the ligation to separate single
circular clones (monomers) from clones ligated to each other (multimers). The authors also suggested that
other markers such as the λ cos site or antibiotic resistance genes should be considered (instead of the
amber suppressor tRNA gene) to facilitate selection of junction clones.
Poustka and Lehrach[2] suggested that complete digestion with rare-cutting restriction enzymes (such as
NotI) should be used for the first step of library construction instead of partial digestion with a
frequently cutting restriction enzyme. This would significantly reduce the number of clones, from millions
to thousands. However, it could create problems with circularizing the DNA fragments, since these
fragments would be very long, and it would sacrifice the flexibility in choice of end points that partial
digests provide. One suggestion for overcoming these problems was to combine the two methods, i.e. to
construct a jumping library from DNA fragments digested partially with a frequently cutting restriction
enzyme and completely with a rare-cutting restriction enzyme, and to circularize them into plasmids cleaved
with both enzymes. Several of these "combination" libraries were completed in 1986.[2][3]
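The drop from millions to thousands of clones can be related to how often each recognition site is expected to occur. The Python sketch below is only an order-of-magnitude illustration: it assumes a uniformly random base composition (which real genomes violate, for example through CpG depletion) and complete digestion, and the function names, the 3 Gb genome size, and the toy sequence are assumptions chosen for illustration rather than values from the cited papers.

```python
def expected_fragments(genome_size_bp: int, site_length_bp: int) -> float:
    """Expected fragment count after complete digestion.

    Assumes every base is equally likely and independent, so a site of
    length k occurs on average once every 4**k bases (a simplification).
    """
    return genome_size_bp / 4 ** site_length_bp


def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths from complete digestion of a linear sequence.

    cut_offset is the cut position relative to the start of the site.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


GENOME = 3_000_000_000  # assumed round figure for a mammalian genome

print(round(expected_fragments(GENOME, 8)))  # 8-bp site, e.g. NotI (GCGGCCGC)
print(round(expected_fragments(GENOME, 4)))  # 4-bp site of a frequent cutter

# Toy sequence with two NotI sites; NotI cuts after the second base (GC^GGCCGC).
toy = "AAAA" + "GCGGCCGC" + "TTTTTT" + "GCGGCCGC" + "CC"
print(digest(toy, "GCGGCCGC", 2))
```

Under these assumptions an 8-bp site yields tens of thousands of fragments and a 4-bp site yields about ten million, which is consistent in order of magnitude with the reduction described above.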
In 1991, Zabarovsky et al.[4] proposed a new approach for construction of jumping libraries. This approach
included the use of two separate λ vectors for library construction, and a partial filling-in reaction that
removes the need for a selectable marker. This filling-in reaction worked by destroying the specific
cohesive ends (resulting from restriction digests) of the DNA fragments that were nonligated and
noncircularized, thus preventing them from cloning into the vectors, in a more energy-efficient and accurate
manner. Furthermore, this improved technique required less DNA to start with, and also produced a library
that could be transferred into a plasmid form, making it easier to store and replicate. Using this new
approach, they successfully constructed a human NotI jumping library from a lymphoblastoid cell line and
a human chromosome 3-specific NotI jumping library from a human chromosome 3 and mouse hybrid cell
line.[4]
Current method
Second-generation or "next-generation" sequencing (NGS) techniques have evolved radically: sequencing
capacity has increased more than ten-thousandfold and the cost has dropped by over one million-fold since
2007 (National Human Genome Research Institute). NGS has revolutionized the genetic field in many
ways.
Library construction
A library is often prepared by random fragmentation of DNA and ligation of common adaptor
sequences.[5][6] However, the resulting short reads make it difficult to identify structural variants, such
as indels, translocations, and duplications. Large regions of simple repeats can further complicate the
alignment.[7] Alternatively, a jumping library can be used with NGS for the mapping of structural variation
and scaffolding of de novo assemblies.[8]
Short-jump library
There are two issues related to short-jump libraries. First, a read can pass through the biotinylated
circularization junction and reduce the effective read length. Second, reads from non-jumped fragments
(i.e. fragments without the circularization junction) are sequenced and reduce genomic coverage. It has
been reported that non-jumped fragments range from 4% to 13%, depending on the size of selection. The
first problem might be solved by shearing circles into a larger size and selecting for those larger
fragments. The second problem can be addressed by using a custom barcoded jumping library.[9][10]
Figure caption: This figure is a schematic representation of one of the most recently used methods for
creating jumping libraries.
The custom barcoded jumping library uses adaptors containing markers for fragment selection in combination
with barcodes for multiplexing. The protocol was developed by Talkowski et al.[9] and is based on mate-pair
library preparation for SOLiD sequencing. The selected DNA fragment size is 3.5–4.5 kb. Two adaptors are
involved: one containing an EcoP15I recognition site and an AC overhang; the other containing a GT
overhang, a biotinylated thymine, and an oligo barcode. The circularized DNA is digested and the
fragments with biotinylated adaptors are selected for (see Figure 3). The EcoP15I recognition site and
barcode help to distinguish junction fragments from non-jump fragments. These targeted fragments should
contain 25 to 27 bp of genomic DNA, the EcoP15I recognition site, the overhang, and the barcode.[9]
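The expected composition of a junction fragment suggests a simple way to separate junction reads from non-jump reads in software. The following Python sketch is illustrative only: the order of the elements within the read (genomic tag, then recognition site, then overhang, then barcode), the 6-nt barcode, and the function name are assumptions made here and are not taken from the protocol in [9]. The EcoP15I recognition sequence is CAGCAG.

```python
import re

ECOP15I_SITE = "CAGCAG"  # EcoP15I recognition sequence


def is_junction_fragment(read: str, barcode: str) -> bool:
    """Return True if the read matches the ASSUMED layout:
    25-27 nt genomic tag + EcoP15I site + AC/GT overhang + barcode.
    """
    pattern = rf"^[ACGT]{{25,27}}{ECOP15I_SITE}(?:AC|GT){re.escape(barcode)}"
    return re.match(pattern, read) is not None


BARCODE = "TTGACC"  # hypothetical barcode, for illustration only

# A read with a 26-nt genomic tag followed by site, overhang, and barcode.
junction_read = "ACGT" * 6 + "AC" + ECOP15I_SITE + "GT" + BARCODE
# A read of plain genomic sequence with neither site nor barcode.
non_jump_read = "ACGT" * 10

print(is_junction_fragment(junction_read, BARCODE))
print(is_junction_fragment(non_jump_read, BARCODE))
```

A production pipeline would also need to tolerate sequencing errors and handle genomic tags that happen to contain the recognition sequence, which this sketch ignores.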
Long-jump library
This library construction process is similar to that of the short-jump library except that the conditions
are optimized for longer fragments (5 kb).[10]
Fosmid-jump library
This library construction process is also similar to that of the short-jump library except that transfection
using the E. coli vector is required for amplification of large (40 kb) DNA fragments. In addition, the
fosmids can be modified to facilitate conversion into a jumping library compatible with certain
next-generation sequencers.[8][10]
Paired-end sequencing
The segments resulting from circularization during construction of the jumping library are cleaved, and DNA
fragments carrying the markers are enriched and subjected to paired-end sequencing. These DNA fragments
are sequenced from both ends and generate pairs of reads. The genomic distance between the reads in each
pair is approximately known and is used in the assembly process. In standard paired-end sequencing, by
contrast, a DNA fragment generated by random fragmentation is about 200 bp and a read from each end is
around 180 bp, so the two reads overlap each other. This should be distinguished from mate-pair sequencing,
which is basically a combination of next-generation sequencing with jumping libraries.
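The overlap in the 200 bp example follows from simple arithmetic: two reads of length r taken from opposite ends of a fragment of length f share 2r - f bases whenever that value is positive. A minimal Python sketch, with a function name chosen here for illustration:

```python
def read_overlap(fragment_len: int, read_len: int) -> int:
    """Number of bases covered by both reads of a pair sequenced
    inward from opposite ends of a single fragment."""
    effective = min(read_len, fragment_len)  # a read cannot exceed the fragment
    return max(0, 2 * effective - fragment_len)


print(read_overlap(200, 180))   # 160
print(read_overlap(4000, 180))  # 0
```

The second call uses a hypothetical 4,000 bp separation to show that reads of the same length cannot overlap once their outer ends lie kilobases apart, as is the case for pairs derived from a jumping library.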
Computational analysis
Different computational tools have been developed to handle jumping library data. One example is DELLY.
DELLY was developed to discover genomic structural variants and "integrates short insert paired-ends,
long-range mate-pairs and split-read alignments" to detect rearrangements at sequence level.[11]
An example of joint development of new experimental design and algorithm development is demonstrated
by the ALLPATHS-LG assembler.[12]
Confirmation
When used for detection of genetic and genomic changes, jumping clones require validation by Sanger
sequencing.
Applications
Early applications
In the early days, chromosome walking from genetically linked DNA markers was used to identify and
clone disease genes. However, the large molecular distance between known markers and the gene of
interest complicated the cloning process. In 1987, a human chromosome jumping library was
constructed to clone the cystic fibrosis gene. Cystic fibrosis is an autosomal recessive disease affecting 1 in
2000 Caucasians. This was the first disease in which the usefulness of jumping libraries was
demonstrated. The met oncogene was a marker tightly linked to the cystic fibrosis gene on human
chromosome 7, and the library was screened for a jumping clone starting at this marker. The cystic fibrosis
gene was determined to lie 240 kb downstream of the met gene. Chromosome jumping helped reduce the
number of mapping "steps" and bypass the highly repetitive regions in the mammalian genome.[13] Chromosome
jumping also allowed the production of probes required for faster diagnosis of this and other diseases.[1]
New applications
A combined jumping library and NGS approach can be applied to identify genomic changes such as balanced
translocations. For example,
Slade et al. applied this method to fine map a de novo balanced translocation in a child with Wilms'
tumor.[15] For this study, 50 million reads were generated, but only 11.6% of these could be mapped
uniquely to the reference genome, which represents approximately a sixfold coverage.
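The relationship between mapped reads and coverage can be illustrated with a short calculation. In the Python sketch below, only the 50 million reads and the 11.6% figure come from the text; the genome size, the span per pair, and the reading of "coverage" as physical (span) coverage are assumptions made here for illustration and are not stated in the source.

```python
def physical_coverage(n_pairs: float, span_bp: float, genome_size_bp: float) -> float:
    """Average number of read-pair spans covering each base of the genome."""
    return n_pairs * span_bp / genome_size_bp


TOTAL_READS = 50_000_000   # from the text
UNIQUE_FRACTION = 0.116    # from the text

uniquely_mapped = TOTAL_READS * UNIQUE_FRACTION
print(round(uniquely_mapped))  # 5800000

# ASSUMED values, not from the source: a 3.1 Gb genome and a 3.2 kb span per pair.
ASSUMED_GENOME_BP = 3_100_000_000
ASSUMED_SPAN_BP = 3_200

print(round(physical_coverage(uniquely_mapped, ASSUMED_SPAN_BP, ASSUMED_GENOME_BP), 1))
```

With these assumed values the result is close to sixfold, which shows how a few million uniquely mapped pairs could plausibly yield coverage of that order; the actual parameters of the study may differ.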
Talkowski et al.[9] compared different approaches to detect balanced chromosome alterations, and showed
that a modified jumping library in combination with next-generation DNA sequencing is an accurate method
for mapping chromosomal breakpoints. Two varieties of jumping libraries (short-jump libraries and custom
barcoded jumping libraries) were tested and compared to standard sequencing libraries. For standard NGS,
200–500 bp fragments are generated. About 0.03%–0.54% of fragments represent chimeric pairs, which are
pairs of end-reads that are mapped to two different chromosomes. Therefore, very few fragments cover the
breakpoint area. When using short-jump libraries with fragments of 3.2–3.8 kb, the percentage of chimeric
pairs increased to 1.3%. With custom barcoded jumping libraries, the percentage of chimeric pairs further
increased to 1.49%.[9]
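The chimeric-pair percentage used in this comparison can be computed directly from the mapped positions of each pair. The Python sketch below is a minimal illustration with made-up data; representing each pair as a tuple of two chromosome names is an assumption made here, and a real analysis would read these values from alignment files and filter on mapping quality.

```python
def chimeric_fraction(pairs: list[tuple[str, str]]) -> float:
    """Fraction of read pairs whose two ends map to different chromosomes."""
    if not pairs:
        return 0.0
    chimeric = sum(1 for chrom_a, chrom_b in pairs if chrom_a != chrom_b)
    return chimeric / len(pairs)


# Hypothetical mapped pairs: (chromosome of read 1, chromosome of read 2).
example_pairs = [
    ("chr1", "chr1"),
    ("chr6", "chr15"),  # chimeric: ends map to different chromosomes
    ("chr2", "chr2"),
    ("chr7", "chr7"),
]

print(f"{chimeric_fraction(example_pairs):.0%}")  # 25%
```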
Prenatal diagnosis
Conventional cytogenetic testing cannot offer the gene-level resolution required to predict the outcome of a
pregnancy, and whole-genome deep sequencing is not practical for routine prenatal diagnosis. A whole-genome
jumping library could complement conventional prenatal testing. This novel method was
successfully applied to identify a case of CHARGE syndrome.[6]
De novo assembly
In metagenomics, regions of the genomes that are shared between strains are typically longer than the reads.
This complicates the assembly process and makes reconstructing individual genomes for a species a
daunting task.[10] Chimeric pairs that are mapped far apart in the genome can facilitate the de novo
assembly process. By using a longer-jump library, Ribeiro et al. demonstrated that the assemblies of
bacterial genomes were of high quality while reducing both cost and time.[16]
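The way widely separated read pairs help assembly can be sketched as link counting between contigs: when the two reads of a pair land on different contigs, the pair is evidence that those contigs are neighbours. The Python sketch below is a simplified illustration and not the method of any particular assembler; the contig names, the input format, and the `min_support` threshold are assumptions made here.

```python
from collections import Counter


def contig_links(pairs, min_support: int = 2) -> dict:
    """Count read pairs that connect two different contigs.

    pairs: iterable of (contig_of_read1, contig_of_read2) tuples.
    Returns the links supported by at least min_support pairs.
    """
    counts = Counter(
        tuple(sorted(pair)) for pair in pairs if pair[0] != pair[1]
    )
    return {link: n for link, n in counts.items() if n >= min_support}


# Hypothetical placements of jumping-library read pairs on contigs.
example = [
    ("contig1", "contig2"),
    ("contig2", "contig1"),  # same link, seen from the other side
    ("contig1", "contig1"),  # both reads on one contig: no linking information
    ("contig2", "contig3"),  # single observation, below the threshold
]

print(contig_links(example))
```

Requiring more than one supporting pair is a simple guard against spurious links, such as those produced by chimeric or mis-mapped pairs; real scaffolders additionally use read orientation and the expected jump distance.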
Limitation
The cost of sequencing has dropped dramatically while the cost of construction of jumping libraries has not.
Therefore, as new sequencing technologies and bioinformatic tools are developed, jumping libraries may
become redundant.
See also
Chromosome jumping
Bioinformatics
DNA sequencing
References
1. Collins, FS; Weissman, SM (November 1984). "Directional cloning of DNA fragments at a
large distance from an initial probe: a circularization method" (https://www.ncbi.nlm.nih.gov/p
mc/articles/PMC392022). Proceedings of the National Academy of Sciences of the United
States of America. 81 (21): 6812–6. Bibcode:1984PNAS...81.6812C (https://ui.adsabs.harvar
d.edu/abs/1984PNAS...81.6812C). doi:10.1073/pnas.81.21.6812 (https://doi.org/10.1073%2
Fpnas.81.21.6812). PMC 392022 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC392022).
PMID 6093122 (https://pubmed.ncbi.nlm.nih.gov/6093122).
2. Poustka, Annemarie; Lehrach, Hans (1 January 1986). "Jumping libraries and linking
libraries: the next generation of molecular tools in mammalian genetics". Trends in Genetics.
2: 174–179. doi:10.1016/0168-9525(86)90219-2 (https://doi.org/10.1016%2F0168-9525%28
86%2990219-2).
3. Poustka, A; Pohl, TM; Barlow, DP; Frischauf, AM; Lehrach, H (Jan 22–28, 1987).
"Construction and use of human chromosome jumping libraries from NotI-digested DNA".
Nature. 325 (6102): 353–5. Bibcode:1987Natur.325..353P (https://ui.adsabs.harvard.edu/ab
s/1987Natur.325..353P). doi:10.1038/325353a0 (https://doi.org/10.1038%2F325353a0).
PMID 3027567 (https://pubmed.ncbi.nlm.nih.gov/3027567). S2CID 4241410 (https://api.sem
anticscholar.org/CorpusID:4241410).
4. Zabarovsky, ER; Boldog, F; Erlandsson, R; Kashuba, VI; Allikmets, RL; Marcsek, Z;
Kisselev, LL; Stanbridge, E; Klein, G; Sumegi, J (December 1991). "New strategy for
mapping the human genome based on a novel procedure for construction of jumping
libraries". Genomics. 11 (4): 1030–9. doi:10.1016/0888-7543(91)90029-e (https://doi.org/10.
1016%2F0888-7543%2891%2990029-e). PMID 1783374 (https://pubmed.ncbi.nlm.nih.gov/
1783374).
5. Shendure, J; Ji, H (October 2008). "Next-generation DNA sequencing". Nature
Biotechnology. 26 (10): 1135–45. doi:10.1038/nbt1486 (https://doi.org/10.1038%2Fnbt1486).
PMID 18846087 (https://pubmed.ncbi.nlm.nih.gov/18846087). S2CID 6384349 (https://api.se
manticscholar.org/CorpusID:6384349).
6. Talkowski, ME; Ordulu, Z; Pillalamarri, V; Benson, CB; Blumenthal, I; Connolly, S; Hanscom,
C; Hussain, N; Pereira, S; Picker, J; Rosenfeld, JA; Shaffer, LG; Wilkins-Haug, LE; Gusella,
JF; Morton, CC (Dec 6, 2012). "Clinical diagnosis by whole-genome sequencing of a
prenatal sample" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3579222). The New
England Journal of Medicine. 367 (23): 2226–32. doi:10.1056/NEJMoa1208594 (https://doi.
org/10.1056%2FNEJMoa1208594). PMC 3579222 (https://www.ncbi.nlm.nih.gov/pmc/article
s/PMC3579222). PMID 23215558 (https://pubmed.ncbi.nlm.nih.gov/23215558).
7. Meldrum, C; Doyle, MA; Tothill, RW (November 2011). "Next-generation sequencing for
cancer diagnostics: a practical perspective" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC
3219767). The Clinical Biochemist. Reviews / Australian Association of Clinical
Biochemists. 32 (4): 177–95. PMC 3219767 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC
3219767). PMID 22147957 (https://pubmed.ncbi.nlm.nih.gov/22147957).
8. Williams, L. J. S.; Tabbaa, D. G.; Li, N.; Berlin, A. M.; Shea, T. P.; MacCallum, I.; Lawrence, M.
S.; Drier, Y.; Getz, G.; Young, S. K.; Jaffe, D. B.; Nusbaum, C.; Gnirke, A. (16 July 2012).
"Paired-end sequencing of Fosmid libraries by Illumina" (https://www.ncbi.nlm.nih.gov/pmc/a
rticles/PMC3483553). Genome Research. 22 (11): 2241–2249. doi:10.1101/gr.138925.112
(https://doi.org/10.1101%2Fgr.138925.112). PMC 3483553 (https://www.ncbi.nlm.nih.gov/pm
c/articles/PMC3483553). PMID 22800726 (https://pubmed.ncbi.nlm.nih.gov/22800726).
9. Talkowski, ME; Ernst, C; Heilbut, A; Chiang, C; Hanscom, C; Lindgren, A; Kirby, A; Liu, S;
Muddukrishna, B; Ohsumi, TK; Shen, Y; Borowsky, MZ; Daly, MJ; Morton, CC; Gusella, JF
(Apr 8, 2011). "Next-generation sequencing strategies enable routine detection of balanced
chromosome rearrangements for clinical diagnostics and genetic research" (https://www.ncb
i.nlm.nih.gov/pmc/articles/PMC3071919). American Journal of Human Genetics. 88 (4):
469–81. doi:10.1016/j.ajhg.2011.03.013 (https://doi.org/10.1016%2Fj.ajhg.2011.03.013).
PMC 3071919 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3071919). PMID 21473983
(https://pubmed.ncbi.nlm.nih.gov/21473983).
10. Nagarajan, Niranjan; Pop, Mihai (29 January 2013). "Sequence assembly demystified".
Nature Reviews Genetics. 14 (3): 157–167. doi:10.1038/nrg3367 (https://doi.org/10.1038%2
Fnrg3367). PMID 23358380 (https://pubmed.ncbi.nlm.nih.gov/23358380). S2CID 3519991
(https://api.semanticscholar.org/CorpusID:3519991).
11. Rausch, T.; Zichner, T.; Schlattl, A.; Stutz, A. M.; Benes, V.; Korbel, J. O. (7 September 2012).
"DELLY: structural variant discovery by integrated paired-end and split-read analysis" (http
s://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436805). Bioinformatics. 28 (18): i333–i339.
doi:10.1093/bioinformatics/bts378 (https://doi.org/10.1093%2Fbioinformatics%2Fbts378).
PMC 3436805 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436805). PMID 22962449
(https://pubmed.ncbi.nlm.nih.gov/22962449).
12. Gnerre, S; Maccallum, I; Przybylski, D; Ribeiro, FJ; Burton, JN; Walker, BJ; Sharpe, T; Hall,
G; Shea, TP; Sykes, S; Berlin, AM; Aird, D; Costello, M; Daza, R; Williams, L; Nicol, R;
Gnirke, A; Nusbaum, C; Lander, ES; Jaffe, DB (Jan 25, 2011). "High-quality draft assemblies
of mammalian genomes from massively parallel sequence data" (https://www.ncbi.nlm.nih.g
ov/pmc/articles/PMC3029755). Proceedings of the National Academy of Sciences of the
United States of America. 108 (4): 1513–8. Bibcode:2011PNAS..108.1513G (https://ui.adsab
s.harvard.edu/abs/2011PNAS..108.1513G). doi:10.1073/pnas.1017351108 (https://doi.org/1
0.1073%2Fpnas.1017351108). PMC 3029755 (https://www.ncbi.nlm.nih.gov/pmc/articles/P
MC3029755). PMID 21187386 (https://pubmed.ncbi.nlm.nih.gov/21187386).
13. Rommens, J.; Iannuzzi, M.; Kerem, B; Drumm, M.; Melmer, G; Dean, M; Rozmahel, R; Cole,
J.; Kennedy, D; Hidaka, N; et al. (8 September 1989). "Identification of the cystic fibrosis
gene: chromosome walking and jumping". Science. 245 (4922): 1059–1065.
Bibcode:1989Sci...245.1059R (https://ui.adsabs.harvard.edu/abs/1989Sci...245.1059R).
doi:10.1126/science.2772657 (https://doi.org/10.1126%2Fscience.2772657). PMID 2772657
(https://pubmed.ncbi.nlm.nih.gov/2772657).
14. Rowley, J.D. (1979). "chromosome abnormalities in leukemia". Blood Transfus.
Haematology and Blood Transfusion / Hämatologie und Bluttransfusion. 23: 43–52.
doi:10.1007/978-3-642-67057-2_5 (https://doi.org/10.1007%2F978-3-642-67057-2_5).
ISBN 978-3-540-08999-5. PMID 232467 (https://pubmed.ncbi.nlm.nih.gov/232467).
15. Slade, I.; Stephens, P.; Douglas, J.; Barker, K.; Stebbings, L.; Abbaszadeh, F.; Pritchard-
Jones, K.; Cole, R.; Pizer, B.; Stiller, C.; Vujanic, G.; Scott, R. H.; Stratton, M. R.; Rahman, N.
(30 November 2009). "Constitutional translocation breakpoint mapping by genome-wide
paired-end sequencing identifies HACE1 as a putative Wilms tumour susceptibility gene".
Journal of Medical Genetics. 47 (5): 342–347. doi:10.1136/jmg.2009.072983 (https://doi.org/
10.1136%2Fjmg.2009.072983). PMID 19948536 (https://pubmed.ncbi.nlm.nih.gov/1994853
6). S2CID 24354213 (https://api.semanticscholar.org/CorpusID:24354213).
16. Ribeiro, F. J.; Przybylski, D.; Yin, S.; Sharpe, T.; Gnerre, S.; Abouelleil, A.; Berlin, A. M.;
Montmayeur, A.; Shea, T. P.; Walker, B. J.; Young, S. K.; Russ, C.; Nusbaum, C.; MacCallum,
I.; Jaffe, D. B. (24 July 2012). "Finished bacterial genomes from shotgun sequence data" (htt
ps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483556). Genome Research. 22 (11): 2270–
2277. doi:10.1101/gr.141515.112 (https://doi.org/10.1101%2Fgr.141515.112). PMC 3483556
(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483556). PMID 22829535 (https://pubmed.
ncbi.nlm.nih.gov/22829535).
External links
DELLY: Structural variant discovery by integrated paired-end and split-read analysis (http://w
ww.embl.de/~rausch/delly.html)
ALLPATHS-LG: de novo assembly of whole-genome shotgun microreads (http://www.broadi
nstitute.org/software/allpaths-lg/blog/)[1]
1. Butler, J; MacCallum, I; Kleber, M; Shlyakhter, IA; Belmonte, MK; Lander, ES; Nusbaum, C;
Jaffe, DB (May 2008). "ALLPATHS: de novo assembly of whole-genome shotgun
microreads" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2336810). Genome Research.
18 (5): 810–20. doi:10.1101/gr.7337908 (https://doi.org/10.1101%2Fgr.7337908).
PMC 2336810 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2336810). PMID 18340039
(https://pubmed.ncbi.nlm.nih.gov/18340039).