4 - 7 Genome Assembly To Annotation - Final

A course that gives practical insight into the workflow from genome assembly to annotation

Uploaded by

Desye Melese

Genome sequence assembly

Assembly concepts and methods

(some slides courtesy of Mihai Pop, Amel Ghouila)

General sequencing and assembly workflow

Sequence assembly strategies

Whole genome shotgun

Map-based sequencing

By Commins, J., Toft, C., Fares, M. A. - "Computational Biology Methods and Their Application to the Comparative Genomics of Endocellular Symbiotic Bacteria of Insects." Biol. Procedures Online (2009). Accessed via SpringerImages, CC BY-SA 2.5,
https://commons.wikimedia.org/w/index.php?curid=17509619
Building a library

• Break DNA into random fragments (8-10x coverage)

• Sequence the ends of the fragments
– Amplify the fragments in a vector
– Sequence 800-1000 (500-700) bases at each end of the fragment
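
The coverage figure on this slide follows from simple arithmetic: average coverage = (number of reads × read length) / genome size. A minimal Python sketch of that calculation is given here; the 4 Mbp genome size and the function names are assumptions chosen for illustration and do not come from the slides.

```python
import math

def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size

def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)

# Hypothetical example: 4 Mbp genome, 800-base reads, 8x coverage
n = reads_needed(8, 800, 4_000_000)
print(n)                            # 40000
print(coverage(n, 800, 4_000_000))  # 8.0
```

This is an average only; because fragments are random, some regions will be covered more deeply and some not at all.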

Assembling the fragments

Forward-reverse constraints
• The sequenced ends are facing towards each other
• The distance between the two fragments is known
(within certain experimental error)
[Figure: an insert (clone) with its two sequenced ends, I and II, shown in the possible forward (F) / reverse (R) orientations]
Building Scaffolds

• Break DNA into random fragments (8-10x coverage)

• Sequence the ends of the fragments
• Assemble the sequenced ends
• Build scaffolds

Assembly gaps

• Sequencing gap: we know the order and orientation of the contigs and have at least one clone spanning the gap
• Physical gap: no information is known about the adjacent contigs, nor about the DNA spanning the gap

Unifying view of assembly

Assembly

Scaffolding

Alignment versus De Novo Assembly

Short sequence “reads”: is a reference genome available?
(http://www.ncbi.nlm.nih.gov/sites/genome, “Browse by organism groups”)

• Yes: alignment to reference
• No: de novo assembly

Workflow: QC & Mapping reads

1. Input reads (fastq files)
2. Quality check with FastQC
   – Not OK? Quality- and adapter-trimming
   – OK? Continue to mapping
3. Map reads to reference genome using e.g. BWA or Bowtie2
4. Sort by coordinates using SAMtools sort or PicardTools SortSam
5. Output: sorted BAM file (binary SAM, sequence alignment map)
6. Call variants, structural variation etc.
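
The workflow can be sketched as a short list of shell commands. The Python function below only assembles the command strings and does not run anything; the file names and the function name are hypothetical, and the tools (FastQC, BWA, SAMtools) are assumed to be installed separately.

```python
def mapping_commands(reference: str, reads_1: str, reads_2: str, sample: str) -> list:
    """Return shell commands for QC, mapping, sorting and indexing.

    Nothing is executed here; the commands are only built as strings.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        f"fastqc {reads_1} {reads_2}",                        # quality check
        f"bwa index {reference}",                             # build the reference index
        f"bwa mem {reference} {reads_1} {reads_2} > {sam}",   # map paired-end reads
        f"samtools sort -o {bam} {sam}",                      # sort by coordinates
        f"samtools index {bam}",                              # index the sorted BAM
    ]

# Hypothetical file names, for illustration only
for command in mapping_commands("ref.fa", "reads_1.fastq", "reads_2.fastq", "sample"):
    print(command)
```

Trimming is left out of the sketch because the slide does not name a trimming tool.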
Steps in Alignment/Mapping
1. Get your sequence data

2. Check quality of sequence data

3. Choose an alignment/mapping program

4. Run the alignment

5. View the alignments

6. Downstream Processing

Public Short Read Repositories
• NIH/NCBI
  – Sequence Read Archive, formerly Short Read Archive (http://www.ncbi.nlm.nih.gov/sra)
  – Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/)
  – 1000 Genomes (ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/)
• EMBL-EBI
  – European Nucleotide Archive (http://www.ebi.ac.uk/ena/)

Example download of one run:
fastq-dump SRR036642
Running FastQC

• Open the FastQC program
• Open the report in a browser: fastqc_report.html
• Per base sequence quality (plot annotated with p-value = 0.0001, 0.001, 0.01 and 0.05)

Babraham Bioinformatics: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
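
If the p-values on the per base sequence quality plot are read as base-calling error probabilities (an interpretation; the slide does not define them), they correspond to Phred quality scores through Q = -10 · log10(p). A minimal sketch of the conversion, with illustrative function names:

```python
import math

def phred_from_error_probability(p: float) -> float:
    """Phred quality score for a base-calling error probability p."""
    return -10 * math.log10(p)

def error_probability_from_phred(q: float) -> float:
    """Inverse conversion: error probability for a Phred score q."""
    return 10 ** (-q / 10)

for p in (0.0001, 0.001, 0.01, 0.05):
    print(p, round(phred_from_error_probability(p)))
# 0.0001 40
# 0.001 30
# 0.01 20
# 0.05 13
```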

Short Read Alignment Software
• BFAST, BLASTN, BLAT, Bowtie, Bowtie2, BWA, ELAND, GNUMAP, GMAP and GSNAP
• MAQ, mrFAST and mrsFAST, MOSAIK, Novoalign, RUM, SHRiMP, SOAP, SpliceMap
• SSAHA and SSAHA2, STAR, TopHat, ~20 more…

http://en.wikipedia.org/wiki/List_of_sequence_alignment_software
http://tinyurl.com/seqanswers-mapping
Issues of Consideration for Alignment
Software

Library types:
• Genomic DNA (for resequencing)
• ChIP DNA (PCR bias)
• RNA-seq cDNA
– mRNA-seq (junction mapping)
– smRNA-seq (adapter trimming)

[Figure: Illumina library molecule, a DNA fragment flanked by adapters]

Issues of Consideration for Alignment
Software

Types of reads

• Single-end, e.g. SRR036642.fastq
• Paired-end (reads 1 and 2), e.g. SRR027894_1.fastq, SRR027894_2.fastq
  – Mean and standard deviation of the inner distance

Issues of Consideration for Alignment
Software

Platform differences

• Bases (ACTG)
• Colorspace (2-base encoding, SOLiD)
• Read length
• 454 (homopolymers)

Issues of Consideration for Alignment
Software

Software Properties
• Open-source or proprietary ($)
• Accuracy
• Speed of algorithm
• Multi-threaded or single processor
• RAM requirements (2GB vs 50GB for loading index)
• Use of base quality score
• Gapped alignment (indels)

“Mapping reads to the reference” is
finding where their sequence occurs in the genome

[Figure: 100 bp identified, 200–500 bp unknown sequence, 100 bp identified. Source: Wikimedia, file:Mapping Reads.png]

“Mapping reads to the reference”: naïve text search algorithms are too slow
• Naïve approach: compare each read with every position in the genome
  – Takes too long, and will not find sequences with mismatches
• Search programs typically create an index of the reference sequence (or text) and store it in an advanced data structure for fast searching.
• An index is basically like a phone book (with addresses): it lets you quickly find the address (location) of a person
• Example: algorithms use ‘indexed seed tables’ to quickly find the locations of exact parts of a read
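
The indexed seed table idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions (a made-up reference and read, no gaps, forward strand only), not how BWA or Bowtie2 are implemented; all names are hypothetical.

```python
from collections import defaultdict

def build_seed_index(reference: str, k: int) -> dict:
    """Map every k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def candidate_starts(read: str, index: dict, k: int) -> list:
    """Look up each k-mer (seed) of the read; every exact seed hit implies a read start."""
    starts = set()
    for offset in range(len(read) - k + 1):
        for ref_pos in index.get(read[offset:offset + k], []):
            if ref_pos - offset >= 0:
                starts.add(ref_pos - offset)
    return sorted(starts)

def mismatches(read: str, reference: str, start: int) -> int:
    """Count mismatching bases when the read is placed at `start` (no gaps)."""
    window = reference[start:start + len(read)]
    return sum(a != b for a, b in zip(read, window)) + (len(read) - len(window))

reference = "ACGTACGTTAGCCGATAGGCTA"  # toy reference, made up for illustration
read = "TAGCCGTTAGG"                  # reference[8:19] with one substituted base
index = build_seed_index(reference, k=4)
starts = candidate_starts(read, index, k=4)
best = min(starts, key=lambda s: mismatches(read, reference, s))
print(starts)                                   # [1, 8]
print(best, mismatches(read, reference, best))  # 8 1
```

Only the candidate positions suggested by exact seed hits are compared base by base, which is why the read with a mismatch is still found without scanning the whole reference.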
Read Mapping: General problems

• Read can match equally well at more than one location (e.g.
repeats, pseudo-genes)

• A read can have an imperfect hit at its actual position, e.g. if it carries a breakpoint, SNP, insertion and/or deletion compared to the reference sequence
Output of read mapping: SAM and BAM files

• SAM = Sequence Alignment/Map format
• BAM = binary SAM = compressed SAM
• Contains information about how sequence reads map to a reference genome
• Supports paired-end reads and color space from SOLiD
• Is produced by Bowtie, BWA and other mapping tools
SAM and BAM formats/files
• After mapping the FASTQ file to the reference genome you will end up with a SAM or BAM alignment file
• SAM stands for Sequence Alignment/Map format
• A single SAM/BAM file can store mapped, unmapped, and even QC-failed reads from a sequencing run, and a coordinate-sorted BAM file can be indexed to allow rapid access. This means that the raw sequencing data can be fully recapitulated from the SAM/BAM file.
SAM and BAM formats/files

• SAM files are rarely used directly and take up a lot of space, which is why in practice we usually work only with BAM files
• A BAM file (.bam) is the binary version of a SAM file (it saves storage and is faster to manipulate)
SAM and BAM formats/files
• A SAM file (.sam) is a tab-delimited text file that contains sequence alignment data
• SAM files can be opened using a text editor or viewed using the UNIX "more" command
• Most alignment programs will supply:
  – a header: describing the format version, sorting order of the reads, and genomic sequences to which the reads were mapped
  – an alignment section: contains the information for each sequence about where/how it aligns to the reference genome

http://samtools.sourceforge.net/SAM1.pdf
http://genome.sph.umich.edu/wiki/SAM

CIGAR stands for Concise Idiosyncratic Gapped Alignment Report. It is a compressed representation of an alignment that is used in the SAM file format.
Output of read mapping: SAM file
Read_1 0 ENSG00000262694|HG1257_PATCH|72905355|72987235 937 1 70M * 0 0 CCACGAAAACTC…. III….. AS:i:-12 XS:i:-12 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:11T38T19 YT:Z:UU

70M = 70 matches
Total match proportion = 70/70

Read_2 0 ENSG00000091664|11|22359643|22401049 39302 24 4M1I1M8I56M * 0 0 AATGACAAAGAATAA….. IIII… AS:i:-79 XN:i:0 XM:i:7 XO:i:2 XG:i:9 NM:i:16 MD:Z:1C28C14C0C1A8T2T0 YT:Z:UU

4M1I1M8I56M = 4 matches, 1 insertion, 1 match, 8 insertions, 56 matches
Total match proportion = (4 + 1 + 56) / 70 = 61/70

Note: in CIGAR, M marks an aligned position, which can be a sequence match or a mismatch; mismatches are reported in the NM and MD tags.

CIGAR informs:
• How much of a read has been matched (and has insertions and deletions)
• Where those matches (and insertions/deletions) are
(http://samtools.github.io/hts-specs/SAMv1.pdf)

• QNAME: Query template NAME. Reads/segments having identical QNAME are regarded to come from the same template. A QNAME ‘*’ indicates the information is unavailable.
• Used to group/identify alignments that belong together, like paired alignments or a read that appears in multiple alignments.

Explain flag tool: https://broadinstitute.github.io/picard/explain-flags.html

Example alignment:
POS: 5
CIGAR: 3M1I3M1D2M
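
The CIGAR arithmetic used in these examples can be sketched as a small parser. The operation sets below follow the SAM specification (M, I, S, = and X consume read bases; M, D, N, = and X consume reference bases); the function names are illustrative.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar: str) -> list:
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

def query_length(cigar: str) -> int:
    """Number of read bases described by the CIGAR (M, I, S, =, X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")

def reference_length(cigar: str) -> int:
    """Number of reference bases spanned by the alignment (M, D, N, =, X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")

def aligned_bases(cigar: str) -> int:
    """Number of read bases aligned to a reference base (M, =, X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "M=X")

cigar = "4M1I1M8I56M"
print(aligned_bases(cigar), query_length(cigar))   # 61 70

# Last reference position covered by an alignment starting at POS 5
pos, cigar2 = 5, "3M1I3M1D2M"
print(pos + reference_length(cigar2) - 1)          # 13
```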
Short Read Alignment: Focus on BWA

Visualizing mapping results

IGV: Integrative Genomics Viewer

Harvesting Information from SAM

• Query name, QNAME (SAM) / read_name (BAM).

• FLAG provides the following information:
– are there multiple fragments?
– are all fragments properly aligned?
– is this fragment unmapped?
– is the next fragment unmapped?
– is this query the reverse strand?
– is the next fragment the reverse strand?
– is this the last fragment?
– is this a secondary alignment?
– did this read fail quality controls?
– is this read a PCR or optical duplicate?

Source: www.cs.colostate.edu/~cs680/Slides/lecture3.pdf
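
Each of these questions is one bit of the integer FLAG field. A minimal decoder, with bit values taken from the SAM specification and descriptions paraphrased (the function name is illustrative):

```python
SAM_FLAGS = {
    0x1: "template has multiple segments (paired)",
    0x2: "each segment properly aligned",
    0x4: "segment unmapped",
    0x8: "next segment unmapped",
    0x10: "reverse strand",
    0x20: "next segment on reverse strand",
    0x40: "first segment in template",
    0x80: "last segment in template",
    0x100: "secondary alignment",
    0x200: "failed quality controls",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}

def explain_flag(flag: int) -> list:
    """Return the description of every bit that is set in a SAM FLAG value."""
    return [description for bit, description in SAM_FLAGS.items() if flag & bit]

# 99 = 1 + 2 + 32 + 64
for line in explain_flag(99):
    print(line)
```

A FLAG of 0, as in the Read_1 and Read_2 records, means that none of the bits are set: a mapped, forward-strand, single-end read.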
Visualization of output in the Integrative Genomics Viewer (IGV)

• IGV
  – http://www.broadinstitute.org/igv/projects/current/igv_mm.jnlp (Windows)
  – http://www.broadinstitute.org/igv/projects/current/igv_lm.jnlp (Mac)
SV/CNV/Variant Calling

• Structural variations (SV): deletions, duplications, copy-number variations, insertions, inversions, translocations.
• Copy number variations (CNV): deletions or duplications of genes or relatively large regions of the genome that affect chromosomes
• Variant calling (SNPs and small InDels)
  – SNPs: affect only 1 nucleotide
  – InDels: affect 1 or several nucleotides
Overview of SV/CNV/Variant Calling

Adapted from Scherer et al 2007

VCF (Variant Call Format)
• VCF (Variant Call Format): text file format storing SNP and InDel information (http://www.1000genomes.org/node/101)
• Obtaining variants in this format is a multistep but standardized procedure involving different tools
• Headers (meta-information) + data lines with 8 required fields, tab-delimited
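
The eight required fields are CHROM, POS, ID, REF, ALT, QUAL, FILTER and INFO. A minimal sketch of reading one data line is shown here; the record is an illustrative example, the SNP/InDel rule is a simplification based on allele lengths, and real projects would normally use an established VCF library instead.

```python
VCF_FIXED_FIELDS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

def parse_vcf_line(line: str) -> dict:
    """Parse the 8 required tab-delimited fields of one VCF data line."""
    parts = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_FIXED_FIELDS, parts[:8]))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")   # several alternate alleles are comma-separated
    info = {}
    for item in record["INFO"].split(";"):
        if item == ".":                        # "." means no INFO entries
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True     # keys without "=" are flags
    record["INFO"] = info
    return record

def variant_type(ref: str, alt: str) -> str:
    """Simplified classification based on allele lengths only."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "InDel"
    return "other"

# Illustrative record, not taken from the slides
line = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB"
record = parse_vcf_line(line)
print(record["CHROM"], record["POS"], variant_type(record["REF"], record["ALT"][0]))
```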
Variants annotation
• Variant annotation programs: SnpEff
- A variant annotation and effect prediction tool.
- Annotates and predicts the effects of variants on genes: Are they in a
gene? In an exon? Do they change protein coding? Do they cause
premature stop codons?
Variants annotation
Biological interpretation
From Variant annotation to data mining
•web-based
•available packages
Aim
• Functional impact of variants (synonymous or not…)
• Gene Ontology Annotation (BP, MF, CC)
• Pathway/Network information
• Predictions of pathogenicity/severity
NB: DAVID (Database for Annotation, Visualization and Integrated Discovery) can be used to switch between databases
https://david.ncifcrf.gov/
Downstream Processing
Finding and annotating peaks (ChIP-seq)
Assembling/annotating transcripts, identify differential gene expression (RNA-seq)
SNP and structural variation identification, prediction of effects (DNA-seq)
Etc.

Park, Nat Rev Genet, 2009; http://grimmond.imb.uq.edu.au/mammalian_transcriptome.html
Tutorials

• https://datacarpentry.org/wrangling-genomics/
• https://genomics.sschmeier.com/ngs-variantcalling/index.html
• https://learn.gencore.bio.nyu.edu/variant-calling/
De novo genome assembly

• De novo sequencing refers to sequencing a novel genome where there is no reference sequence available for alignment.
• Sequence reads are assembled as contigs, and the coverage quality of de novo sequence data depends on the size and continuity of the contigs (i.e., the number of gaps in the data).
Alignment versus De Novo Assembly

Short sequence “reads”: is a reference genome available?
(http://www.ncbi.nlm.nih.gov/sites/genome, “Browse by organism groups”)

• No: de novo assembly

General strategy of assembling a genome de novo

1. Pre-process short reads (trim, quality filter…)
2. Assemble sequences into contigs
3. Order contigs into scaffolds
4. Annotate genome

Pipeline for de novo assembly

Before assembly:
1. Choose genome, gather info
2. DNA library preparation
3. Sequencing
4. Quality check
5. Trimming
6. Error correction
7. Merge overlapping reads
8. ASSEMBLE!

From assembly to annotation:
1. ASSEMBLE!
2. Assemble again and again (different tools, k-mers)
3. Fill gaps
4. Evaluate assembly contiguity
5. Evaluate assembly gene content
6. Choose a final assembly
7. Re-scaffold
8. ANNOTATE!
Best Assembly Advice

• Remember: your goal is to have a genome assembly.
• But you will not be doing one assembly.
• In the end you will have many assemblies to
choose from.
– Because you will be doing a lot of work!
• Use a lot of assembly tools for a lot of k values.
– Large k can better resolve repeats
– Comes at coverage cost
• The whole process should take a few months.
De novo assembly basics
• Find all overlaps between reads
• Build a graph
• Simplify the graph (sequencing errors)
• Traverse a graph to produce a consensus.
Assembly Algorithms

1. Greedy
2. Overlap-layout-consensus (OLC)
3. De Bruijn Graph

Schatz M C et al. Genome Res. 2010;20:1165-1173

Greedy
Was used in the very early next-generation assemblers (e.g. SSAKE, VCAKE)
1. Reads are joined greedily: the highest-scoring overlap is merged first, then the contig is extended with the next highest-scoring overlap
2. The paired-end reads are used to generate supercontigs
3. Mate pairs could also be used to determine contig order

* Repeats can cause big problems in this approach
Imperfect Overlap Between Reads Can Lead to
Incorrect Assembly in the Greedy Approach

[Figure: an imperfect overlap between reads, leading to either the correct or an incorrect assembly]

Brief Bioinform. 2009 July; 10(4): 354–366.

Greedy Extension Leads to Arrested Assembly
if Multiple Matches are Found
[Figure: two unassembled reads both match the existing contig; the assembler can't resolve which one to use, so assembly stops]

Overlap Graph or Overlap-layout-consensus
(OLC)

• Performs better overall
• All-against-all overlap detection using k-mers as seeds; a seed & extend algorithm is used.
• Good for long reads (e.g. Sanger or other >100 bp, such as 454, Ion Torrent, PacBio) due to the minimum overlap threshold
• Examples: CABOG (Celera), ARACHNE
• Newbler, developed for 454, is based on OLC and is now being used for Ion Torrent
De Bruijn Graph
• It breaks reads into successive k-mers and the graph maps the k-mers
• Each k-mer is a node and edges are drawn between consecutive k-mers in a read.
• Repeat sequences create a fork in the graph; alternative sequences create a bubble.
• The k-mer size can only be determined by “trial and error”.
• A small value of k will create a complex graph but a large value of k may miss small overlaps. A good starting point would be a k-mer size that is 2/3 the size of the read
• Good for short reads or small genomes. With long reads and/or large genomes, it may require lots of RAM (e.g., ~0.5 TB for human)

Examples: Velvet, SOAPdenovo, ALLPATHS-LG, ABySS
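
The k-mer graph described on this slide can be sketched on a toy example. The code follows the slide's formulation (k-mers as nodes, edges between consecutive k-mers of a read) and ignores sequencing errors, reverse complements and incoming forks, so it is an illustration rather than the method of any of the assemblers named here; the reads and function names are made up.

```python
from collections import defaultdict

def de_bruijn_graph(reads: list, k: int) -> dict:
    """Nodes are k-mers; an edge joins each pair of consecutive k-mers in a read."""
    graph = defaultdict(set)
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for left, right in zip(kmers, kmers[1:]):
            graph[left].add(right)
    return graph

def walk_unambiguous_path(graph: dict, start: str) -> str:
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (next_node,) = graph[node]
        if next_node in seen:          # stop instead of looping around a cycle
            break
        contig += next_node[-1]        # consecutive k-mers overlap by k-1 bases
        seen.add(next_node)
        node = next_node
    return contig

reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]   # toy overlapping reads
graph = de_bruijn_graph(reads, k=4)
print(walk_unambiguous_path(graph, "ATGG"))  # ATGGCGTGCAA
```

A repeat would give some node two successors, which is the fork mentioned on the slide; the walk stops there rather than guessing.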
De novo assembly tools
List of de novo assemblers
Short read assemblers
• SPAdes: http://cab.spbu.ru/software/spades/
• Velvet (European Bioinformatics Institute): https://en.wikipedia.org/wiki/Velvet_assembler
• SOAPdenovo: https://www.animalgenome.org/bioinfo/resources/manuals/SOAP.html
• ABySS: https://github.com/bcgsc/abyss

Assembly quality assessment/evaluation tools
• QUAST (Quality Assessment Tool for Genome Assemblies)
  – evaluates assemblies both with and without a reference genome
• BUSCO (Benchmarking Universal Single-Copy Orthologs)
  – a tool to assess the completeness of a genome assembly, gene set or transcriptome. It is based on the concept of single-copy orthologs that should be highly conserved among closely related species.
Evaluating the assembly
• Genome assembly results:
  – contig size and number of contigs produced
  – scaffold size and number
  – N50 and N90
• Coverage
• GC content
• Genome annotation
  – repeats analysis and annotation
  – protein-coding gene annotation (including gene structure prediction and gene function annotation)
  – non-coding RNA gene annotation (including annotation of microRNA, tRNA, rRNA, and other ncRNA)
  – transposon and tandem repeats annotation
• Comparative genomics and evolution (chromosome structure, conserved gene families)

• QUAST (QUality ASsessment Tool) evaluates genome assemblies by computing various metrics, including:
1. N50: the length for which the collection of all contigs of that length or longer covers at least 50% of the assembly length.
2. L50: the minimum number X such that the X longest contigs cover at least 50% of the assembly
Evaluating the assembly
Basic statistics
N50 = the length of the shortest contig such that the sum of contigs of equal length or longer is at least 50% of the total length of all contigs. Equivalently, 50% of the entire assembly is contained in contigs or scaffolds equal to or larger than the N50.

Contig size (bp)
3000
2000 (N50)
1200
800
600 (N90)
400
Total: 8000

Here 3000 + 2000 = 5000 is the first running total to reach 50% of 8000 (4000), so N50 = 2000; the running total first reaches 90% (7200) at the 600 bp contig, so N90 = 600.

N90 = the length of the shortest contig such that the sum of contigs of equal length or longer is at least 90% of the total length of all contigs.

Contig or Scaffold N50
• Most widely used statistic for genome
assemblies
• Measure of contiguity
• Sort all contigs from longest to shortest and add up their lengths. The N50 is the length of the contig at which the running total reaches half of the assembly length, so half of the assembly is in contigs of at least this length.
• More informative than mean
Contig or Scaffold N50
• 1,1,1,1,1,1,1,1,2,2,3,4,6,6,8,9,9,9,10,24
– Mean = 5
– N50 = 9

• N50 can be manipulated if you eliminate small contigs
  – Which may be useless anyway
• NG50 – uses genome size instead of assembly length
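
The N50, N90, L50 and NG50 definitions can be checked against the two contig sets used in these slides with a short sketch. The function names and the 12,000 bp genome size used for NG50 are assumptions for illustration.

```python
def nx(lengths: list, fraction: float = 0.5, total: int = None):
    """Length of the contig at which the running total of the longest-first
    sorted contigs first reaches `fraction` of `total` (default: assembly length)."""
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return None  # threshold never reached (possible for NG50 on a small assembly)

def lx(lengths: list, fraction: float = 0.5) -> int:
    """Smallest number of longest contigs that cover `fraction` of the assembly."""
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= fraction * sum(lengths):
            return count
    return len(lengths)

contigs = [3000, 2000, 1200, 800, 600, 400]
print(nx(contigs, 0.5), nx(contigs, 0.9), lx(contigs))   # 2000 600 2
print(nx(contigs, 0.5, total=12000))                     # NG50 for a hypothetical 12 kb genome: 1200

small = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 6, 6, 8, 9, 9, 9, 10, 24]
print(sum(small) / len(small), nx(small))                # 5.0 9
```

Dropping the eight contigs of length 1 from the second set changes the mean but leaves the longest contigs, and here also the N50 of 9, unchanged, which illustrates why N50 is more informative than the mean.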
Choosing a de novo Assembler

Assemblathon 1
• Genome Res. 2011 21: 2224-2241
Genome Assembly Gold-standard Evaluations (GAGE)
• Genome Res. 2012 22: 557-567
• http://gage.cbcb.umd.edu/results/index.html

Genome annotation
• Two main levels:
  – Structural annotation = nucleotide-/protein-level annotation: finding genes and other biologically relevant sites, thus building up a model of the genome as objects with specific locations
  – Functional annotation: objects are used in database searches (and experiments); the aim is to attribute biologically relevant information to the whole sequence and to individual objects
• Annotation is the rate-limiting step of sequencing projects
Things we are looking to annotate?
• Protein Coding genes
• CDS
• mRNA
• Promoter and Poly-A Signal
• Alternative spliced RNA
• Pseudogenes
• ncRNA
What are genes?
• Complete DNA segments responsible for making functional products
• Products
• Proteins
• Functional RNA molecules
• miRNA (micro RNA)
• rRNA (ribosomal RNA)
• snRNA (small nuclear)
• snoRNA (small nucleolar)
• tRNA (transfer RNA)
Pseudogenes
• Non-functional copy of a gene
• Processed pseudogene
• Retro-transposon derived
• No 5’ promoters
• No introns
• Often includes polyA tail
• Non-processed pseudogene
• Gene duplication derived
• Both include events that make the gene non-functional
• Frameshift
• Stop codons
• We assume pseudogenes have no function, but we
really don’t know!
Noncoding RNA (ncRNA)

• ncRNA represent 98% of all transcripts in a mammalian cell
• ncRNA have not been taken into account in gene counts
• cDNA
• ORF computational prediction
• Comparative genomics looking at ORF
• ncRNA can be:
• Structural
• Catalytic
• Regulatory
Genome Annotation Approaches

• Gene Predictions using software

• Identifying Open Reading Frames
• Looking for well studied splice junction sites.
• Training the algorithm on what is already known
about gene structure.
• Experimental evidence
• Sequencing RNA molecules (ESTs)
• Homology to genes in other species that are already
experimentally validated.
Prokaryotic gene model: ORF-genes

• “Small” genomes, high gene density
  – Haemophilus influenzae genome is 85% genic
• Operons
  – One transcript, many genes
• No introns
  – One gene, one protein
• Open reading frames
  – One ORF per gene
  – ORFs begin with a start codon and end with a stop codon (by definition)
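
Because a prokaryotic gene is a single uninterrupted ORF, a first-pass gene finder can simply scan for start and stop codons. The sketch below is deliberately simplified: it assumes ATG as the only start codon and the standard stop codons, looks at the forward strand only, and uses a made-up sequence; real gene finders also use codon statistics and check the reverse strand.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence: str, min_codons: int = 2) -> list:
    """Return (start, end, frame) for forward-strand ORFs.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    the codons before the stop codon, including the start codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i                      # open an ORF at the first start codon
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # close the ORF at the stop codon
    return orfs

sequence = "CCATGAAATTTTAGGG"   # toy sequence, made up for illustration
for start, end, frame in find_orfs(sequence):
    print(start, end, frame, sequence[start:end])   # 2 14 2 ATGAAATTTTAG
```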

TIGR: http://www.tigr.org/tigr-scripts/CMR2/CMRGenomes.spl
NCBI: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/micr.html
The challenge of eukaryotic genomes
• E. coli genome: 4 million bp
• The human genome: 3 billion bp
• 50% of the genome is repeat sequences!

Gene prediction programs

• Rule-based programs
• Use explicit set of rules to make decisions.
• Example: GeneFinder
• Neural Network-based programs
• Use data set to build rules.
• Examples: Grail, GrailEXP
• Hidden Markov Model-based programs
• Use probabilities of states and transitions between
these states to predict features.
• Examples: Genscan, GenomeScan
Common difficulties

● First and last exons are difficult to annotate because they contain UTRs.
● Smaller genes are not statistically significant so they are
thrown out.
● Algorithms are trained with sequences from known genes
which biases them against genes about which nothing is
known.
● Masking repeats frequently removes potentially
indicative chunks from the untranslated regions of genes
that contain repetitive elements.
Tutorials
● https://www.hadriengourle.com/tutorials/assembly/
● https://colauttilab.github.io/NGS/deNovoTutorial.html
● https://www.geneious.com/tutorials/de-novo-assembly/
