MBG2004 Introduction and Comparative Genomics, Week I-II

The document discusses computational biology and comparative genomics. It provides information on challenges in computational biology, gene ontology, variant detection methods, molecular evolution, and comparative genomics. Comparative genomics allows comparison of biological information derived from whole genome sequences to study genome evolution and differences between species.

Computational Biology II

MBG2004
Introduction
Assistant Prof. Cemalettin Bekpen

https://tincture.io/biology-really-may-be-our-future-aba5c11d2bdf
Challenges in Computational Biology

Prof. Manolis Kellis


Gene Ontology (GO)
and
Gene Annotation

http://geneontology.org/docs/ontology-documentation/
Some of the slides are adapted from F. Burdet, Arnelia Ireland, and the UniProt-GOA group.
IDEA Gene Set Enrichment Analysis Tool
(New method for Single Cell Transcriptomics)

https://www.nature.com/articles/s41467-020-15298-6.pdf
Variant Detection and Methods
Structural Variations (SVs)
Copy Number Variations (CNVs)

https://www.pacb.com/applications/whole-genome-sequencing/variant-detection/
Some of the slides are adapted from Peter N. Robinson.


Molecular Evolution

https://academic.oup.com/mbe
https://sciwri.club/archives/7530
Computational for Biological Sciences II
MBG2004
Comparative Genomics
Assistant Prof. Cemalettin Bekpen

Nguyen et al. BMC Genomics 2012, 13:584


What is Comparative Genomics?
It is the comparison of one genome to another.

Genomics: DNA (gene)
    | Transcription
Transcriptomics: RNA
    | Translation
Proteomics: protein
    | Enzymatic reaction
Metabolomics: metabolite
(Figure label: Functional Genomics)
Definition and Development of Comparative Genomics
Comparative genomics can be simply defined as the comparison of biological
information derived from whole-genome sequences. Comparative genomics
therefore began in 1995, when the first two whole organism genomes (for the
bacteria Haemophilus influenzae RD and Mycoplasma genitalium G37) were
published. Very soon thereafter came bioinformatics tools to compare the genome
sequences themselves, and the RNAs, proteins, and gene annotations that can
be derived from them.

These tools are constantly evolving to deal with the exponential proliferation of
sequenced genomes driven by advances in sequencing technology, and to
become more comprehensive and user-friendly. With nearly 2000 genomes now
available and >10 000 in the pipeline (August 2011), the use of comparative
genomic approaches is reaching maturity.

V. de Crécy-Lagard, A. Hanson, in Brenner's Encyclopedia of Genetics (Second Edition), 2013
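
As a rough illustration of what "comparing the genome sequences themselves" can mean at the smallest scale, the sketch below computes percent identity between two sequences that are assumed to be already aligned. The function name `percent_identity`, the gap character, and the two toy sequences are assumptions made for this example and do not come from the slides or from any particular tool.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes the two strings are rows of an existing pairwise alignment,
    so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, made-up alignment (not real genomic data).
seq_a = "ATGCC-TAGGA"
seq_b = "ATGACCTA-GA"
print(round(percent_identity(seq_a, seq_b), 1))
```

Whether gapped columns count toward the denominator is a design choice; this sketch ignores them, and other conventions give different percentages for the same alignment.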


Increase in the use of comparative genomics methods.

Results of a PubMed search using ‘comparative genomics’ as input. A total of 3124 references were
retrieved, none of which were published before 1995. The breakdown by year is presented, showing an
exponential growth phase followed by a stabilization phase in the past 5 years.

V. de Crécy-Lagard, A. Hanson, in Brenner's Encyclopedia of Genetics (Second Edition), 2013


Comparative Genomics

V. de Crécy-Lagard, A. Hanson, in Brenner's Encyclopedia of Genetics (Second Edition), 2013


Comparative Genomics
Legend for the following figure
(A) Geographical origins. (B) Comparative genome analysis. The outer circle represents the 12 chromosomes of SAT, along with
the densities of genes, DNA transposons, RNA transposons, and other types of genome components labeled as shown in the
color matrix (Left). Moving inward, the five sequenced and assembled genomes are symbolized by different colored circles. The
heat map beside each circle indicates the average number of indels per kilobase in 50 kb-bins for each genome in comparison
to SAT. The inner heat map illustrates the similarity among the six genomes. Black points show the corresponding
dN/dS ratios of entire genes estimated by site models for 2,272 1:1 orthologous gene families. SAT centromere
positions are signified by black triangles (▲). (C) Phylogeny of the six AA-genome species with BRA as an outgroup. Estimates
of divergence time are given at each node, all supported with 100% bootstrap values.
Comparative genomics of the six AA-genome Oryza species.

Zhang et al., 2014
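
The legend above describes a heat map of the average number of indels per kilobase in 50 kb bins. A minimal sketch of that kind of binning is shown below; it is not the authors' pipeline. The function name `indels_per_kb`, the 0-based coordinate convention, and the example positions are assumptions made for illustration only.

```python
from collections import Counter


def indels_per_kb(indel_positions, chrom_length, bin_size=50_000):
    """Return indel density (indels per kilobase) for consecutive bins.

    indel_positions: 0-based start coordinates of indels on one chromosome
                     (hypothetical input format).
    chrom_length:    chromosome length in base pairs.
    bin_size:        bin width in base pairs (50 kb, as in the legend).
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size  # ceiling division
    counts = Counter(
        pos // bin_size for pos in indel_positions if 0 <= pos < chrom_length
    )

    densities = []
    for i in range(n_bins):
        start = i * bin_size
        end = min(start + bin_size, chrom_length)  # last bin may be shorter
        width_kb = (end - start) / 1000
        densities.append(counts[i] / width_kb)
    return densities


# Made-up positions on a 125 kb toy chromosome.
positions = [10, 20_000, 49_999, 50_000, 120_000]
print(indels_per_kb(positions, chrom_length=125_000))
```

The last bin is normalized by its actual width rather than the full 50 kb, so a short terminal bin is not artificially diluted.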


Comparative Genomics

Santos et al., 2010


https://www.23andme.com/
What are some questions that comparative
genomics can address?
How has the organism evolved?
What differentiates species?
Which non-coding regions are important?
Which genes are required for organisms to survive in a certain environment?
Comparisons of Genomes at Different Phylogenetic
Distances Are Appropriate to Address Different Questions

Hardison RC (2003) Comparative Genomics. PLOS Biology 1(2): e58


Comparing the Chromosomes
Comparison of general features between human chromosome 1 and the equivalent
mouse chromosomes.


https://www.eurekalert.org/multimedia/pub/82857.php
The genetic similarity (or homology) of superficially
dissimilar species is amply demonstrated here
The full complement of human chromosomes can be
cut, schematically at least, into about 150 pieces (only
about 100 are large enough to appear in this
illustration), then reassembled into a reasonable
approximation of the mouse genome. The colors of the
mouse chromosomes and the numbers alongside
indicate the human chromosomes containing
homologous segments. This piecewise similarity
between the mouse and human genomes means that
insights into mouse genetics are likely to illuminate
human genetics as well.
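
The piecewise correspondence described above can be pictured as a list of homologous segments, each recording where it sits in human and which mouse chromosome carries it. The sketch below is only a toy data structure under that assumption: the class `SyntenyBlock`, the function name, the chromosome labels (H1, H2, M3, M4), and all coordinates are invented for illustration and are not real human-mouse synteny data.

```python
from collections import defaultdict
from typing import NamedTuple


class SyntenyBlock(NamedTuple):
    """One homologous segment shared by the two genomes (hypothetical record)."""
    human_chrom: str
    human_start: int
    human_end: int
    mouse_chrom: str


def human_sources_per_mouse_chrom(blocks):
    """For each mouse chromosome, list the human chromosomes that contribute segments."""
    sources = defaultdict(set)
    for block in blocks:
        sources[block.mouse_chrom].add(block.human_chrom)
    return {mouse: sorted(humans) for mouse, humans in sources.items()}


# Invented blocks: placeholder labels, not real chromosome assignments.
blocks = [
    SyntenyBlock("H1", 0, 5_000_000, "M4"),
    SyntenyBlock("H1", 5_000_000, 9_000_000, "M3"),
    SyntenyBlock("H2", 0, 7_000_000, "M4"),
]
print(human_sources_per_mouse_chrom(blocks))
```

Grouping by mouse chromosome mirrors the illustration, where each mouse chromosome is colored by the human chromosomes that contain its homologous segments.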
Comparing the regulatory regions

Comparison of general features between orthologous transcription factor occupied segments (TF OSs).

Y Cheng et al. Nature 515, 371-375 (2014) doi:10.1038/nature13985


Comparing the Number of Segmental Duplications

Human (Build 36): 48%
Mouse (Build 37): 13%

(Marques-Bonet T. et al., 2009) (Marques and Eichler, CSH Symp. 2009)


TLR1 TLR2 TLR3

TLR1 TLR2 TLR3 TLR3’

Comparing the repetitive part of the genome?

Problematic. Why?

You should be really careful!


Comparing the expressed genes
Genome-wide transcriptomics analysis of anatomically dissected regions in
mammalian brains uncovers regional and species-specific expression

Multiple regions of the human, pig, and mouse brain were dissected and analyzed.
A uniform manifold approximation and projection (UMAP) analysis (middle) shows the global expression patterns of 1710 samples in
the human brain, with the cerebellum as the outlier. The HPA Brain Atlas (right) shows the expression of individual genes, for
example, synaptosomal-associated protein 25 (SNAP25), in the different brain regions in the three mammalian species.
Sjöstedt E et al. 2020
Comparative Genomics

Warren et al., 2008


When the genome for the platypus was sequenced (Warren et al., 2008),
comparative genomics was put to its strangest test yet, because the platypus
is a very strange animal. Classified as a mammal because it makes milk and
has fur, the platypus also possesses features of reptiles and birds, such as egg
laying. Furthermore, the animal's mouth physically resembles a duck's bill,
and males can deliver snake-like venom through spurs on their legs.

In platypus DNA, scientists found genes for egg laying—a feature of reptiles—
as well as for lactation—a characteristic of all mammals. The researchers also
noted that genetic sequences responsible for venom production in the male
platypus appear to have arisen from duplications in a group of genes that
evolved from ancestral reptile genomes. Further study of this odd puzzle
piece of a genome will help scientists see the big picture of mammalian
evolution from a novel perspective.
Chromosome evolution at the origin of the ancestral vertebrate genome

Duplication of Whole Genome


1R 2R 3R

Reconstructed evolutionary history of karyotypes from Chordata to Amniota.


On the right, a simplified species tree of the Chordata is shown, with WGD events depicted by red stars. The eight lineages
represented from left to right are mammals, birds, teleost fish, holostean fish (gar), cartilaginous fish, cyclostomes (lamprey,
hagfish), tunicates (Ciona), and cephalochordates (amphioxus). On the left, successive reconstructed karyotypes are shown, with
one color for each of the 17 pre-1R chromosomes. The length of each pre-1R chromosome is proportional to its number of genes.

Sacerdot et al., 2018


Example: Comparative Genomics of the IRGM Gene

Bekpen et al., 2010


Example: Comparative Genomics of the LRRC37 Gene Family

Bekpen et al., 2009


Example: Evolution of the regulatory region of the LRRC37 family.

A model for the evolution of promoter regions of LRRC37 is depicted. (A) The LRRC37 family acquired alternative promoter regions, first, from the BPTF
promoter region within the macaque lineage; and, second, from the DND1 promoter after the split between New World and Old World monkeys, respectively.
An additional promoter, which is amplified from human testis and macaque testis tissues, is detected just upstream of the predicted long coding exon
containing the ATG start codon in macaque and human. (B) A schematic representation of the regulation of gene expression within the LRRC37 family. The
LRRC37 family evolved from testis-specific expression in mouse to ubiquitous or tissue-specific expression, such as cerebellum and thymus, in human.

Bekpen et al., 2009
