Lecture Bioinfo Databases

Uploaded by

Khushal Khan


Who needs to study Bioinformatics?

•What is Bioinformatics?

“Bioinformatics is about searching biological databases, comparing sequences, looking at protein structures, and (more generally) asking biological questions with a computer”

•Introduced by French scientist Jean-Michel Claverie in the late 1980s (“bioinformatique”)

•Saves you months of work!

Before the era of Bioinformatics

• Only two ways to perform experiments:

1. In vivo

2. In vitro

• We are now in the age of in silico biology!

Bioinformatics is a must do!
Bioinformatics in context
[Figure: diagram placing Bioinformatics at the centre, surrounded by mathematics/computer science, genomics, molecular biology, biophysics, molecular evolution, and ethical, legal, and social implications]
What does this mean?

• Think of Bioinformatics as a tool!

• Now you are equipped with computational tools to answer biological

questions
The biological foundations of Bioinformatics
• Proteins and nucleic acids

• Proteins are made up of amino acids, while nucleic acids are made up of nucleotides

• How best to represent proteins and nucleic acids?
• Need a formula to describe their composition
• The identity of a protein is determined by its composition and the precise order of the amino acids it contains
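
One common way to represent a protein on a computer is as a string of one-letter amino acid codes. The sketch below is only an illustration of the point above: the two peptides are made-up toy examples, and the helper name `composition` is an assumption, not something from the lecture. It shows that two sequences can share the same composition and still be different proteins because the order differs.

```python
from collections import Counter


def composition(sequence: str) -> Counter:
    """Count how often each one-letter residue code occurs in a sequence."""
    return Counter(sequence.upper())


# Two toy peptides (hypothetical, for illustration only) built from the same residues.
peptide_a = "GAVLK"
peptide_b = "KLVAG"

# Same residues, same counts ...
same_composition = composition(peptide_a) == composition(peptide_b)
# ... but a different order, so they are not the same sequence.
same_sequence = peptide_a == peptide_b

print(same_composition)  # True
print(same_sequence)     # False
```

Composition alone is therefore not enough to identify a protein; the order of the residues has to be stored as well, which is why sequences are kept as ordered strings.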
The Birth of Bioinformatics

• Protein sequences started to accumulate in the 1960s

• People started manual comparisons (pre-computer era)

• With the advent of computers, people started to write algorithms from scratch to analyze “sequence data”

• This was the genesis of bioinformatics

“The holy grail of Bioinformatics”
[Figure: raw DNA sequence (GCTCCTCACTGTCTGTGTTTATTCTTTTAGCTTCTTCAGATCTTTTAGTCTGAGGAAGCCTGGCATGTGCAAATGAAGTTAACCTAA...) with the captions “> 500,000 genes sequenced” and “Expected number of unique protein structures: ~ 700-1,000”]
The core of Bioinformatics to date
•Relationships between sequence, 3D structure, and protein functions

[Figure: panels labelled “Sequence”, “3D structure”, and “protein functions”; the sequence panel shows the example protein sequence
TDQAAFDTNIVTLTRFVMEQGRKARGTGEMTQLLNSLCTAVKAISTAVRKAGIAHLYGIAGSTNVTGDQVKKLDVLSNDLVINVLKSSFATCVLVTEEDKNAIIVEPEKRGKYVVCFDPLDGSSNIDCLVSIGTIFGIYRKNSTDEPSEKDALQPGRNLVAAGYALYGSATMLV]

•Properties and evolution of genes, genomes, proteins, metabolic

pathways in cells

•Use of this knowledge for prediction, modelling, and design

From sequence to structure

• Proteins adopt a three-dimensional (3D) structure, which is functionally important

• Structure is determined by the composition and order of amino acids in that protein
1. Hydrophobic amino acids (e.g., Valine, Leucine) do not want to be on the surface
2. Hydrophilic amino acids (e.g., Serine) love to be on the surface to interact with water
3. Structure is also affected by the electric charge on some residues and their size
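
To make the hydrophobic/hydrophilic distinction concrete, here is a minimal sketch that scores residues with the Kyte-Doolittle hydropathy scale. The choice of that scale is an assumption made for this example (the lecture does not name one), and the function names are hypothetical. Positive values indicate hydrophobic residues, negative values hydrophilic ones.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the lecture).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy over all residues of a protein sequence."""
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def hydrophobic_positions(sequence: str) -> list:
    """1-based positions and letters of residues with a positive hydropathy score."""
    return [
        (index, residue)
        for index, residue in enumerate(sequence, start=1)
        if KYTE_DOOLITTLE[residue] > 0
    ]


# First 18 residues of the example sequence shown earlier in the lecture.
fragment = "TDQAAFDTNIVTLTRFVM"
print(mean_hydropathy(fragment))
print(hydrophobic_positions(fragment))
```

A score like this only hints at which residues tend to be buried; as the list above notes, charge and size also affect the final structure.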
In Short!
• Proteins have a unique order and composition of amino acids, simply
referred to as the ‘sequence’

• Sequence determines the 3D shape of the protein, simply referred to as the ‘structure’

• Structure determines the molecular activities of proteins, simply referred to as the ‘function’

• Sequence -> Structure -> Function (but not always!)

What about DNA & RNA?
• DNA & RNA are made up of nucleotide chains

• Nucleotides consist of a sugar (a carbohydrate), a phosphate, and one of five nitrogenous bases

• Adenine, Guanine, Cytosine, Thymine, and Uracil, or simply A, G, C, T, and U (T occurs in DNA; U replaces it in RNA)
What should be cheaper and faster? DNA/RNA or protein sequencing?

DNA/RNA sequencing is faster and cheaper simply because of fewer characters: four nucleotides vs. twenty amino acids
What do we mean by complementarity?

T is always facing A, while G is always facing C in a one-to-one reciprocal relationship
How can this knowledge help us?

If we know the sequence of one strand, we can get the sequence of the other strand
Example
• 5’-ATGCTGA-3’

• What is the complementary sequence?

• 5’-ATGCTGA-3’
• 3’-TACGACT-5’

• How is this reported?

• Both strands are written 5’ to 3’: 5’-ATGCTGA-3’ and 5’-TCAGCAT-3’
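
The example above can be reproduced with a few lines of Python. This is a minimal sketch for DNA only; the function names are assumptions chosen for this illustration. It first pairs each base with its partner (A with T, G with C) and then reverses the result so that the second strand is also read 5’ to 3’.

```python
# Translation table for DNA base pairing: A<->T and G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-by-base complement; the result reads 3'->5' if the input reads 5'->3'."""
    return strand.upper().translate(COMPLEMENT)


def reverse_complement(strand: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(strand)[::-1]


print(complement("ATGCTGA"))          # TACGACT (read 3'->5')
print(reverse_complement("ATGCTGA"))  # TCAGCAT (read 5'->3')
```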
What is a Database?

A database is an organized collection of related information

What are the advantages of using databases?

• Easy and quick retrieval of information

• Provide backup support

Biological Databases
•Need to collect and store biological data and its associated knowledge
into databases

•Fundamental to the survival of science

Two kinds of Biological Databases

1. Primary
• Contain primary sequence information (nucleotide or protein) and associated annotations

2. Secondary
• Summarize the results from primary databases
Primary Databases

• Nucleotide sequence databases

• Protein sequence databases

Nucleotide Sequence Databases
• GenBank
• Perhaps the best-known database
• Contains all publicly available annotated DNA sequences
• Exchanges data daily with the DNA Data Bank of Japan (DDBJ) and the European Molecular Biology Laboratory (EMBL)
• Contains roughly 179 million sequence entries (Dec 2014)
• Prior submission of a sequence to GenBank/DDBJ/EMBL is a prerequisite for publishing a new sequence in any scientific journal
• Submission is easy and can be done electronically
• Each entry has a unique ID known as the “Accession Number (AN)”
Accession number

• A unique identifier of each record in the database

• Usually alpha-numeric in nature

Why do we need accession numbers?
• Common names lead to non-specific results
• A search on “Cytochrome” will output many different types of cytochromes (a,
b, c, and others)

• Cannot distinguish among species

• Search on “Insulin” will return insulin sequences from many organisms
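
The difference between searching by name and fetching by accession number can be sketched with a toy in-memory table. Everything in this example is hypothetical: the accession strings are invented placeholders (not real GenBank identifiers), and the helper names are assumptions. The point is only that a name matches several records, while an accession number matches exactly one.

```python
# Hypothetical records: the accession numbers below are invented placeholders.
records = {
    "XX000001": {"name": "insulin", "organism": "Homo sapiens"},
    "XX000002": {"name": "insulin", "organism": "Mus musculus"},
    "XX000003": {"name": "cytochrome c", "organism": "Homo sapiens"},
    "XX000004": {"name": "cytochrome b", "organism": "Homo sapiens"},
}


def search_by_name(term: str) -> list:
    """Return every accession whose record name contains the search term."""
    return sorted(
        accession
        for accession, record in records.items()
        if term.lower() in record["name"].lower()
    )


def fetch_by_accession(accession: str) -> dict:
    """Return the single record stored under a unique accession number."""
    return records[accession]


print(search_by_name("Insulin"))       # two hits, one per organism
print(search_by_name("Cytochrome"))    # two hits, different cytochromes
print(fetch_by_accession("XX000002"))  # exactly one record
```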
Example GenBank Entry
Secondary Databases
PROSITE
• Sometimes a newly sequenced protein gives no hits to sequence
databases

• How do we determine its function then?

“In some cases, the structure and function of an unknown protein which is
too distantly related to any protein of known structure to detect its affinity
by overall sequence alignment may be identified by its possession of a
particular cluster of residue types classified as a motif. The motifs, or
templates, or fingerprints, arise because of particular requirements of
binding sites that impose very tight constraints on the evolution of portions
of a protein sequence” - A. M. Lesk, 1988
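
A motif of this kind can be searched for with a regular expression. The sketch below converts a PROSITE-style pattern into a Python regex and scans a sequence for it. It is an illustration under stated assumptions, not the PROSITE software itself: it handles only the common pattern elements (`x`, `[..]`, `{..}`, repeat counts, and the `<`/`>` anchors), the function names are hypothetical, and the test sequence is made up. The pattern used, `N-{P}-[ST]-{P}`, is the N-glycosylation site pattern as it is usually quoted from PROSITE.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    regex = pattern.rstrip(".")          # patterns may end with a period
    regex = regex.replace("-", "")       # '-' only separates pattern elements
    regex = regex.replace("x", ".")      # 'x' means any amino acid
    regex = regex.replace("{", "[^").replace("}", "]")  # {P} = anything but P
    regex = regex.replace("(", "{").replace(")", "}")   # x(2,4) = 2 to 4 repeats
    regex = regex.replace("<", "^").replace(">", "$")   # N- and C-terminal anchors
    return regex


def find_motif(pattern: str, sequence: str) -> list:
    """Return (1-based position, matched text) for every hit, overlaps included."""
    # The lookahead lets the scan report matches that overlap each other.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(match.start() + 1, match.group(1)) for match in regex.finditer(sequence)]


# Made-up sequence for illustration only.
toy_sequence = "MKNGTLANPSQNVSA"
print(prosite_to_regex("N-{P}-[ST]-{P}"))          # N[^P][ST][^P]
print(find_motif("N-{P}-[ST]-{P}", toy_sequence))  # [(3, 'NGTL'), (12, 'NVSA')]
```

The asparagine at position 8 is skipped because it is followed by a proline, which the `{P}` element excludes.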
