
unit 1

Uploaded by

arsalarslan369
Copyright
© © All Rights Reserved

Bioinformatics

• Bioinformatics is an interdisciplinary scientific field that develops methods
for storing, retrieving, organizing and analyzing biological data.
• A major activity in bioinformatics is to develop software tools that
generate useful biological knowledge.
• Bioinformatics acts as a management information system for molecular
biology and has many practical applications.
• Bioinformatics is a relatively new discipline that combines molecular biology and
computer science.
• Genomic research, the sequencing of the human genome and advances
in disease-related research have both required and accelerated its
development.
• In bioinformatics, computers are used to store,
retrieve, analyze or predict the composition or the structure of
biomolecules. It is a hybrid of computer science and
biology.
History

• In 1970, Paulien Hogeweg and Ben Hesper coined the term "Bioinformatics" to
refer to the study of information processes in biotic systems.
• One early contributor to bioinformatics was Elvin A. Kabat, who
pioneered biological sequence analysis in 1970 and, with Tai
Te Wu, released comprehensive volumes of antibody sequences
between 1980 and 1991.
• Another significant pioneer in the field was Margaret Oakley
Dayhoff.
• At the beginning of the "genomic revolution", the term
bioinformatics was re-discovered to refer to the creation and
maintenance of databases that store biological information such as
nucleotide sequences and amino acid sequences.
Brief history of bioinformatics: Databases

• The first biological sequence database grew out of the Atlas of Protein
Sequence and Structure, first published by Margaret Dayhoff in 1965; it later
became the Protein Information Resource (PIR), established in 1984
• Dayhoff and co-workers organized the proteins into families and
superfamilies based on degree of sequence similarity
• The idea of sequence alignment was introduced, as well as special
tables that reflected the frequency of changes observed in the
sequences of a group of closely related proteins
• Currently there are several large protein databases: SwissProt, PIR
International, etc.
• The first DNA database was established in 1979. Currently there
are several powerful databases: GenBank, EMBL, DDBJ, etc.
Brief history of bioinformatics: other important steps

• Development of sequence retrieval methods (1970-80s)
• Development of principles of sequence alignment (1980s)
• Prediction of RNA secondary structure (1980s)
• Prediction of protein secondary structure and 3D (1980-90s)
• The FASTA and BLAST methods for DB search (1980-90s)
• Prediction of genes (1990s)
• Studies of complete genome sequences (late 1990s–2000s)
Collection and retrieval of data. Alignment methods.

• Sequencing (DNA, proteins)
• Submission of sequences to the databases
• Computer storage of sequences
• Development of sequence formats
• Conversion of one sequence format to another
• Development of retrieval and alignment methods
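
As an illustration of sequence formats and format conversion, the sketch below reads records in FASTA format (a `>` header line followed by sequence lines) and rewrites them as tab-separated text. The function names `parse_fasta` and `to_tabular`, the tab-separated target format and the example records are assumptions made for this illustration; they do not come from the slides.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {identifier: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first word after '>' as the identifier.
            parts = line[1:].split()
            current_id = parts[0] if parts else f"record_{len(records) + 1}"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: append to the record opened by the last header.
            records[current_id] += line.upper()
    return records


def to_tabular(records: Dict[str, str]) -> str:
    """Convert parsed records to a simple 'identifier<TAB>sequence' format."""
    return "\n".join(f"{name}\t{seq}" for name, seq in records.items())


# Hypothetical example input, not real database entries.
example = """>seq1 hypothetical example record
ATGGCC
ATTGTA
>seq2
GGGAAATTT
"""

records = parse_fasta(example.splitlines())
print(to_tabular(records))
```

Real converters handle many more formats (GenBank, EMBL and others); this sketch only shows the general idea of parsing one representation and emitting another.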
Prediction, reconstruction and classification

• Prediction of secondary and 3D structure of RNA
and proteins
• Gene prediction in prokaryotes and eukaryotes
• Prediction of promoters and other functional sites
• Reconstruction of phylogeny
• Genome analysis
• Classification of proteins and genes
Aims of bioinformatics
Bioinformatics has three aims:
1. The first aim, at its simplest, is to organize data in a
way that allows researchers to access existing
information and to submit new entries as they are
produced. Data curation, in which experts extract
important information from scientific texts such as
research articles and convert it into an
electronic format such as a biological database entry,
is an essential task, but the information stored in
these databases is essentially useless until analyzed.
2. The second aim is to develop tools and resources that
aid in the analysis of data. For example, having sequenced
a particular protein or DNA, it is of interest to compare
it with previously characterized sequences. This needs
more than a simple text-based search: programs such as
FASTA and PSI-BLAST must consider what constitutes a
biologically significant match. Developing such resources
requires expertise in computational theory as well as a
thorough understanding of biology.
3. The third aim is to use these tools to analyze the
data and interpret the results in a biologically
meaningful manner. Traditionally, biological
studies examined individual systems in detail and
frequently compared them with a few related
systems.
Major Bioinformatics Applications

• Genome annotation
• Protein structure prediction
• Systems biology
• Biomarker discovery
• Molecular epidemiology
Areas of current and future development of bioinformatics
• Molecular biology and genetics
• Phylogenetic and evolutionary sciences
• Different aspects of biotechnology, including the pharmaceutical and microbiological
industries
• Medicine
• Agriculture
• Eco-management
Use of internet in bioinformatics
The Internet is one of the most powerful tools of the information age and
serves as a platform for bioinformatics tools. It makes it possible to
search online for information that was previously available only by
visiting an information centre.

Areas of Services
The Internet provides various facilities for bioinformatics, such as:
• Bioinformatics research
• Courses in bioinformatics
• Resources
• Biological databases
• Construction tools
• Software resources
• WWW search tools
Bioinformatics Applications

Literature Retrieval
Searching PubMed
The literature citation database at the National Center for
Biotechnology Information (NCBI) is called PubMed. Use PubMed
to search journals and other literature on any biological or
chemical topic of interest. PubMed itself does not provide
full articles; only citations and abstracts are available to
view. PubMed Central contains full articles.
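
A PubMed search can also be issued programmatically. The sketch below only builds a query URL for NCBI's E-utilities ESearch service and does not send any request. The endpoint and the `db`, `term`, `retmax` and `retmode` parameters belong to E-utilities and are not mentioned in the slides; the function name `pubmed_search_url` and the search term are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (an assumption of this sketch; not named in the slides).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term: str, max_results: int = 20) -> str:
    """Build an ESearch URL that would return PubMed IDs matching `term`."""
    params = {
        "db": "pubmed",         # search the PubMed citation database
        "term": term,           # query text, same syntax as the PubMed search box
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical example query.
url = pubmed_search_url("BRCA1 AND breast cancer", max_results=5)
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return a list of PubMed IDs; as the text notes, the records behind those IDs are citations and abstracts, not full articles.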
Nucleotide Applications
Information Retrieval
There are numerous databases around the world containing
information useful for computational biologists. The main
ones are hosted by the National Center for Biotechnology Information
(NCBI), the European Bioinformatics Institute (EBI), and the
DNA Data Bank of Japan (DDBJ). The following applications
are tools that search these sites to retrieve a particular
sequence or to identify an unknown sequence by matching it to known ones.
Sequence Retrieval – Find the nucleotide sequence for a
gene of interest.
Sequence Identification – Find function and possible origin
of gene from a sequence.
Sequence Analysis
With these applications we can align two sequences, align multiple
sequences, and perform phylogenetic analyses.
One reason to do this is to determine which parts of the
sequences are conserved from one species to the next.
Another reason is to estimate how much an organism has diverged
from other organisms by comparing their DNA sequences.
In general, the more similar two gene sequences are to one another, the more
closely the organisms are related,
and the more dissimilar the two sequences, the more distantly the two genes
are related. In this way we can compare sequences to
determine how organisms have diverged, possibly as a result of
evolution.
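
The comparison described above can be sketched with a global pairwise alignment followed by a percent-identity calculation. The code below uses the Needleman-Wunsch dynamic-programming algorithm with a simple scoring scheme (match +1, mismatch -1, gap -1). The algorithm choice, the scores and the function names are assumptions made for this illustration; the slides do not name a specific method, and real tools use more elaborate scoring.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment; returns the two gapped sequences."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical short sequences.
aln_a, aln_b = global_align("ACGT", "ACGA")
print(aln_a, aln_b, percent_identity(aln_a, aln_b))  # ACGT ACGA 75.0
```

Percent identity here is computed over all alignment columns, gaps included; other conventions (for example dividing by the shorter sequence length) give different values for the same alignment.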
Single Sequence Alignments – Compare a query
sequence against a database of many sequences to find
similar entries.
Aligning Two Sequences – Compare two sequences with
one another for similarity and % identity.
Multiple Sequence Alignments – Compare three or more
sequences with one another to find conserved regions and
their % identity.
Restriction Enzyme Mapping – Determine the restriction enzyme cut sites in a
sequence.
Oligo-Primer Properties Calculator – This program
calculates the melting temperature (Tm) and the optical density (OD) of your
oligo.
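
Two of the tools listed above can be sketched in a few lines: locating recognition sites for a restriction enzyme and estimating an oligo's melting temperature. The example assumes the EcoRI recognition sequence GAATTC and the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, a rough estimate intended for short oligos. The function names and the example sequences are hypothetical, and OD calculation is not covered.

```python
from typing import List


def find_sites(sequence: str, site: str) -> List[int]:
    """Return 0-based start positions of every occurrence of `site` in `sequence`."""
    seq, site = sequence.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # allow overlapping occurrences
    return positions


def wallace_tm(oligo: str) -> int:
    """Estimate melting temperature (degrees C) with the Wallace rule: 2(A+T) + 4(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Hypothetical example sequences.
sites = find_sites("AAGAATTCGGGAATTCTT", ECORI_SITE)
tm = wallace_tm("ATGCATGCATGCATGC")
print(sites, tm)  # [2, 10] 48
```

The positions reported are where the recognition sequence starts, not the exact cut position within the site; production primer tools typically use more accurate nearest-neighbour models for Tm.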
Sequence Translation
Computational biologists often need to analyze their nucleotide
sequences, and a common way to do that is to study the
protein product. The following programs either
convert your DNA sequence into an amino acid (protein)
sequence or convert your protein into a DNA sequence that
could encode it, comparable to a complementary DNA (cDNA) sequence.
Translation – Converts nucleotide sequences into protein
sequences.
Backtranslation – Converts protein sequences into possible
nucleotide sequences. Because the genetic code is degenerate, the
result is only one of many DNA sequences that could encode the protein.
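
Translation and backtranslation can be sketched with the standard genetic code. The example below builds the codon table from the conventional TCAG ordering, translates a DNA sequence in reading frame 1 up to the first stop codon, and back-translates a protein by choosing one arbitrary codon per amino acid. The function names and the example sequence are assumptions made for this illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# For back-translation, keep the first codon seen for each amino acid.
# This choice is arbitrary: most amino acids have several valid codons.
REVERSE_TABLE = {}
for codon, aa in CODON_TABLE.items():
    REVERSE_TABLE.setdefault(aa, codon)


def translate(dna: str, to_stop: bool = True) -> str:
    """Translate a DNA (or RNA) sequence in frame 1; unknown codons become 'X'."""
    seq = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and to_stop:
            break  # stop at the first stop codon
        protein.append(aa)
    return "".join(protein)


def back_translate(protein: str) -> str:
    """Return one possible DNA sequence encoding `protein`."""
    return "".join(REVERSE_TABLE[aa] for aa in protein.upper())


# Hypothetical example sequence.
protein = translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
print(protein)                  # MAIVMGR
print(back_translate(protein))  # one of many possible coding sequences
```

Translating the back-translated sequence recovers the original protein, but the reverse does not hold: back-translating a protein does not recover the original DNA.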
Protein Applications
Information Retrieval
The many information retrieval sites on the Internet can
give valuable information about the sequence and
properties of a protein. Numerous databases exist, and each
database is accessible through convenient search
programs.
Protein Sequence Retrieval – Allows the user to retrieve a
sequence from a protein name, accession number, or GI
identification number.
Protein Identification – Allows the user to retrieve a protein
name or accession and GI numbers from a polypeptide
sequence.
Protein Analysis
After obtaining the identity or sequence of a protein, several
valuable tools allow further analysis of the protein. The
characteristic properties of a protein can be estimated
from its sequence. Sequence
alignment applications are also valuable: they establish the degree of similarity between
two or more proteins.
Determining Protein Sequence Properties – The user can find the
molecular weight (MW), isoelectric point (pI), titration curves,
hydrophobicity, etc. for a particular protein.
Protein Sequence Alignment – Align a single sequence to sequences
in a database.
Pairwise Sequence Alignment – Align two protein sequences to
each other.
Multiple Sequence Alignment – Align three or more sequences to
one another.
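
As a rough illustration of computing properties from sequence, the sketch below estimates the molecular weight of a protein from average residue masses and its mean hydrophobicity (GRAVY) on the Kyte-Doolittle scale. The residue masses are approximate average values, and the function names and example peptide are assumptions made for this illustration; pI and titration curves are not covered.

```python
# Approximate average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def molecular_weight(protein: str) -> float:
    """Approximate average molecular weight of a linear peptide, in daltons."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def gravy(protein: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical example peptide.
peptide = "MAIVMGR"
print(round(molecular_weight(peptide), 2), round(gravy(peptide), 3))
```

Both functions accept only the 20 standard amino acids and raise `KeyError` on any other symbol, which is a deliberate simplification.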
Bioinformatics is being used in the following fields:
• Microbial genome applications
• Molecular medicine
• Personalised medicine
• Preventative medicine
• Gene therapy
• Drug development
• Antibiotic resistance
• Evolutionary studies
• Waste cleanup
• Biotechnology
• Climate change studies
• Alternative energy sources
• Crop improvement
• Forensic analysis
• Bio-weapon creation
• Insect resistance
• Improved nutritional quality
• Development of drought-resistant varieties
• Veterinary science
Important Bioinformatics Databases
Database      URL                                    Content
GenBank       www.ncbi.nlm.nih.gov                   nucleotide sequences
Ensembl       www.ensembl.org                        human/mouse genome (and others)
PubMed        www.ncbi.nlm.nih.gov                   literature references
NR            www.ncbi.nlm.nih.gov                   protein sequences
SWISS-PROT    www.expasy.ch                          protein sequences
InterPro      www.ebi.ac.uk                          protein domains
OMIM          www.ncbi.nlm.nih.gov                   genetic diseases
Enzymes       www.chem.qmul.ac.uk                    enzymes
PDB           www.rcsb.org/pdb/                      protein structures
KEGG          www.genome.ad.jp                       metabolic pathways
UCSC          http://genome.ucsc.edu/                genome browser
SNPedia       http://snpedia.com/index.php/SNPedia
UniProt       http://www.uniprot.org/
PRIDE         http://www.ebi.ac.uk/pride/archive/
miRBase       http://www.mirbase.org/
Internet educational resources for Bioinformatics:

NCBI: sequence data repository, US Bioinformatics center. http://www.ncbi.nlm.nih.gov/
EBI: sequence data repository, European Bioinformatics center. http://www.ebi.ac.uk/
Pasteur: France Bioinformatics center; Bio Netbook is an excellent database of Internet
information for biosciences and Bioinformatics. http://bioweb.pasteur.fr/intro-uk.html
ExPASy/SWISSPROT: protein sequence data center. http://www.expasy.
Sanger: European sequencing, Bioinformatics center. http://www.sanger.ac.uk/
Weizmann: Israel Bioinformatics center. http://bioinformatics.weizmann.ac.il/
GenomeWeb: Bioinformatics resources. http://www.hgmp.mrc.ac.uk/GenomeWeb/
CSHL: US sequencing, Bioinformatics center. http://www.cshl.org/
WUSTL: US sequencing, Bioinformatics center. http://www.ibc.wustl.edu/
Stanford genome center: US sequencing, Bioinformatics center. http://genome-www.stanford.edu/
TIGR: US sequencing, Bioinformatics center. http://www.tigr.org/
Celera: US commercial sequencing, Bioinformatics center. http://www.celera.com/
GenomeNet: Japan Bioinformatics center. http://www.genome.ad.jp/
Bionet: Usenet network news for biology. http://www.bio.net/
BioMedNet: Bioinformatics resources including HMS Beagle. http://www.bmn.com/
BioInform: mostly commercial news and services; good list of
companies in Bioinformatics. http://www.bioinform.com/
