
Presentation of Bioinformatics

This document provides an outline and overview of a bioinformatics course. It begins with definitions of bioinformatics and related fields. It discusses the transition to "new biology" with high-throughput data generation. The motivation for bioinformatics is explained as managing the rapid growth and analysis of biological data. Sources of biological data from various omics studies are reviewed. The course plan is then outlined, covering topics like molecular biology, databases, sequence alignment, phylogenetics, microarrays, and networks over 15 lectures.

Uploaded by

Mudassar Samar

Bioinformatics
Lecture 1

Muhammad Usman Ghani Khan
UET Lahore

Outline

1 Introduction
Definitions
Related Fields
The New Biology
Motivation and Background

2 Sources of Biological Data

3 Course Plan

Definitions
Over 43,000 definitions are available on the internet
Definition 1: Bioinformatics is the application of computer technology to the management and analysis of biological data [1]
Definition 2: Biologists doing stuff with computers?
Definition 3: The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology
* Here we consider the use of Bioinformatics tools rather than their design and construction
* Here we consider the access and analysis of data and information items rather than their generation, storage or annotation

[1] European Bioinformatics Institute (EBI)

Definitions
Every application of computer science to biology
* Sequence analysis, image analysis, sample management, population modeling
Analysis of data coming from large-scale biological projects
* Genomes, transcriptomes, proteomes, metabolomes, etc.
Solving biological problems with computation?
Collecting, storing and analysing biological data?
Informatics - library science?
But: "I do not think all biological computing is bioinformatics, e.g. mathematical modelling is not bioinformatics, even when connected with biology-related problems. In my opinion, bioinformatics has to do with management and the subsequent use of biological information, particular genetic information." (Richard Durbin)

Definitions
What is not bioinformatics?
* Biologically-inspired computation, e.g., genetic algorithms and neural networks
* However, the application of neural networks to solve some biological problem could be called bioinformatics
* What about DNA computing?

Related Fields
Computational biology: Application of computing to biology (broad definition)
* Often used interchangeably with bioinformatics
Biometry: The statistical analysis of biological data
Biophysics: An interdisciplinary field which applies techniques from the physical sciences to understanding biological structure and function [2]
Mathematical biology: Tackles biological problems, but the methods it uses to tackle them need not be numerical and need not be implemented in software or hardware.

[2] British Biophysical Society

Related Fields
Computational biology and bioinformatics overlap: both use computational techniques to try to understand biological phenomena. Computational biology puts more emphasis on mathematical modelling to explain biological mechanisms, whereas bioinformatics has more to do with the storage and synthesis of experimental data (e.g. pattern recognition and data mining).

New Biology
Traditional Biology
* Small team working on a specialized topic
* Well-defined experiment to answer precise questions
New high-throughput biology
* Large international teams using cutting-edge technology defining the project
* Results are given raw to the scientific community without any underlying hypothesis

Examples of High Throughput
Complete genome sequencing
Simultaneous expression analysis of thousands of genes (DNA microarrays, SAGE)
Large-scale sampling of the proteome
Protein-protein interaction analysis: large-scale 2-hybrid (yeast, worm)
Large-scale 3D structure production (yeast)
Metabolism modeling
Biodiversity

Motivation
Rapid growth of biology-related data: an explosion of publicly available biological materials
* Modern molecular biology, and especially genomics, has led to vast quantities of data: DNA/protein sequences, gene expression.
* This mainly consists of vast strings/matrices of letters/numbers, which in their raw form are not very interesting.
Management problem: how to handle this data?
* Analysis
* Understanding
* Presentation
Approaches:
* Computing techniques are very good for extracting useful patterns.
* Bioinformatics consists of methods to address these problems.
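
As a small illustration of extracting patterns from a raw string of letters, the sketch below computes two simple summaries of a DNA sequence: its GC content and its counts of overlapping 3-letter words (k-mers). The sequence `toy_dna` and the function names are invented for this example and are not part of the course material.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical toy sequence, made up for this illustration.
toy_dna = "ATGCGCGATATATGC"

print(f"GC content: {gc_content(toy_dna):.2f}")
print("Most common 3-mers:", kmer_counts(toy_dna, 3).most_common(3))
```

On real data the same two summaries are typically computed over whole genomes, which is where the volume of data makes computation unavoidable.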

Motivation
In order to extract useful information, it is necessary to understand the biological principles involved.
In this course we will introduce some basic molecular biology/genomics and look at ways in which computers can be used to analyse it.

Motivation
Sample Ultimate Problems
What is the role of a particular gene?
Does a particular gene help cause a disease?
How does a drug affect a cell?
Can we insert a gene into corn to protect it against diseases or pests?
Can we design a drug to accomplish a particular purpose?
Can we build a cell that eats pollution?

Motivation
Why would a student choose this course?
To prepare for graduate study in Bioinformatics or Computational Biology.
To prepare for certain jobs in the pharmaceutical or biotechnology industries. The future is hard to predict. There are jobs related to high-tech agriculture (new varieties of plants), industrial organisms, biofuels, pharmaceuticals (designer drugs).

2 Sources of Biological Data

So what data can we generate?
Biological data can be generated at many different levels
Genomics (DNA)
Transcriptomics (RNA)
Proteomics (proteins)
Metabolomics (small compounds)
Lipidomics (lipids)
Hundreds of omics have been catalogued

What does an omics dataset look like?
In most cases datasets present a similar structure
Each sample is characterised by a large number of variables (RNA, proteins, lipids, etc.)
Each variable indicates (usually quantitatively) the presence of that element in the sample
Due to the high cost of most omics technologies, there are many more variables than samples
* Problems of over-fitting
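
The over-fitting problem can be made concrete with a small simulation. The sketch below builds a samples-by-variables table of pure random noise, with far more variables than samples, and shows that some variable still correlates strongly with an arbitrary class label by chance alone. All sizes, labels and names here are assumptions chosen for illustration, not values from any real omics study.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sizes: far more variables than samples.
n_samples, n_variables = 6, 5000

# Rows are samples, columns are variables; every value is pure noise.
data = [
    [random.gauss(0.0, 1.0) for _ in range(n_variables)]
    for _ in range(n_samples)
]

# Invented class labels (e.g. control vs. treated), unrelated to the noise.
labels = [0, 0, 0, 1, 1, 1]

# Correlate each variable (column) with the labels and keep the strongest.
best_abs_r = max(
    abs(pearson([row[j] for row in data], labels))
    for j in range(n_variables)
)

print(f"Strongest |r| among {n_variables} noise variables: {best_abs_r:.2f}")
```

A model that selected this "best" variable would look excellent on these six samples and fail on new ones, which is why analyses of such data usually rely on held-out samples or multiple-testing corrections.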

Research Areas
(Genome-scale) Sequence Analysis
* Sequence alignments, motif discovery, genome-wide association (to study diseases such as cancers)
Computational Evolutionary Biology
* Phylogenetics, evolution modeling
Analysis of Gene Regulation
* Gene expression analysis, alternative splicing, protein-DNA interactions, gene regulatory networks
Structural Biology
* Drug discovery, protein folding, protein-protein interactions
Synthetic Biology
High-throughput Imaging Analysis
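
To give a flavour of the first research area, the sketch below scores a global alignment of two short sequences by dynamic programming, in the style of the Needleman-Wunsch algorithm. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration. Real tools use substitution matrices and more elaborate gap penalties.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b by dynamic programming."""
    # prev[j] is the best score for aligning the processed prefix of a
    # against the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] aligned to a gap
                curr[j - 1] + gap,   # b[j-1] aligned to a gap
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy sequences, made up for this illustration.
print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("AC", "AGC"))
```

Only the score is returned here. Recovering the alignment itself would require keeping the full table and tracing back through it.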

3 Course Plan

Course Contents
Lecture 1
Introduction, Definitions.
Applications, Scope, Motivation.
Lecture 2
Introduction to molecular biology
Structure of DNA, RNA, Proteins
Announcement of term projects
Lecture 3
Bioinformatics Databases: GenBank, EMBL, Prot etc.
Practical demonstration of databanks and their structures.

Course Contents
Lecture 4
Database Formats: FASTA, seq, Data
Quiz 1
Lecture 5
Sequence Alignment; Sequence Motifs; Gene Finding
Practical demonstration of BioJava/.NET Bio tools for biology-related tasks
Lecture 6
Sequence Alignment (Part 2)
Computing with Biological Structures

Course Contents
Lecture 7
Phylogenetic Algorithms
Lecture 8
Mid-term break
Lecture 9
Microarray Data Analysis
Lecture 10
Term project presentations and discussion

Course Contents
Lecture 11
Comparative Genomics
Lecture 12
Proteomics
Lecture 13
Biological Ontologies; Biological Text Mining
Lecture 14
Genetic Networks
Lecture 15
Final Viva and term project submissions

Term Project Ideas
Architectures and data management techniques for the life sciences
Query processing and optimization for biological data
Biological data sharing and update propagation
Query formulation assistance for scientists
Modeling of life sciences data
Biomedical data integration issues in eScience
Laboratory information management systems in biology (including workflow systems)
Quality assurance in integrated data repositories
Biomedical metadata management (including provenance)
Mining integrated life sciences data and text resources
Standards for biomedical data integration and annotation
Scientific results arising from innovative data integration solutions

Term Project Ideas
Exposing biomedical data for integration purposes (APIs, Linked Open Data, SPARQL endpoints)
Creation and use of clinical data repositories
Data integration in clinical and translational research
Integration of genotypic and phenotypic data
Challenges and opportunities with big data in the life sciences
Ethical, legal and social issues with biomedical data integration

Useful Books
Bryan Bergeron, M.D.: Bioinformatics Computing, Prentice Hall, 2002 (freely available on internet).
Richard C. Deonier, Simon Tavaré & Michael S. Waterman: Computational Genome Analysis: An Introduction, Springer, 2005
Some other helpful books
* Alberts et al. - Molecular Biology of the Cell
* Stryer - Biochemistry
* Baldi and Brunak - Bioinformatics: The Machine Learning Approach
* Durbin, Eddy, Krogh and Mitchison - Biological Sequence Analysis
* Kanehisa - Post-genome Informatics
* Lesk - Introduction to Bioinformatics
* Orengo, Jones and Thornton - Bioinformatics
