Supercomputing & Computational Biology
Presented By: Ruchi Sharma, Swati Agarwal, Jayati Arora,
Vivek Goyal, Satyabrata Sahu
What is Bioinformatics?
Bioinformatics involves the integration of computers, software tools, and databases
in an effort to address biological questions.
Bioinformatics, Computational Biology & Supercomputing
Bioinformatics applies principles of information
sciences and technologies to make the vast, diverse,
and complex life sciences data more understandable
and useful.
Computational biology uses mathematical and
computational approaches to address theoretical and
experimental questions in biology.
Supercomputing is the use of supercomputers in biology.
Pioneer of Bioinformatics?
Dr. Margaret Oakley Dayhoff
Industry Overview
Global bioinformatics market: USD 1.2 bn in 2004
Overall market size projected to reach USD 1.7 bn by 2006
Indian bioinformatics market reached USD 22 million in 2004
Indian bioinformatics market is growing at a rate of 40% per annum
(Source: CII)
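The figures above can be related with simple compound-growth arithmetic. The sketch below is illustrative only: it assumes both global figures are in the same currency, and it assumes the 40% Indian growth rate stays constant, which the source does not claim. The function names are hypothetical.

```python
def implied_annual_growth(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1


def project(start, rate, years):
    """Value after `years` of compounding at a constant annual `rate`."""
    return start * (1 + rate) ** years


# Global market: 1.2 bn (2004) to a projected 1.7 bn (2006).
global_rate = implied_annual_growth(1.2, 1.7, 2)

# Indian market: USD 22 million (2004), ASSUMING 40% per annum holds for 2 years.
india_2006 = project(22.0, 0.40, 2)

print(f"Implied global growth: {global_rate:.1%} per annum")
print(f"Illustrative Indian market in 2006: USD {india_2006:.1f} million")
```

Under these assumptions the global projection implies roughly 19% growth per annum, well below the 40% quoted for India.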
BIOGRID INDIA
[Network diagram. Labels: Internet, Service Providers, VPN Network, Router. Connected sites:]
MK Univ., Madurai
IGIB, Delhi
South Campus, DU, Delhi
IMT, Chandigarh
Deptt. of Biotechnology, Delhi
Univ. of Pune
IIT, Delhi
CDFD, Hyderabad
NBRC, Gurgaon
NII, Delhi
IISc, Bangalore
JNU, Delhi
Components of Bioinformatics?
Biological Data
Hardware
Software
Databases
Biological Data
• Biology is a data-rich science; the need for
storing and communicating large datasets has
grown tremendously.
• Types of data
nucleotide sequences
protein sequences
protein sequence patterns or motifs
macromolecular 3D structure
gene expression data
metabolic pathways
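As a minimal sketch of how two of these data types are handled in practice, the example below reads a nucleotide sequence in FASTA format and locates a sequence motif. The record `example_seq` and its sequence are invented for illustration, and the helper names `parse_fasta` and `find_motif` are hypothetical, not taken from any tool named in this presentation.

```python
import re

# Hypothetical FASTA record, made up for this example.
FASTA_TEXT = """>example_seq hypothetical record for illustration
ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG
"""


def parse_fasta(text):
    """Return {identifier: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after '>'.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def find_motif(sequence, pattern):
    """Return 0-based start positions of every match, including overlaps.

    `pattern` is a regular expression, so degenerate motifs such as
    "GG[CT]" can be written directly.
    """
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence)]


records = parse_fasta(FASTA_TEXT)
seq = records["example_seq"]
positions = find_motif(seq, "ATG")
print(len(seq), positions)
```

The lookahead in `find_motif` is a design choice that reports overlapping matches, which a plain `re.finditer` on the motif would skip.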
Bioinformatics Hardware
Compute Clusters (collection of computers that are highly
interconnected via a high-speed network)
Login Servers (provide a single secure login point for access to
all other machines)
File Servers (terabytes of space reserved for projects and
database hosting)
Backup Servers
Web Servers
Master & Fail-over Servers (for user authorization &
authentication)
Memory Servers (run programs that require
huge amounts of memory)
Workstations
Storage Media
Bioinformatics Hardware
Compute Cluster
Software
Operating System:
Linux: Debian and Ubuntu
UNIX: Solaris and IRIX
Bioinformatics Software: used for visualization and
analysis of biological information.
ChemGenome 2.0
PreDDICTA
Bhageerath
Sanjeevini
Trend of OS used
Databases
Primary or derived data
Primary databases: experimental results entered directly into
the database
Secondary databases: results of analysis of primary databases
Aggregate of many databases
Links to other data items
Combination of data
Consolidation of data
Technical design
Flat-files
Relational database (SQL)
Object-oriented database (e.g. CORBA, XML)
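To illustrate the difference between the first two designs, the sketch below parses a small flat file and loads it into a relational table using Python's built-in `sqlite3` module. The flat-file layout (two-letter tags, `//` record terminator), the identifiers `EX0001` and `EX0002`, and the table schema are all assumptions made for this example; they do not describe any specific database.

```python
import sqlite3

# Hypothetical flat file: tagged lines, records terminated by "//".
FLAT_FILE = """\
ID   EX0001
DE   Hypothetical protein A
SQ   MKTAYIAKQR
//
ID   EX0002
DE   Hypothetical protein B
SQ   MSDNELLQ
//
"""


def parse_flat_file(text):
    """Return a list of {tag: value} dicts, one per record."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("//"):
            if current:
                records.append(current)
            current = {}
        elif line.strip():
            # Tag is the first two characters; value starts at column 6.
            current[line[:2]] = line[5:].strip()
    return records


# Relational design: the same records become rows that SQL can query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE protein (id TEXT PRIMARY KEY, description TEXT, sequence TEXT)"
)
conn.executemany(
    "INSERT INTO protein VALUES (:ID, :DE, :SQ)", parse_flat_file(FLAT_FILE)
)

rows = conn.execute(
    "SELECT id, LENGTH(sequence) FROM protein ORDER BY id"
).fetchall()
print(rows)
```

A flat file is easy to distribute and read, while the relational form makes ad hoc queries such as the length lookup above a single SQL statement.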
Databases
There are two main functions of biological databases:
1. Make biological data available to scientists.
As much as possible of a particular type of information should be available in
one single place (book, site, database). Published data may be difficult to find
or access, and collecting it from the literature is very time-consuming. Moreover,
not all data is published explicitly in an article (genome sequences, for example).
2. Make biological data available in computer-readable form.
Since analysis of biological data almost always involves computers, having
the data in computer-readable form (rather than printed on paper) is a
necessary first step.
One of the first biological sequence databases was probably the book "Atlas
of Protein Sequence and Structure" by Margaret Dayhoff and colleagues,
first published in 1965. It contained the protein sequences determined at the
time, and new editions of the book were published well into the 1970s. Its
data became the foundation for the PIR database (Protein Information
Resource).
Research Methodology
Place of Research
SCFBio, IIT, New Delhi
Sources Of Data Collection
Primary (Questionnaires & Interview)
Secondary (Internet & Books)
Sample Size
15
Research Results
In which field is most of the software developed?
[Bar chart; values not recoverable. Categories: genome analysis, nucleotide databases, protein structure prediction, drug designing. Axis scale: 0 to 6.]
Has software made bioinformatics easier?
[Chart; values not recoverable. Options: yes, no, can't say.]
Has supercomputing made research and
development in biology easier?
[Chart; values not recoverable. Options: yes, no, can't say.]
Have you heard about Super
Computing Facility for Bio-informatics
& Computational Biology?
[Pie chart. Options: yes, no, can't say. Shares shown: 67%, 22%, 11%; which share belongs to which option is not recoverable.]
Have you bought any software by SCFBCB?
[Chart; values not recoverable. Options: yes, no, can't say.]
Research Analysis
• Drug discovery is an emerging area in India and currently attracts
the most research activity.
• According to our survey, implementation is economical in India and
software development is cost-effective. The initial cost is high, but
once the development cost is recovered, the technology proves very
cost-effective. We therefore conclude that India has great potential
for software development.
• Scientists also believe that this technology has helped
bioinformatics considerably. Growth in the IT sector is contributing
greatly to the development of bioinformatics.
Recommendations
MORE INVESTMENT SHOULD TAKE PLACE IN
BIOINFORMATICS RESEARCH AND IT
DEVELOPMENT SO THAT WE CAN HAVE BETTER
SOFTWARE TECHNOLOGY.
THE COST OF IMPLEMENTATION IS
REASONABLE IN INDIA, BUT THE COST OF
DEVELOPMENT IS HIGH, WHICH HOLDS BACK
R&D IN THIS SECTOR. HENCE STEPS SHOULD BE
TAKEN TO BRING DOWN THE COST OF
DEVELOPMENT AS WELL.
thank
you