A Review Article On Bioinformatics Tools and Software

This document provides an overview of several bioinformatics tools and databases, including: 1) PIR and GenBank, which are protein and nucleotide databases that allow researchers to submit and analyze biological sequence data. 2) TAIR, a database specific to the model organism Arabidopsis thaliana that contains genome, gene expression, and other data. 3) Descriptions of how these tools work, including submitting data through formats like FASTA, performing BLAST searches to find matching sequences, and accessing bulk data through FTP downloads. The tools provide resources for storing, analyzing, and interpreting large amounts of genomic and proteomic data.

Uploaded by

kashif Waseem


A Review Article on Bioinformatics Tools and Software

Abstract
This is an era of scientific research, discovery, and invention. Biological science is growing
rapidly, and bioinformatics, a relatively young and fast-growing discipline, has contributed
greatly through its many applications. A large and exponentially growing volume of nucleotide
and amino acid sequence data is now available. Data on this scale was difficult to handle and
analyze, so bioinformatics emerged to manage it with the help of computer software. It helps in
storing, analyzing, and interpreting the large biological datasets of what are now called
genomics and proteomics. Storing this data required the establishment of databases, which hold
the data in computer systems and make it publicly available worldwide over the internet.
Well-known databases and resources include GenBank, NCBI, EMBL, DDBJ, Swiss-Prot, UniProt, TAIR,
and GEO. These allow researchers to submit and analyze biological data. Data that is not
retrieved or analyzed is of no use, so scientists have developed tools and software that enable
researchers to retrieve, analyze, and interpret the data held in these databases. The data is
then used for many biological purposes, such as phylogenetic tree construction, drug
development, and other real-world applications. The tools support comparative studies,
structure and function prediction, and expression analysis of biological molecules such as DNA
and proteins. They include BankIt, Entrez, GeneQuiz, GENSCAN, Readseq, Modeller, and I-TASSER.
This review describes the background, specialty, and functioning of these tools and software,
most of which are freely available and commonly used in bioinformatics for research and
academic purposes.

Introduction
Bioinformatics, often also called computational biology, is an interdisciplinary field that came
into existence through the merger of biology, statistics, and computer science. It has deep
roots: work on genomics gathered pace after Watson and Crick discovered the structure of DNA.
The first gene sequence was determined in 1972 at the Laboratory of Molecular Biology of the
University of Ghent, Belgium, by Walter Fiers et al.; it was the gene for the coat protein of
bacteriophage MS2 [1]. The same team then determined the complete nucleotide sequence of
bacteriophage MS2 RNA in 1976 [2]. The first DNA genome, that of bacteriophage φX174, was
sequenced by Sanger and colleagues in 1977 [3]. In 1995 the first genome of a free-living
organism, Haemophilus influenzae, was sequenced, much faster than the earlier projects. Since
then, refinements of the Sanger technique have allowed the genomes of many organisms to be
sequenced at an ever faster pace. The Human Genome Project published a draft sequence in 2001,
and the finished sequence points to roughly 20,000 to 25,000 protein-coding human genes [4].
The field then became a major interest of bioinformaticians and biologists, who began sequencing
the genomes of different organisms, determining protein structures and functions, and predicting
gene structure. Computer software and modern technologies were used to build genomic and
proteomic databases. To date, the genomes of many prokaryotes and eukaryotes, including plants
and animals, have been sequenced and the data stored in databases. Tools and software have also
been developed to retrieve, analyze, and interpret this data and to make maximum use of it.

i. PIR
The Protein Information Resource (PIR) is an online resource for protein sequence
databases. It was founded in 1984 by the National Biomedical Research Foundation
(NBRF) to assist researchers in identifying and interpreting protein sequence
information. It now operates as PIR-International in collaboration with MIPS and
JIPID [5].
Specialty
It is a comprehensive resource for protein sequence information, and one can easily
find information about any amino acid sequence. Its search tools cover annotated
protein sequences, domain search, combined global and domain search, and interactive
and text searches [6]. It also provides family and superfamily classification,
sequence search and analysis, similarity search, and sequence alignment [7]. Its
files are also available by FTP. The resource is freely available online at
http://pir.georgetown.edu/ .
Functioning
The official PIR website also links to several other resources, such as iProLink,
iPTMnet, the PRO Protein Ontology, and UniProtKB. Input is provided as an amino acid
sequence, for example [8]:

>P1;CRAB_ANAPL
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
KLVQP MLAAVR DDKKLVQ PMRFTS
DRDPSV YYRNQSEL KRQA HEIPIS
KRSATV PQVLLS QKRPLTV
SDVPERSIPI TREEKPAIAG AQRK*
This is NBRF format. After setting all required parameters, click the search button.
The tool searches for all annotated sequences and peptides that match the query
sequence and lists them in the output.
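
To make the layout of the NBRF record above concrete, the following is a minimal Python sketch of a reader for it. It assumes only what the example shows: a header line of the form ">TYPE;CODE", one description line, and residue lines that may contain spaces and end with "*". The function name parse_nbrf and the EXAMPLE constant are hypothetical names chosen for illustration; this is not PIR's own software.

```python
# Hypothetical helper for illustration; not part of any PIR tool.
EXAMPLE = """>P1;CRAB_ANAPL
ALPHA CRYSTALLIN B CHAIN (ALPHA(B)-CRYSTALLIN).
KLVQP MLAAVR DDKKLVQ PMRFTS
DRDPSV YYRNQSEL KRQA HEIPIS
KRSATV PQVLLS QKRPLTV
SDVPERSIPI TREEKPAIAG AQRK*
"""


def parse_nbrf(text):
    """Return a list of dicts with keys 'type', 'code', 'description', 'sequence'."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    records = []
    i = 0
    while i < len(lines):
        if not lines[i].startswith(">"):
            i += 1  # skip anything that precedes the first header line
            continue
        # Header line: ">P1;CRAB_ANAPL" -> type "P1", code "CRAB_ANAPL"
        seq_type, _, code = lines[i][1:].partition(";")
        description = lines[i + 1] if i + 1 < len(lines) else ""
        i += 2
        residue_lines = []
        while i < len(lines) and not lines[i].startswith(">"):
            line = lines[i]
            i += 1
            residue_lines.append(line)
            if line.endswith("*"):  # "*" terminates the sequence
                break
        # Keep letters only: drops the spaces between blocks and the final "*"
        sequence = "".join(ch for ch in "".join(residue_lines) if ch.isalpha())
        records.append({
            "type": seq_type,
            "code": code,
            "description": description,
            "sequence": sequence,
        })
    return records


if __name__ == "__main__":
    for record in parse_nbrf(EXAMPLE):
        print(record["code"], len(record["sequence"]), "residues")
```

A parser like this is mainly useful for checking that a sequence is well formed before it is pasted into a search form.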

ii. GenBank
GenBank is a web-based resource that holds the largest public nucleotide sequence
database. It was first released in 1982, having been established by Walter Goad and
colleagues at Los Alamos National Laboratory. The database is now maintained by the
National Center for Biotechnology Information (NCBI) in the United States. It has
grown exponentially, roughly doubling in size every 18 months [9].
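
The growth claim can be turned into simple arithmetic. The short Python sketch below assumes an idealized, constant 18-month doubling time, which is a simplification of the figure quoted above and not a fitted model of GenBank's actual release statistics; the function name growth_factor is hypothetical.

```python
def growth_factor(months, doubling_months=18.0):
    """Factor by which a quantity grows after `months`, given a fixed doubling time."""
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # Under the stated assumption, how much larger is the database after N years?
    for years in (1.5, 3, 5, 10):
        print(f"after {years} years: x{growth_factor(years * 12):.1f}")
```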
Specialty
It holds the largest collection of nucleotide data, with sequences from more than
300,000 organisms, amounting to more than 150 billion nucleotides in more than 162
million sequences [10]. One can easily find matching nucleotide sequences and their
annotations. It was built to support research and for general use in finding
matching nucleotide sequences. Sequences can be submitted directly or through BankIt
and other tools. NCBI does not restrict researchers from submitting data, so the
database holds the most up-to-date nucleotide data. It can be reached through
several routes, most directly through the official website
https://www.ncbi.nlm.nih.gov/genbank/ . It is designed to accept bulk submissions
such as ESTs, genome survey sequences (GSS), and NGS, WGS, and other high-throughput
data from sequencing centers. It offers uniform and comprehensive nucleotide
sequence information, available free of charge over the internet through web-based
retrieval and analysis services and also via FTP [11].
Functioning
GenBank accepts direct submissions as well as submissions made through dedicated
tools. Once a sequence is submitted, the staff examine it thoroughly, including for
originality; an accession number is then assigned and the record is made publicly
available online. It accepts simple mRNA or DNA sequences, ESTs, STSs, and GSSs,
also in bulk. BAC, YAC, and cosmid genomic sequences are submitted through the
available tools [12]. Complete microbial genomes and whole genome shotgun (WGS)
sequences of prokaryotes and eukaryotes can also be submitted.

Sequences can be stored in many formats, and different software is used to interpret
them. FASTA is the most widely recognized, accepted, and frequently used format for
submission, for example:
>gi|568815581:c7687550-7668402 Homo sapiens chromosome 17, GRCh38 Primary Assembly
GATGGGATTGGGGTTTTCCCCTCCCATGTGCT
CATCTAGAGCCACCGTCCAGGGAGCAGGTAGC
TGCTGGGCTCTCCACGACGGTGACACGC--------
Data can be retrieved through the Entrez system, and BLAST sequence similarity
searches are also available [13]. Data files can be obtained via FTP in FASTA and
other readable formats. Bulk retrieval is done with command-line applications that
use FTP. Perl and Python are well suited to retrieving biological data.
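
As an illustration of how a FASTA file obtained this way might be read in Python, here is a minimal sketch that uses only the standard library. The function name parse_fasta is hypothetical, and the sketch assumes plain, uncompressed text in which each record starts with a ">" header line; established libraries handle many more edge cases.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    demo = ">seq1 first demo record\nGATGGGATTGGGG\nTTTTCC\n>seq2\nCATCTAGAGCC\n"
    for name, seq in parse_fasta(demo):
        print(name, len(seq))
```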

iii. TAIR
The Arabidopsis Information Resource (TAIR) is a database dedicated to the model
organism Arabidopsis thaliana [14]. It serves as a gateway to research by providing
genome and proteome information about this model plant. It is a comprehensive and
widely used database for research in plant biology. It was established in 1999 with
funding from the National Science Foundation in the USA.
Specialty
Information on the genome, gene expression, proteome, variants, mutant alleles, and
phenotypes of the plant Arabidopsis thaliana can be accessed from this
organism-specific database. TAIR also presents sequence polymorphisms, genetic and
physical markers, seed stocks, mutants, metabolism and metabolic pathways, gene
structure, genome maps, and clones for researchers and the public. It also stores
functional and structural annotations. Information on individual genes can be
obtained as well. The database is freely available on the web at
https://www.arabidopsis.org/ . TAIR has almost 28,000 registered users, and the
website is accessed by over 60,000 unique visitors per month [15].
Functioning
After opening the TAIR webpage, choose the type of record to search for (for example
gene or EST) from the drop-down list in the top right corner of the page, enter the
name of the gene, and press search. For a stress-responsive gene, for example, the
search returns all the records that match it. Clicking one of the links on the
results page opens the detailed page for that gene.

Search results can be downloaded in different formats, and the genome and its
annotations can be visualized with tools such as SeqViewer and GBrowse, which serves
most research and analysis purposes [16].

iv. GEO
The Gene Expression Omnibus (GEO) is a database that stores high-throughput
expression data, including functional genomics datasets obtained with microarray and
sequence-based technologies [17]. It is a web-based resource that primarily stores
gene expression data. It was created in 2000 to hold the microarray gene expression
data that the research community had begun to produce. Because microarray expression
data is reused in many studies and analyses, GEO provides easy access to this
repository. It is maintained and provided by NCBI.
Specialty
It is a leading online resource for depositing and archiving expression data. Its
objective is to facilitate independent evaluation of results, reanalysis, and full
access to all parts of a study. Most of the data in GEO is original data submitted
by researchers: over 8,000 laboratories have submitted over 0.5 million public
samples, and GEO contains expression data for over 1,300 different organisms. It can
be accessed at https://www.ncbi.nlm.nih.gov/geo/ and is free of charge for
researchers and the public. The data stored in GEO is MIAME compliant and can easily
be downloaded. GEO offers simple submission methods and formats that support
well-annotated data deposits by researchers.
Functioning
GEO mainly contains MIAME-compliant microarray expression data and ChIP-chip data.
To submit data, go to the home page and navigate to the relevant submission type,
such as array, RT-PCR, or high-throughput submissions. GEO organizes its content
into platforms (each describes a specific product set), series (a set of related
samples), samples (individual arrays, whose accessions start with GSM), datasets,
and profiles.

All data in GEO can easily be downloaded in a number of formats. It can be obtained
via FTP, by direct URL, through Entrez GEO DataSets queries, and through GEO BLAST
queries. Bulk downloads are usually performed through the GEO FTP site. The MINiML
and SOFT directories are also used for downloading data, and supplementary and
annotation directories are available as well.
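
The record types listed above can be told apart by their accession prefix. The sketch below is a small Python helper for doing so. The text itself only mentions GSM for samples; the other prefixes used here (GPL for platforms, GSE for series, GDS for curated datasets) are added from general knowledge of GEO and should be checked against the GEO documentation. The function name classify_geo_accession is hypothetical.

```python
import re

# Prefix-to-type mapping; only GSM is stated in the text, the rest is an assumption
# based on GEO's usual accession scheme.
GEO_PREFIXES = {
    "GPL": "platform",
    "GSM": "sample",
    "GSE": "series",
    "GDS": "dataset",
}

_ACCESSION = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def classify_geo_accession(accession):
    """Return the record type for a GEO accession, or None if it is not recognized."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        return None
    return GEO_PREFIXES[match.group(1)]


if __name__ == "__main__":
    # The accession numbers below are arbitrary examples of the format.
    for acc in ("GSM12345", "GSE1000", "GPL96", "GDS507", "ABC1"):
        print(acc, "->", classify_geo_accession(acc))
```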
v. Sequin
Sequin is a software tool for submitting sequence data to GenBank, DDBJ, and EMBL.
It is developed and maintained by NCBI and can handle both simple and complex
submissions [18].
Specialty
It is a sequence submission tool that handles simple submissions, which may contain
only a short mRNA sequence, as well as complex submissions with multiple
annotations, phylogenetic and population studies, gapped sequences, and other
multi-sequence sets. It is available as offline software that installs easily on a
PC. It performs well when a Sequin file contains fewer than 10,000 sequences and
becomes noticeably slower with larger submissions. It can be downloaded from
https://www.ncbi.nlm.nih.gov/projects/Sequin/index.html or via FTP.
Functionality
Obtain the sequence, open Sequin, and start a new submission; then set the
parameters and provide the other information requested. Sequin can be used in either
of two modes, stand-alone or network-aware. Sequin MacroSend allows the user to send
larger files. Very large sequence files may instead be submitted with tbl2asn, a
command-line program that accompanies Sequin [18].
vi. BankIt
BankIt is a web-based tool for submitting sequence data to GenBank. It is developed
and maintained by NCBI [19].
Specialty
A single sequence can easily be submitted through it, as can a few unrelated
sequences, sequences with different source information, and small batches of
sequences. It can be accessed at https://www.ncbi.nlm.nih.gov/WebSub/?tool=genbank
and is free of charge and readily available to all researchers and the public.
Functioning
Prepare your sequence and go to the “Sequence submission Category” page of the
official BankIt website. Before submitting the sequence you will be asked to fill in
several fields, such as contact information, nucleotide sequence information, and
general information. You must also state the submission category, that is, whether
the sequence was assembled from existing primary data or sequenced directly, and
complete the source information, PCR primer, and feature fields (coding regions,
exons, introns, and so on). Once all the information is provided, BankIt displays
the finished flat file so that corrections can be made if needed. Provide the
correct accession numbers for the gene, then complete the submission as an original
or a third-party submission. Primary data is WGS; contig data is not regarded as
primary data in submission queries [19].
vii. Readseq
Readseq is a tool that converts biosequences between popular formats such as FASTA,
GenBank, and EMBL [20]. It was originally written in 1989 as a sequence analysis
program, but its simple command-line interface turned it into a general conversion
program for bioinformatics [21]. A web version is hosted and maintained by EMBL-EBI.
Specialty
Sequence data comes in a wide variety of formats, and it is often necessary to
convert one format into another for convenience. Doing this is the specialty of
Readseq. Readseq is also available offline: the software can be downloaded and
installed on a PC. It is freely available at https://www.ebi.ac.uk/Tools/sfc/readseq/ .
Functionality
Sequence input: One or more sequences in any of the EMBL, FASTA, GenBank, GCG,
PHYLIP, Swiss-Prot, UniProt, NBRF, and PIR formats can be given as input. Partially
formatted data is not accepted, and input is limited to 2 MB. Only a sequence in a
valid file format can be uploaded [22].
Output sequence: Many output formats are supported, including EMBL, GenBank, FASTA,
Clustal, ACEDB, DNA Strider, Flat Feat, GCG, GFF, XML, PIR, NBRF, and MSF. The
letter case of the output can also be set, and gaps can be removed. After setting
all parameters and providing the required information, such as an email address, the
job is submitted. The results are sent to that email address or, in some cases,
displayed directly in the browser window, ready for interpretation.
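
Two of the output options mentioned above, setting the letter case and removing gaps, are easy to picture in code. The following Python sketch applies them to FASTA text. It is not Readseq itself and makes no claim about how Readseq is implemented; the function name normalize_fasta, the treatment of "-" and "." as gap characters, and the default line width of 60 are assumptions made for illustration.

```python
def normalize_fasta(text, case="upper", remove_gaps=True, width=60):
    """Rewrite FASTA text with a uniform letter case, optional gap removal, and rewrapping."""
    if case not in ("upper", "lower"):
        raise ValueError("case must be 'upper' or 'lower'")

    # First pass: collect (header_line, sequence) pairs
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    # Second pass: transform each sequence and wrap it at `width` characters
    out = []
    for header, seq in records:
        if remove_gaps:
            seq = seq.replace("-", "").replace(".", "")  # assumed gap characters
        seq = seq.upper() if case == "upper" else seq.lower()
        out.append(header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(normalize_fasta(">demo\nac-gt\nAC.GT\n", case="lower", width=4), end="")
```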

viii. Entrez
Entrez is a text-based data retrieval system used across the major databases, such
as Complete Genomes, PubMed, Taxonomy, Protein Structure, Protein Sequence, and many
others. It was established in 1991 and is distributed by NCBI [23]. Initially it
covered only PDB and GenBank nucleotide sequence data.
Specialty
It is an integrated global query and retrieval system that gives access to all
available databases simultaneously with a single query string. Data on sequences,
structures, and references can be retrieved efficiently with Entrez. It also
provides views of proteins, genes, and chromosome maps. It is a web-based tool,
freely available for research purposes and to the public, and is described on the
NCBI site at
https://www.ncbi.nlm.nih.gov/Class/MLACourse/Original8Hour/Entrez/index.html .
Functioning
The Entrez front page gives access to the global query system. It supports Boolean
operators, and the databases can be searched by providing a single query string.
Search tags limit the search to specified fields. The result is a unified page that
shows the search hits in each database and links to the actual search results in
those databases. The Limits feature lets users narrow down their search results.
The History feature records previously performed queries, which can be referred to
by number or combined with Boolean operators. Results can also be saved to the
clipboard, and a My NCBI account allows queries to be saved indefinitely. Results
can also be emailed. The tool is widely used in biotechnology for research and study
purposes [24].
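
The combination of field tags and Boolean operators described above can be sketched in Python. The example builds a query string and a request URL for ESearch, part of NCBI's Entrez Programming Utilities (E-utilities), which the text above does not mention and which is introduced here only as an assumption about how one might query Entrez from a script. The helper names build_query and esearch_url are hypothetical, and the sketch only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch service
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(terms, operator="AND"):
    """Join (text, field) pairs into one Entrez query string.

    `field` may be None for an untagged term, e.g. ("insulin", None).
    """
    if operator not in ("AND", "OR", "NOT"):
        raise ValueError("operator must be AND, OR, or NOT")
    parts = []
    for text, field in terms:
        parts.append(f"{text}[{field}]" if field else text)
    return f" {operator} ".join(parts)


def esearch_url(db, term, retmax=20):
    """Return an ESearch URL for the given database and query (no request is sent)."""
    return ESEARCH_URL + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


if __name__ == "__main__":
    query = build_query([("Homo sapiens", "Organism"), ("TP53", "Gene Name")])
    print(query)
    print(esearch_url("nucleotide", query, retmax=5))
```

Building the URL separately from sending it keeps the query logic easy to inspect before any request is made.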
ix. GeneQuiz
GeneQuiz is a web-based tool for large-scale biological sequence analysis. It was
first launched in 1993 for automated genome analysis by EBI [25]. With so much data
available in different databases, computational analysis became a challenge for
researchers, and this tool helps with batch sequence analysis and annotation.
Specialty
It can analyze long sequences and provides a single user-friendly interface that
hides the underlying complexity; the tool is fully automated and web-based. For each
query sequence it derives a single functional annotation, presented in an extensive
report that lets the user trace the various aspects of the query in detail. The
primary goal of GeneQuiz is sequence analysis in the form of automated functional
annotation, and its secondary goal is the presentation of the supporting information
abstracted from the different sequence analyses. It processes large amounts of
sequence information quickly and consistently and uses regularly updated sequence
databases. Note that GeneQuiz works only with protein sequences: a DNA sequence must
be converted into an amino acid sequence before it can be processed. It runs a set
of analyses that provide functional annotation, but it does not allow the user to
compare results between different queries. It can be accessed online at
http://www.sander.ebi.ac.uk/gqsrv/submit and is free of charge for researchers
making queries and for study purposes.
Functioning
GeneQuiz takes amino acid sequences as input, and more than one sequence can be
submitted at a time. Input is usually in FASTA format, although some other formats
are supported. After submission, GeneQuiz runs an automated analysis that searches
the available protein databases, such as UniProt and Swiss-Prot, and finally
generates a web-based report that summarizes the results for the user [26]. A query
usually takes five to fifteen minutes, but it can take longer if the queries are
large or the servers are heavily used. GeneQuiz has four modules: GQreason,
GQupdate, GQbrowse, and GQsearch [27]. It combines heterogeneous information in its
reasoning procedure and uses different sequence similarity methods, of which FASTA
and BLASTP are currently the most widely used. The tool continues to be updated by
its maintainers for better and more efficient results.
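
Because GeneQuiz accepts only protein sequences, a DNA sequence has to be translated first. The Python sketch below shows one way to do this with the standard genetic code. It translates a single reading frame starting at the first base and assumes the input is a coding sequence on the forward strand; choosing the correct frame and strand is outside its scope. The names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, to_stop=True):
    """Translate a DNA (or RNA) coding sequence in frame 1 into one-letter amino acids.

    Codons containing unknown characters become "X". If `to_stop` is true,
    translation ends at the first stop codon; otherwise stops are written as "*".
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCAAATAA"))
```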
x. GENSCAN
GENSCAN is gene prediction software. It is used to identify genes and their
structure in genomic DNA: it locates the exons and introns of a gene and their
boundaries and can predict the complete structure of a gene. GENSCAN was developed
by Chris Burge in 1997 [28].
Specialty
It analyzes genomic DNA sequences from a variety of organisms, such as vertebrates
(including humans), invertebrates, and plants. Given a genomic sequence, it
determines the most probable gene structure under a probabilistic model of the gene
structure and composition of the genomic DNA of the given organism [29]. It then
outputs a file listing the predicted exons and genes together with the predicted
peptide sequences. It is available both on the web and as an offline version that
can be downloaded and installed on a PC. Its general features include the ability to
predict multiple genes in a sequence, to predict partial as well as complete genes,
and to predict sets of genes present on both strands of the DNA. It uses distinct
sets of model parameters to account for differences in gene structure and
composition, particularly between regions of different G+C content in the human
genome. Published data show a distinct improvement in the accuracy of GENSCAN over
other available tools. The program searches sequence databases with BLASTP to detect
possible homologs [30]. The web server can be accessed at
http://genes.mit.edu/genomescan.html . It is free of charge for research and study
purposes.
Functioning
It predicts the structure of genes by searching against databases. Its probabilistic
model accounts for many essential properties of gene structure and composition, such
as the number of exons per gene, gene density, the reading frame,
composition-dependent initiation and termination signals, and TATA box and cap
signals.
It takes a DNA sequence as input in one of the supported formats. FASTA is the usual
choice, and EMBL, GenBank, LOCUS, and CDS formats can also be used [31].
The output is printed on the screen and can be saved to a file named SEQFILE.out.
The run time is about 0.8 s per kb for an average input file [31]. The program works
in the following steps: first, the sequence and parameters are stored in allocated
arrays; then the sequence is scored using the probabilistic model; finally, the
predicted gene structure is displayed on the screen as a report. The predicted exons
and introns of each gene are also shown in graphical form [31].
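
Two quantities mentioned above, the G+C composition of a sequence and the quoted run time of about 0.8 s per kb, can be computed with a few lines of Python. This is only an illustrative sketch: it does not reproduce GENSCAN's probabilistic model, and the function names gc_content and estimated_runtime_seconds are hypothetical. The run-time estimate simply scales the figure from the text linearly with sequence length, which is an assumption.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


def estimated_runtime_seconds(length_bp, seconds_per_kb=0.8):
    """Rough run-time estimate, assuming time grows linearly with sequence length."""
    return length_bp / 1000.0 * seconds_per_kb


if __name__ == "__main__":
    demo = "GATGGGATTGGGGTTTTCCCCTCCCATGTGCT"
    print(f"G+C fraction: {gc_content(demo):.2f}")
    print(f"estimated time for 50 kb: {estimated_runtime_seconds(50_000):.0f} s")
```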
xi. Modeller
Modeller is software for homology modeling: it produces models of protein secondary
and tertiary structure [32]. It is provided online by the Sali Lab and is a commonly
used and very powerful tool for homology modeling. It was first released in 1989
[32] and is written and maintained by Andrej Sali at the University of California,
San Francisco, USA [33].
Specialty
It predicts the three-dimensional structure of a protein, that is, its secondary and
tertiary (and also quaternary) structure, from an amino acid sequence. It works on
the principle of energy minimization. It relies on the amino acid sequence of the
target, whose structure is to be determined, and on a template of known structure.
It carries out comparative protein structure modeling by satisfaction of spatial
restraints on the atoms [33] and performs other tasks such as loop modeling,
optimization of protein structure with respect to a flexibly defined objective
function, multiple alignment of protein structures, clustering, and comparison of
protein structures. It is available as freeware for PC, Macintosh, and Linux systems
and can be downloaded from the Sali Lab website https://salilab.org/modeller/ ; once
installed, it is ready to use for homology modeling. It builds protein structure
models automatically. Owing to its popularity, several third-party GUIs are also
available, such as EasyModeller, which is likewise freeware that can be installed on
a PC.
Functionality
Modeller requires three inputs: a Python script, a sequence alignment, and a
template file from the PDB. The Python script, which is easy to learn to write,
drives the run. It is accompanied by the sequences of the proteins concerned and
their alignment in an alignment (.ali) file [34]. The last input is the template
structure as a PDB file, which contains the alpha carbons and their coordinates.
Modeller substitutes the side chains of the template with those of the target
sequence and minimizes the overall calculated energy, starting from an initial model
(.ini) built from the template structure. The energy of the protein structure is
minimized by exploring the possible rotamer configurations, and the output is the
lowest-energy, most stable 3D structure. The output files include the .log file (log
output from the run), the .b file (the generated model in PDB format), the .d file
(progress of the optimization), the .ini file (the initial model), the .v file
(violation profile), the .rsr file (restraints in user format), and the .sch file
(schedule file for the optimization process) [35].
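
Of the three inputs, the alignment (.ali) file is the one most often written by hand, so a small sketch of its layout may help. The Python code below writes a two-entry alignment in the PIR-style layout that Modeller is generally understood to read: a ">P1;code" line, a second line of ten colon-separated fields beginning with "structureX" (template) or "sequence" (target), and the aligned residues ending in "*". The field order, the codes "tmpl" and "trgt", the residue numbers, and the sequences are all illustrative assumptions and should be checked against the Modeller manual before use.

```python
def pir_entry(code, aligned_sequence, kind="sequence",
              first="", first_chain="", last="", last_chain="", width=75):
    """Format one entry of a PIR-style alignment file as a string.

    The second line has ten colon-separated fields; the last four (protein name,
    source, resolution, R-factor) are left empty in this sketch.
    """
    fields = [kind, code, str(first), first_chain, str(last), last_chain,
              "", "", "", ""]
    body = aligned_sequence + "*"  # "*" terminates the sequence
    lines = [f">P1;{code}", ":".join(fields)]
    lines.extend(body[i:i + width] for i in range(0, len(body), width))
    return "\n".join(lines)


def build_alignment(template_code, template_aligned, target_code, target_aligned,
                    first=1, last=None, chain="A"):
    """Return the text of a two-entry alignment: template structure, then target."""
    if len(template_aligned) != len(target_aligned):
        raise ValueError("aligned sequences must have the same length (pad with '-')")
    if last is None:
        # Number of template residues, not counting gap characters
        last = first + len(template_aligned.replace("-", "")) - 1
    template = pir_entry(template_code, template_aligned, kind="structureX",
                         first=first, first_chain=chain,
                         last=last, last_chain=chain)
    target = pir_entry(target_code, target_aligned, kind="sequence")
    return template + "\n\n" + target + "\n"


if __name__ == "__main__":
    # Hypothetical codes and toy sequences, for illustration only
    print(build_alignment("tmpl", "ACDEFG-HIK", "trgt", "ACD-FGWHIK"), end="")
```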
xii. I-TASSER
I-TASSER (Iterative Threading ASSEmbly Refinement) is software for predicting the 3D
structure of a protein from its amino acid sequence [36]. It is built on threading,
a technique that searches the PDB for known structures compatible with the given
sequence. The tool is available both as a web server and as an offline download. It
is developed and maintained by the Yang Zhang Lab at the University of Michigan.

Specialty
It is among the most widely used and successful online tools for protein structure
and function prediction. It produces high-quality 3D structures and predicted
functions for the amino acid sequences submitted to it and is used by researchers
and for academic purposes. In CASP7 and CASP8, I-TASSER was ranked the number one
server among the structure prediction servers [37]. It uses Monte Carlo simulations
to construct full-length protein structures by reassembling fragments taken from
threading templates. It identifies the best-matching templates for the amino acid
sequence by searching the PDB and scores them with Z-scores. Prediction follows a
sequence-to-structure-to-function approach. An estimated 374,891 sequences,
submitted by 91,082 users from 136 countries, have been predicted by the I-TASSER
server to date [38]. The tool can be accessed on the official Yang Zhang Lab website
at https://zhanglab.ccmb.med.umich.edu/I-TASSER/ . It is free, and both a web-based
and a downloadable version are available.
Functioning
I-TASSER requires three inputs: a library of folds (publicly available), the amino
acid sequence whose structure is to be predicted, and a scoring function (developed
by the I-TASSER team). The output is a predicted structure. The software threads the
sequence against the folds, assembles the matching folds, and builds a tertiary
structure, all in an automated way.
Input: The input is an amino acid sequence, usually of 10 to 1,500 residues. Paste
the sequence in FASTA format into the space provided on the tool, or upload a FASTA
file. Then set the optional parameters and provide contact information so that the
predicted structure can be sent to your email address later, possibly after a day.
Finally, click “Run I-TASSER” to start the job [34].
Working: From the input, I-TASSER generates 3D atomic models from multiple threading
alignments and iterative structural assembly simulations, which run automatically on
the server. The assembled models are then evaluated, and function is predicted by
matching the predicted 3D models against known protein structures and their
functions.
Output: I-TASSER outputs full-length secondary and tertiary structure predictions
together with functional annotations, including ligand-binding sites. It also
reports an estimate of accuracy in the form of the confidence score of the modeling.
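
Before submitting a job, the input requirements described above can be checked locally. The Python sketch below tests that FASTA text contains a single sequence of 10 to 1,500 residues made up of the twenty standard amino acid letters. The length limits come from the text; the restriction to the twenty standard letters, the optional header line, and the function name check_itasser_input are assumptions made for this illustration and do not reproduce the server's own validation.

```python
MIN_RESIDUES = 10      # lower limit quoted in the text
MAX_RESIDUES = 1500    # upper limit quoted in the text
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumed accepted alphabet


def check_itasser_input(fasta_text):
    """Return a list of problems found in the input; an empty list means it looks fine."""
    lines = [line.strip() for line in fasta_text.splitlines() if line.strip()]
    if not lines:
        return ["input is empty"]

    problems = []
    if lines[0].startswith(">"):
        lines = lines[1:]  # drop the header line; it is treated as optional here
    if any(line.startswith(">") for line in lines):
        problems.append("more than one sequence was provided")

    sequence = "".join(line for line in lines if not line.startswith(">")).upper()
    if not MIN_RESIDUES <= len(sequence) <= MAX_RESIDUES:
        problems.append(
            f"sequence has {len(sequence)} residues; expected "
            f"{MIN_RESIDUES} to {MAX_RESIDUES}"
        )
    unknown = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if unknown:
        problems.append("unexpected characters: " + "".join(unknown))
    return problems


if __name__ == "__main__":
    demo = ">query\nKLVQPMLAAVRDDKKLVQPMRFTS\n"
    print(check_itasser_input(demo) or "input looks acceptable")
```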

Conclusion
The discussion above shows that bioinformatics has a wide range of applications. It works with
large volumes of biological data that feed functional and structural research, and tools and
software are what allow this data to be stored and used. This review has discussed the
specialty and functioning of some important tools and software commonly used in bioinformatics.
As bioinformatics is a growing field, further advances are being made in the use of biological
data, tools and software are continually modified and updated, and their efficiency and
accuracy keep improving over time. Bioinformatics is expected to play a leading role in science
in the coming years. It should become easier to diagnose and treat many diseases that are
currently incurable, with diagnosis and treatment based on genomic and proteomic studies. The
trend toward personalized medicine is likely to grow. Bioinformatics will also support
biotechnology in plant (crop) technology.

References
1. Min Jou W, Haegeman G, Ysebaert M, Fiers W., (1972) “Ncleotide sequence of the gene
coding for the bacuteriophage MS2 coat protein,”
2. Fiers W et al., (1976) “Complete nucleotide-sequence of bacteriophage MS2-RNA
primary and secondary structure of replicase gene”
3. Sanger F, Air GM, Barrell BG, Brown NL, Coulson AR, Fiddes CA, Hutchison CA,
Slocombe PM, Smith M., (1977) “Nucleotide sequence of bacteriophage DNA” ,Nature
4. IHGSC (2004). "Finishing the euchromatic sequence of the human genome.”

5. http://pir.georgetown.edu/ Official website of PIR at Georgetown University.

6. Wu, C.; Nebert, D. W. (2004). "Update on genome completion and annotations: Protein
Information Resource”.
7. Winona C. Barker; John S. Garavelli, Hongzhan Huang, Peter B. McGarvey (1999). “The
Protein Information Resource (PIR)”.

8. http://www.bioinformatics.nl/tools/crab_pir.html : from the website PIR format

description

9. Benson D; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Sayers, E. W.; et al.
(2009). "GenBank". Nucleic Acids Research. 37 (Database): D26–D31.
10.GenBank release notes". NCBI
P a g e | 13

11.Wheeler D.L., Barrett,T., Benson,D.A., Bryant,S.H., Canese,K., Church,D.M.,

DiCuccio,M., Edgar,R., Federhen,S., Helmberg,W. et al. (2005) Database resources of
the National Center for Biotechnology Information. Nucleic Acids Res.
12.Smith M.W., Holmsen,A.L., Wei,Y.H., Peterson,M. and Evans,G.A. (1994) Genomic
sequence sampling: a strategy for high resolution sequence-based physical mapping of
complex genomes. Nature Genet
13.Wheeler D.L., Barrett,T., Benson,D.A., Bryant,S.H., Canese,K., Church,D.M.,
DiCuccio,M., Edgar,R., Federhen,S., Helmberg,W. et al. (2005) Database resources of
the National Center for Biotechnology Information. Nucleic Acids Res
14.Lamesch, P; Berardini, TZ; Li, D; Swarbreck, D; Wilks, C; Sasidharan, R; Muller, R;
Dreher, K; Alexander, DL; Garcia-Hernandez, M; Karthikeyan, AS; Lee, CH; Nelson,
WD; Ploetz, L; Singh, S; Wensel, A; Huala, E (2012). "The Arabidopsis Information
Resource (TAIR): improved gene annotation and new tools
15. "TAIR Google Analytics". Google Analytics. Retrieved 12 May 2015
16.Berardini, TZ; Mundodi, S; Reiser, L; Huala, E; Garcia-Hernandez, M; Zhang, P;
Mueller, LA; Yoon, J; Doyle, A; Lander, G; Moseyko, N; Yoo, D; Xu, I; Zoeckler, B;
Montoya, M; Miller, N; Weems, D; Rhee, SY (2004). "Functional annotation of the
Arabidopsis genome using controlled vocabularies". Plant Physiology
17.Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA,
Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL,
Serova N, Davis S, Soboleva A. (2013) “NCBI GEO: archive for functional genomics
data sets—update”. Nucleic Acids Res.
18. https://www.ncbi.nlm.nih.gov/projects/Sequin/index.html “NCBI official website”
19. “The Gene Bank Submissions Handbook”, 2015
20.Sequence files format conversion with command-line readseq, (2003). Current protocols
in bioinformatics
21.http://iubio.bio.indiana.edu/soft/molbio/readseq/java/Readseq2-help.html ; “Readseq
Extended Help” 1999.
22.The EMBL-EBI bioinformatics web and programmatic tools framework.(2006); “Nucleic
acids research”
P a g e | 14

23. NCBI Resource Coordinators (2012). "Database resources of the National Center for
Biotechnology Information". Nucleic Acids Research
24.Fishel R, Lescoe MK, Rao MR, Copeland NG, Jenkins NA, Garber J, Kane M, Kolodner
R. (1993) “The human mutator gene homolog MSH2 and its association with hereditary
nonpolyposis colon cancer”.
25.Andrade, M.A., N.P. Brown, C. Leroy, S. Hoersch, A. de Daruvar, C .Reich, A.
Franchini, J. Tamames, A. Valencia, C. Ouzounis, and C. Sander. 1999. “Automated
genome sequence analysis and annotation”. Bioinformatics. 15, 391-412
26.Scharf, M. et al. 1994; “GeneQuiz: a workbench for sequence analysis”
27. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local
alignment search tool. J. Mol. Biol., 215, 403–410.
28. Burge, C. and Karlin, S. (1997) Prediction of complete gene structures in human genomic
DNA.
29. Burge, C. B. (1998) Modeling dependencies in pre-mRNA splicing signals. In Salzberg,
S., Searls, D. and Kasif, S., eds. Computational Methods in Molecular Biology, Elsevier
Science, Amsterdam,
30. Burge, C. B. and Karlin, S. (1998) Finding the genes in genomic DNA
31. Burge, C. & Karlin, S. (1997) Gene structure, exon prediction, and alternative splicing
32. Fiser A, Sali A (2003). "Modeller: generation and refinement of homology-based protein
structure models". Meth. Enzymol. 374: 461–91

33. A. Sali & T.L. Blundell. (1993) “Comparative protein modelling by satisfaction of spatial
restraints”.

34. Fiser, A., R. K. Do, and A. Sali. (2000). “Modeling of loops in protein structures.”
Protein Sci..
35. Karlin, S. & Altschul, S. F. (1990). Proc. Natl. Acad. Sci. USA,
36. Roy A, Kucukural A, Zhang Y (2010). "I-TASSER: a unified platform for automated
protein structure and function prediction
37. Battey, JN; et al. (2007). "Automated server predictions in CASP7". Proteins
38. https://zhanglab.ccmb.med.umich.edu/I-TASSER/ (2018, Jan 23). “ official website”
