Overview of Ontologies: Ontology Is Defined As A "Formal Specification of A Conceptualization."
There's an Endless Variety of World Views, and Almost as Many Ways to Organize and Describe Them
The root of the term is the Greek ontos, meaning being, or the nature of things and of existence. Tom Gruber, among others, made the term popular in relation to computer science and artificial intelligence about 15 years ago.
History of ontology
Ontology is the theory of being as such. Aristotle originally called it first philosophy. In the 18th century Christian Wolff contrasted ontology, or general metaphysics, with special metaphysical theories of souls, bodies, or God, claiming that ontology could be a deductive discipline revealing the essences of things. This view was later strongly criticized by David Hume and Immanuel Kant. Ontology was revived in the early 20th century by practitioners of phenomenology and existentialism, notably Edmund Husserl and his student Martin Heidegger. In the English-speaking world, interest in ontology was renewed in the mid-20th century by W.V.O. Quine; by the end of the century it had become a central discipline of analytic philosophy.
Hierarchy in ontology
Relations
Attributes
Name: Ford Explorer
Number-of-doors: 4
Engine: {4.0L, 4.6L}
Transmission: 6-speed
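The attribute list above can be read as a set of attribute/value pairs attached to one individual. The following is a minimal sketch in Python, assuming a hypothetical `Individual` class that is not taken from any ontology library; it only illustrates that an attribute such as Engine may hold a set of values rather than a single one.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """A hypothetical ontology individual: a name plus attribute/value pairs."""
    name: str
    attributes: dict = field(default_factory=dict)

    def get(self, attribute):
        # Return the value of an attribute, or None if it is not recorded.
        return self.attributes.get(attribute)


# The Ford Explorer example from the text. Engine is multi-valued,
# so it is stored as a set of the listed options.
explorer = Individual(
    name="Ford Explorer",
    attributes={
        "Number-of-doors": 4,
        "Engine": {"4.0L", "4.6L"},
        "Transmission": "6-speed",
    },
)
```

Calling `explorer.get("Engine")` returns the set of both engine options, while an attribute that was never recorded yields `None`.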
Ontologies in Biology
The Protein Ontology (PO) provides a unified vocabulary for capturing declarative knowledge about the protein domain and for classifying that knowledge to allow reasoning.
SBO is the Systems Biology Ontology project, another cornerstone of the BioModels.net effort. The goal of SBO is to develop controlled vocabularies and ontologies tailored specifically to the kinds of problems being faced in systems biology, especially in the context of computational modeling.
The main objective of the Plant Ontology Consortium (POC) is to develop, curate and share controlled vocabularies (ontologies) that describe plant structures and growth and developmental stages, providing a semantic framework for meaningful cross-species queries across databases.
The Gene Ontology project, or GO, provides a controlled vocabulary to describe gene and gene product attributes in any organism. It can be broadly split into two parts. The first is the ontology itself, which is actually three ontologies, each representing a key concept in molecular biology: the molecular function of gene products; their role in multi-step biological processes; and their localization to cellular components.
GO example
WHAT GO IS NOT
1. GO is not a way to unify biological databases. Sharing nomenclature is a step toward unification, but is not, in itself, sufficient.
2. GO is not a dictated standard, mandating nomenclature across databases. Groups participate because of self-interest and cooperate to arrive at a consensus.
3. GO does not define homologies between gene products from different organisms. The use of the GO results in shared annotations for gene products from different organisms, and this may reflect an evolutionary relationship, but the shared annotation is in itself not sufficient for such a determination.
4. GO does not allow us to describe genes in terms of which cells or tissues they're expressed in, which developmental stages they're expressed at, or their involvement in disease. It is not necessary for GO to do these things because other ontologies are being developed for these purposes.
OBO is an umbrella organization for structured shared controlled vocabularies and ontologies for use within the genomics and proteomics domains. Of the criteria that ontologies must currently satisfy if they are to be included in the OBO library, the most important for our purposes are: first, inclusion of textual definitions or descriptions designed to ensure that the precise meanings of terms as used within particular ontologies will be clear to a human reader; second, employment of a standard syntax, such as the OWL or OBO flatfile syntax; third, orthogonality to the other ontologies already included in the library. These criteria are designed to support the integration of OBO ontologies, above all by ensuring the compatibility of ontologies pertaining to an identical subject matter.
OBO has now added a fourth criterion to assist in achieving such compatibility, namely that the relations (edges) used to connect terms in OBO ontologies should be applied in ways consistent with their definitions as set forth in this paper. The Relation Ontology offered here is designed to put flesh on this criterion. How, exactly, should part_of or located_in be defined in order to ensure maximally reliable curation of each single ontology while at the same time guaranteeing maximal leverage in building a solid base for life-science knowledge integration in general? We describe a rigorous methodology for providing an answer to this question and illustrate its use in the construction of an easily extendible list of ten relations of a type familiar to those working in the bio-ontological field. This list forms the core of the new OBO Relation Ontology. What is distinctive about our methodology is that, while the relations are each provided with rigorous formal definitions, these definitions can at the same time be formulated in such a way that the underlying technical details remain invisible to ontology authors and curators.
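One practical payoff of defining a relation such as part_of consistently is that software can draw the same inferences from it in every ontology. The sketch below assumes that part_of is treated as transitive, and it uses illustrative term names that are not quoted from any particular OBO ontology; the function name `transitive_closure` is likewise an assumption of this example.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Return every (child, ancestor) pair implied by transitive edges."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        # Depth-first walk upward from each term, collecting all ancestors.
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


# Illustrative part_of assertions (term names are assumptions for this sketch).
part_of = [("nucleolus", "nucleus"), ("nucleus", "cell")]
inferred = transitive_closure(part_of)
```

Here `inferred` contains the two asserted pairs plus the derived pair `("nucleolus", "cell")`. The inference is only sound if every curator applies part_of in the same sense, which is the point of the fourth criterion.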
TAMBIS Ontology
TAMBIS (Transparent Access to Multiple Bioinformatics Information Sources) uses an ontology to enable biologists to ask questions over multiple external databases using a common query interface [1]. This ontology, a conceptual representation of biological concepts and terminology, is known as the TAMBIS Ontology (TaO) [19]. It describes a wide range of bioinformatics tasks and resources, and has a central role within the TAMBIS system. The aim of the TaO is to capture biological and bioinformatics knowledge in a logical conceptual framework that is constrained in such a way that i) only biologically sensible concepts classify correctly, ii) it can encompass different user views, and iii) it makes biological concepts and their relationships computationally accessible.
An interesting difference between the TaO and some of the other ontologies reviewed here is that the TaO does not contain any instances. The TaO only contains knowledge about bioinformatics and molecular biology concepts and their relationships; the instances they represent still reside in the external databases. As concepts represent instances, a concept can act as a question. The concept Receptor Protein represents the instances of proteins with a receptor function, and gathering these instances is answering that question.
The TaO is a dynamic ontology, in that it can grow without the need for either conceptualising or encoding new knowledge. In contrast, the other ontologies described here are static: developers must intervene and encode new conceptualisations to form new concepts. The TaO uses rules within the ontology to govern which concepts can be joined to another concept via relationships to form new concepts. Thus the TaO places great emphasis on relations. A user can form a complex, multisource query, using relationships, in the following manner.
Starting with the concept Protein, the TaO is consulted as to which relationships can be used to join Protein to other concepts. Amongst many, the following two are offered: isHomologousTo Protein and hasAccessionNumber AccessionNumber. Initially, the original Protein is extended to give a new concept, Protein isHomologousTo Protein (the concept 'Protein homologue of Protein'); then the second Protein is extended with hasAccessionNumber AccessionNumber. The resulting concept ('Protein homologue of Protein with Accession Number') describes proteins which are homologous to a protein with a particular accession number. This concept can be used as a source-independent query, containing no information on how to answer such a query. The rest of the TAMBIS system takes this conceptual query and processes it into an executable program against the external sources [20].
The TaO is available in two forms: a small model that concentrates on proteins and a larger-scale model that includes nucleic acids. The small TaO, with 250 concepts and 60 relationships, describes proteins and enzymes, as well as their motifs, secondary and tertiary structure, functions and processes. There is also supporting material on subcellular structure and chemicals, including cofactors. Motifs extend to detail such as the principal modification sites; function and process to broad classifications such as Hormone and Receptor, and Apoptosis and Lactation; structure extends to detail such as gross architecture, for example SevenPropellor. Important relationships include is component of, has name, has function and is homologous to, as well as many more. The larger model, with 1500 concepts, broadens these areas to include concepts pertinent to nucleic acid, its children and genes.
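The query-building steps described above can be sketched in a few lines of Python. This is an illustration only, not the TAMBIS implementation: the `Concept` class, the `ALLOWED` table and the printed notation are assumptions of this sketch, and the table lists only the two relationships named in the text.

```python
# Hypothetical sanctioning rules: (source concept, relationship) -> allowed target.
# Only the two relationships named in the text are listed here.
ALLOWED = {
    ("Protein", "isHomologousTo"): "Protein",
    ("Protein", "hasAccessionNumber"): "AccessionNumber",
}


class Concept:
    """A base concept plus zero or more (relationship, filler concept) pairs."""

    def __init__(self, base, restrictions=()):
        self.base = base
        self.restrictions = tuple(restrictions)

    def extend(self, relationship, filler):
        # Refuse any join that the rules do not sanction.
        if ALLOWED.get((self.base, relationship)) != filler.base:
            raise ValueError(f"{self.base} {relationship} {filler.base} is not allowed")
        return Concept(self.base, self.restrictions + ((relationship, filler),))

    def __str__(self):
        if not self.restrictions:
            return self.base
        # Nested concepts are parenthesised; plain fillers are written bare.
        parts = " ".join(
            f"{rel} ({filler})" if filler.restrictions else f"{rel} {filler}"
            for rel, filler in self.restrictions
        )
        return f"{self.base} which {parts}"


# Build the query from the text, innermost concept first.
with_accession = Concept("Protein").extend("hasAccessionNumber", Concept("AccessionNumber"))
query = Concept("Protein").extend("isHomologousTo", with_accession)
```

The resulting `query` object says nothing about which database would answer it, which mirrors the source-independent nature of the conceptual query. An unsanctioned join, such as extending AccessionNumber with isHomologousTo, raises a `ValueError` in this sketch.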
TAMBIS aims to aid researchers in biological science by providing a single access point for biological information sources around the world. The access point will be a single interface (via the World Wide Web) which acts as a single information source. It will find appropriate sources of information for user queries and phrase the user's questions for each source, returning the results in a consistent manner which will include details of the information source.
Ontologies provide a powerful mechanism for making conceptual information about biology computationally available. Ontologies therefore provide one mechanism by which conceptual information can be attached to the current flood of biological data and thereby help turn data into useful biological knowledge.
The ontology currently contains around 1800 asserted concepts. The concepts covered and the sources with which they are associated are listed below, along with examples of GRAIL constructs in which the concepts are used.
Protein and protein sequence (from SWISS-PROT, [Bairoch et al., 1996]), protein component motifs (from PROSITE, [Bairoch et al., 1997]), protein structure (as classified by CATH, [Orengo et al., 1997]) and enzyme function (as defined in PROSITE and the Enzymes and Metabolic Pathways database, EMP, [Selkov et al., 1996]). We can therefore build concepts such as the tertiary structures of proteins which contain motifs that are involved in hydrolase activity:
    TertiaryStructure which isStructureOf (Protein which hasComponent (Motif which indicatesFunction Hydrolase))
Enzymes and metabolic pathways (as defined in the Enzyme database, [Bairoch, 1996]). This allows the construction of queries regarding enzymes and their reactions, for example enzymes which catalyse reactions which occur in the metabolism of thymine:
    Enzyme which catalyses (Reaction which occursIn (Metabolism which isMetabolismOf Thymine))
Expressed sequence tags (as defined by dbEST, [Boguski et al., 1993]). We can therefore create the concept of ESTs that code for proteins that contain glycosylation sites:
    EST which codesFor (Protein which hasComponent GlycosylationSite)
Nucleic acids, their component motifs, gene function and expression [Stoesser et al., 1997; Stoesser et al., 1998]. The concept given below should be relatively self-explanatory:
    Gene which codesFor (Protein which hasFunction TransmembraneTransport)
Sequence homology (BLAST, [Altschul et al., 1990]). Using ideas of homology we can create concepts linked to specific bioinformatics processes, for example the concept of the set of proteins homologous to a protein with a specific accession number:
    Protein which isHomologousTo (Protein which hasAccessionNumber P12345)
Taxonomy (as defined at the NCBI web site [NCBI]). For example, the taxonomic rank common to both Poecilia reticulata and Amoeba proteus:
    TaxonomicRank which <isRankOf PoeciliaReticulata isRankOf AmoebaProteus>