KEGG

KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances used for bioinformatics research. It contains systems information on pathways and modules, genomic information on complete genomes and genes, chemical information on compounds and reactions, and health information on diseases and drugs. KEGG aims to integrate biological building blocks and wiring diagrams to represent biological systems computationally and aid in interpreting genomic and other 'omics' data.

Uploaded by

mcdonald212
KEGG

Content
Description: Bioinformatics resource for deciphering the genome
Organisms: All
Contact
Research center: Kyoto University
Laboratory: Kanehisa Laboratories (http://kanehisa.jp/)
Primary citation: PMID 10592173 (https://pubmed.ncbi.nlm.nih.gov/10592173)
Release date: 1995
Access
Website: www.kegg.jp (http://www.kegg.jp/)
Web service URL: REST, see KEGG API (http://www.kegg.jp/kegg/rest/)
Tools
Web: KEGG Mapper (http://www.kegg.jp/kegg/mapper.html)

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.

The KEGG database project was initiated in 1995 by Minoru Kanehisa, professor at the Institute for Chemical Research, Kyoto University, under the then ongoing Japanese Human Genome Program.[1][2] Foreseeing the need for a computerized resource that can be used for biological interpretation of genome sequence data, he started developing the KEGG PATHWAY database. It is a collection of manually drawn KEGG pathway maps representing experimental knowledge on metabolism and various other functions of the cell and the organism. Each pathway map contains a network of molecular interactions and reactions and is designed to link genes in the genome to gene products (mostly proteins) in the pathway. This has enabled the analysis called KEGG pathway mapping, whereby the gene content in the genome is compared with the KEGG PATHWAY database to examine which pathways and associated functions are likely to be encoded in the genome.

According to the developers, KEGG is a "computer representation" of the biological system.[3] It integrates building blocks and wiring diagrams of the system: more specifically, genetic building blocks of genes and proteins, chemical building blocks of small molecules and reactions, and wiring diagrams of molecular interaction and reaction networks. This concept is realized in the following databases of KEGG, which are categorized into systems, genomic, chemical, and health information.[4]

Systems information
PATHWAY: pathway maps for cellular and organismal functions
MODULE: modules or functional units of genes
BRITE: hierarchical classifications of biological entities
Genomic information
GENOME: complete genomes
GENES: genes and proteins in the complete genomes
ORTHOLOGY: ortholog groups of genes in the complete genomes
Chemical information
COMPOUND, GLYCAN: chemical compounds and glycans
REACTION, RPAIR, RCLASS: chemical reactions
ENZYME: enzyme nomenclature
Health information
DISEASE: human diseases
DRUG: approved drugs
ENVIRON: crude drugs and health-related substances
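
The databases listed above can be queried programmatically through the REST-style KEGG API noted in the infobox. The sketch below is a minimal illustration, assuming the service is reachable at https://rest.kegg.jp and that its `list` operation returns tab-delimited text. The base URL, the helper names, and the sample response are assumptions made for illustration and are not taken from this article.

```python
from urllib.parse import quote

# Assumed base URL of the KEGG REST service (not stated in this article).
KEGG_REST_BASE = "https://rest.kegg.jp"


def build_url(operation, *args):
    """Join an operation name and its arguments into a REST-style URL."""
    parts = [quote(str(a), safe=":+") for a in args]
    return "/".join([KEGG_REST_BASE, operation, *parts])


def parse_tab_table(text):
    """Parse tab-delimited 'identifier<TAB>description' lines into a dict."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        table[key] = value
    return table


# Illustrative response text; real output may differ in content and layout.
sample = (
    "map00010\tGlycolysis / Gluconeogenesis\n"
    "map00020\tCitrate cycle (TCA cycle)\n"
)

url = build_url("list", "pathway")
pathways = parse_tab_table(sample)
```

In practice the text passed to `parse_tab_table` would come from an HTTP request to `url` (for example with `urllib.request.urlopen`); the request is left out here so the sketch runs without network access.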

Databases

Systems information

The KEGG PATHWAY database, the wiring diagram database, is the core of the KEGG resource. It is a
collection of pathway maps integrating many entities including genes, proteins, RNAs, chemical
compounds, glycans, and chemical reactions, as well as disease genes and drug targets, which are stored as
individual entries in the other databases of KEGG. The pathway maps are classified into the following
sections:

Metabolism
Genetic information processing (transcription, translation, replication and repair, etc.)
Environmental information processing (membrane transport, signal transduction, etc.)
Cellular processes (cell growth, cell death, cell membrane functions, etc.)
Organismal systems (immune system, endocrine system, nervous system, etc.)
Human diseases
Drug development

The metabolism section contains aesthetically drawn global maps showing an overall picture of
metabolism, in addition to regular metabolic pathway maps. The low-resolution global maps can be used,
for example, to compare metabolic capacities of different organisms in genomics studies and different
environmental samples in metagenomics studies. In contrast, KEGG modules in the KEGG MODULE
database are higher-resolution, localized wiring diagrams, representing tighter functional units within a
pathway map, such as subpathways conserved among specific organism groups and molecular complexes.
KEGG modules are defined as characteristic gene sets that can be linked to specific metabolic capacities
and other phenotypic features, so that they can be used for automatic interpretation of genome and
metagenome data.
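
To illustrate how a module defined as a gene set can support automatic interpretation, the following sketch checks whether the genes annotated in a genome cover every step of a module. It assumes a simplified representation in which a module is an ordered list of steps and each step lists alternative gene-family identifiers. The module content and identifiers are hypothetical, and actual KEGG module definitions are richer than this.

```python
def module_completeness(steps, genome_ids):
    """Return (fraction of steps covered, indices of missing steps).

    steps: list of sets; a step counts as covered if at least one of its
           alternative identifiers is present in genome_ids.
    """
    genome_ids = set(genome_ids)
    missing = [i for i, alternatives in enumerate(steps)
               if not (alternatives & genome_ids)]
    covered = len(steps) - len(missing)
    fraction = covered / len(steps) if steps else 0.0
    return fraction, missing


# Hypothetical three-step module; identifiers are placeholders.
toy_module = [{"G1", "G2"}, {"G3"}, {"G4", "G5"}]
genome = {"G2", "G4", "G9"}

fraction, missing = module_completeness(toy_module, genome)
```

A module with no missing steps would be reported as complete for that genome or metagenome sample; the list of missing steps indicates where the functional unit is broken.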

Another database that supplements KEGG PATHWAY is the KEGG BRITE database. It is an ontology
database containing hierarchical classifications of various entities including genes, proteins, organisms,
diseases, drugs, and chemical compounds. While KEGG PATHWAY is limited to molecular interactions
and reactions of these entities, KEGG BRITE incorporates many different types of relationships.
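
A hierarchical classification of this kind can be held as a simple tree. The sketch below assumes a made-up text layout in which a leading letter (A, B, C) marks the depth of each entry. Both the layout and the entries are illustrative assumptions rather than the actual BRITE file format, and the parser assumes no level is skipped.

```python
def parse_hierarchy(text):
    """Return the full path (tuple of names) for every entry in the text."""
    stack = []   # names of the current ancestors, one per depth
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        depth = ord(line[0]) - ord("A")   # 'A' -> 0, 'B' -> 1, ...
        name = line[1:].strip()
        del stack[depth:]                 # drop entries at this depth or deeper
        stack.append(name)
        paths.append(tuple(stack))
    return paths


# Toy classification; the entries are placeholders.
toy = """
A Enzymes
B Oxidoreductases
C Hypothetical enzyme X
B Transferases
C Hypothetical enzyme Y
"""

paths = parse_hierarchy(toy)
leaves_under_transferases = [p[-1] for p in paths
                             if len(p) == 3 and p[1] == "Transferases"]
```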

Genomic information
Several months after the KEGG project was initiated in 1995, the first report of a completely sequenced
bacterial genome was published.[5] Since then, all published complete genomes have been accumulated in KEGG
for both eukaryotes and prokaryotes. The KEGG GENES database contains gene/protein-level information
and the KEGG GENOME database contains organism-level information for these genomes. The KEGG
GENES database consists of gene sets for the complete genomes, and genes in each set are given
annotations in the form of establishing correspondences to the wiring diagrams of KEGG pathway maps,
KEGG modules, and BRITE hierarchies.

These correspondences are made using the concept of orthologs. The KEGG pathway maps are drawn
based on experimental evidence in specific organisms but they are designed to be applicable to other
organisms as well, because different organisms, such as human and mouse, often share identical pathways
consisting of functionally identical genes, called orthologous genes or orthologs. All the genes in the
KEGG GENES database are being grouped into such orthologs in the KEGG ORTHOLOGY (KO)
database. Because the nodes (gene products) of KEGG pathway maps, as well as KEGG modules and
BRITE hierarchies, are given KO identifiers, the correspondences are established once genes in the
genome are annotated with KO identifiers by the genome annotation procedure in KEGG.[4]
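
The KO-based correspondence described above can be sketched as a set comparison: genes are first mapped to KO identifiers, and the resulting KO set is intersected with the KO identifiers attached to the nodes of each pathway map. The gene names, KO assignments, and pathway memberships below are toy data in KEGG-like formats, chosen for illustration and not taken from KEGG.

```python
def map_to_pathways(gene_to_ko, pathway_to_kos):
    """Return {pathway: sorted list of genome KOs found on that pathway map}."""
    genome_kos = set(gene_to_ko.values())
    hits = {}
    for pathway, kos in pathway_to_kos.items():
        shared = genome_kos & set(kos)
        if shared:
            hits[pathway] = sorted(shared)
    return hits


# Toy annotation: gene identifier -> KO identifier (placeholders).
gene_to_ko = {"geneA": "K90001", "geneB": "K90002", "geneC": "K90009"}

# Toy pathway definitions: pathway identifier -> KO identifiers on its nodes.
pathway_to_kos = {
    "toy_pathway_1": {"K90001", "K90002", "K90003"},
    "toy_pathway_2": {"K90004"},
}

hits = map_to_pathways(gene_to_ko, pathway_to_kos)
```

Because the comparison is made at the level of KO identifiers rather than organism-specific gene names, the same pathway definitions can be reused for any annotated genome.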

Chemical information

The KEGG metabolic pathway maps are drawn to represent the dual aspects of the metabolic network: the
genomic network of how genome-encoded enzymes are connected to catalyze consecutive reactions and
the chemical network of how chemical structures of substrates and products are transformed by these
reactions.[6] A set of enzyme genes in the genome will identify enzyme relation networks when
superimposed on the KEGG pathway maps, which in turn characterize chemical structure transformation
networks, allowing interpretation of the biosynthetic and biodegradation potential of the organism.
Alternatively, a set of metabolites identified in the metabolome will lead to the understanding of enzymatic
pathways and enzyme genes involved.
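
The two directions described above, from enzyme genes to chemical transformations and from metabolites back to enzymes, can be sketched with a small reaction table. The reactions, enzyme names, and compounds below are hypothetical placeholders, and the table layout is an assumption made for illustration.

```python
from collections import namedtuple

Reaction = namedtuple("Reaction", "enzyme substrate product")

# Hypothetical reaction table linking enzymes to compound transformations.
REACTIONS = [
    Reaction("enzyme1", "cpdA", "cpdB"),
    Reaction("enzyme2", "cpdB", "cpdC"),
    Reaction("enzyme3", "cpdC", "cpdD"),
]


def reachable_compounds(start, enzymes):
    """Compounds that can be produced from `start` using only `enzymes`."""
    enzymes = set(enzymes)
    reached = {start}
    frontier = [start]
    while frontier:
        compound = frontier.pop()
        for r in REACTIONS:
            usable = r.enzyme in enzymes and r.substrate == compound
            if usable and r.product not in reached:
                reached.add(r.product)
                frontier.append(r.product)
    return reached


def enzymes_for(compounds):
    """Enzymes whose reactions consume or produce any observed compound."""
    compounds = set(compounds)
    return {r.enzyme for r in REACTIONS
            if r.substrate in compounds or r.product in compounds}


# Genome view: which compounds can the encoded enzymes reach from cpdA?
from_genome = reachable_compounds("cpdA", {"enzyme1", "enzyme2"})
# Metabolome view: which enzymes are implicated by an observed compound?
from_metabolome = enzymes_for({"cpdD"})
```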

The databases in the chemical information category, which are collectively called KEGG LIGAND, are
organized by capturing knowledge of the chemical network. At the beginning of the KEGG project, KEGG
LIGAND consisted of three databases: KEGG COMPOUND for chemical compounds, KEGG
REACTION for chemical reactions, and KEGG ENZYME for reactions in the enzyme nomenclature.[7]
Currently, there are additional databases: KEGG GLYCAN for glycans[8] and two auxiliary reaction
databases called RPAIR (reactant pair alignments) and RCLASS (reaction class).[9] KEGG COMPOUND
has also been expanded to contain various compounds such as xenobiotics, in addition to metabolites.

Health information

In KEGG, diseases are viewed as perturbed states of the biological system caused by perturbants of genetic
factors and environmental factors, and drugs are viewed as different types of perturbants.[10] The KEGG
PATHWAY database includes not only the normal states but also the perturbed states of the biological
systems. However, disease pathway maps cannot be drawn for most diseases because molecular
mechanisms are not well understood. An alternative approach is taken in the KEGG DISEASE database,
which simply catalogs known genetic factors and environmental factors of diseases. These catalogs may
eventually lead to more complete wiring diagrams of diseases.

The KEGG DRUG database contains active ingredients of drugs approved in Japan, the US, and Europe.
They are distinguished by chemical structures and/or chemical components and associated with target
molecules, metabolizing enzymes, and other molecular interaction network information in the KEGG
pathway maps and the BRITE hierarchies. This enables an integrated analysis of drug interactions with
genomic information. Crude drugs and other health-related substances, which are outside the category of
approved drugs, are stored in the KEGG ENVIRON database. The databases in the health information
category are collectively called KEGG MEDICUS, which also includes package inserts of all marketed
drugs in Japan.

Subscription model
In July 2011, KEGG introduced a subscription model for FTP download because of a significant cutback in
government funding. KEGG continues to be freely available through its website, but the subscription
model has prompted discussion about the sustainability of bioinformatics databases.[11][12]

See also
Comparative Toxicogenomics Database (CTD), which integrates KEGG pathways with toxicogenomic and disease data
ConsensusPathDB, a molecular functional interaction database integrating information from KEGG
Gene ontology
PubMed
UniProt
Gene Disease Database

References
1. Kanehisa M, Goto S (2000). "KEGG: Kyoto Encyclopedia of Genes and Genomes" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102409). Nucleic Acids Res. 28 (1): 27–30. doi:10.1093/nar/28.1.27 (https://doi.org/10.1093%2Fnar%2F28.1.27). PMC 102409 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102409). PMID 10592173 (https://pubmed.ncbi.nlm.nih.gov/10592173).
2. Kanehisa M (1997). "A database for post-genome analysis". Trends Genet. 13 (9): 375–6. doi:10.1016/S0168-9525(97)01223-7 (https://doi.org/10.1016%2FS0168-9525%2897%2901223-7). PMID 9287494 (https://pubmed.ncbi.nlm.nih.gov/9287494).
3. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M (2006). "From genomics to chemical genomics: new developments in KEGG" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347464). Nucleic Acids Res. 34 (Database issue): D354–7. doi:10.1093/nar/gkj102 (https://doi.org/10.1093%2Fnar%2Fgkj102). PMC 1347464 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347464). PMID 16381885 (https://pubmed.ncbi.nlm.nih.gov/16381885).
4. Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M (2014). "Data, information, knowledge and principle: back to metabolism in KEGG" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965122). Nucleic Acids Res. 42 (Database issue): D199–205. doi:10.1093/nar/gkt1076 (https://doi.org/10.1093%2Fnar%2Fgkt1076). PMC 3965122 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3965122). PMID 24214961 (https://pubmed.ncbi.nlm.nih.gov/24214961).
5. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, et al. (1995). "Whole-genome random sequencing and assembly of Haemophilus influenzae Rd" (https://semanticscholar.org/paper/bedebd5292024baef3d53b4bac79a4f9d99c7687). Science. 269 (5223): 496–512. Bibcode:1995Sci...269..496F (https://ui.adsabs.harvard.edu/abs/1995Sci...269..496F). doi:10.1126/science.7542800 (https://doi.org/10.1126%2Fscience.7542800). PMID 7542800 (https://pubmed.ncbi.nlm.nih.gov/7542800). S2CID 10423613 (https://api.semanticscholar.org/CorpusID:10423613).
6. Kanehisa M (2013). "Chemical and genomic evolution of enzyme-catalyzed reaction networks". FEBS Lett. 587 (17): 2731–7. doi:10.1016/j.febslet.2013.06.026 (https://doi.org/10.1016%2Fj.febslet.2013.06.026). hdl:2433/178762 (https://hdl.handle.net/2433%2F178762). PMID 23816707 (https://pubmed.ncbi.nlm.nih.gov/23816707). S2CID 40074657 (https://api.semanticscholar.org/CorpusID:40074657).
7. Goto S, Nishioka T, Kanehisa M (1999). "LIGAND database for enzymes, compounds and reactions" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC148189). Nucleic Acids Res. 27 (1): 377–9. doi:10.1093/nar/27.1.377 (https://doi.org/10.1093%2Fnar%2F27.1.377). PMC 148189 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC148189). PMID 9847234 (https://pubmed.ncbi.nlm.nih.gov/9847234).
8. Hashimoto K, Goto S, Kawano S, Aoki-Kinoshita KF, Ueda N, Hamajima M, Kawasaki T, Kanehisa M (2006). "KEGG as a glycome informatics resource" (https://doi.org/10.1093%2Fglycob%2Fcwj010). Glycobiology. 16 (5): 63R–70R. doi:10.1093/glycob/cwj010 (https://doi.org/10.1093%2Fglycob%2Fcwj010). PMID 16014746 (https://pubmed.ncbi.nlm.nih.gov/16014746).
9. Muto A, Kotera M, Tokimatsu T, Nakagawa Z, Goto S, Kanehisa M (2013). "Modular architecture of metabolic pathways revealed by conserved sequences of reactions" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3632090). J Chem Inf Model. 53 (3): 613–22. doi:10.1021/ci3005379 (https://doi.org/10.1021%2Fci3005379). PMC 3632090 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3632090). PMID 23384306 (https://pubmed.ncbi.nlm.nih.gov/23384306).
10. Kanehisa M, Goto S, Furumichi M, Tanabe M, Hirakawa M (2010). "KEGG for representation and analysis of molecular networks involving diseases and drugs" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808910). Nucleic Acids Res. 38 (Database issue): D355–60. doi:10.1093/nar/gkp896 (https://doi.org/10.1093%2Fnar%2Fgkp896). PMC 2808910 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808910). PMID 19880382 (https://pubmed.ncbi.nlm.nih.gov/19880382).
11. Galperin MY, Fernández-Suárez XM (2012). "The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245068). Nucleic Acids Res. 40 (Database issue): D1–8. doi:10.1093/nar/gkr1196 (https://doi.org/10.1093%2Fnar%2Fgkr1196). PMC 3245068 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245068). PMID 22144685 (https://pubmed.ncbi.nlm.nih.gov/22144685).
12. Hayden, EC (2013). "Popular plant database set to charge users" (http://www.nature.com/news/popular-plant-database-set-to-charge-users-1.13642). Nature. doi:10.1038/nature.2013.13642 (https://doi.org/10.1038%2Fnature.2013.13642). S2CID 211729309 (https://api.semanticscholar.org/CorpusID:211729309).

External links
KEGG website (http://www.kegg.jp/)
GenomeNet mirror site (http://www.genome.jp/kegg/)
The entry for KEGG (https://web.archive.org/web/20120402032246/http://metadatabase.org/wiki/KEGG) in MetaBase

Retrieved from "https://en.wikipedia.org/w/index.php?title=KEGG&oldid=1124102180"
