Biodiversity Informatics
Overview
Biodiversity informatics (distinct from, but linked to, bioinformatics) is the application of information technology
methods to the problems of organizing, accessing, visualizing and analyzing primary biodiversity data.
Primary biodiversity data comprise names, observations and records of specimens, and genetic and
morphological data associated with a specimen. Biodiversity informatics may also have to cope with
managing information from unnamed taxa, such as that produced by environmental sampling and
sequencing of mixed-field samples. The term biodiversity informatics is also used to cover the
computational problems specific to the names of biological entities. These include the development of algorithms
to cope with variant representations of identifiers such as species names and authorities; the multiple
classification schemes within which these entities may reside according to the preferences of different
workers in the field; and the syntax and semantics by which the content of taxonomic databases can
be made machine-queryable and interoperable.
One major goal for biodiversity informatics is the creation of a complete master list of currently recognised
species of the world. This goal has been achieved to a large extent by the Catalogue of Life project, which
lists more than 2 million species in its 2022 Annual Checklist.[8] A similar effort for fossil taxa, the Paleobiology
Database,[9] documents more than 100,000 names for fossil species, out of an unknown total number.
Application of the Linnaean system of binomial nomenclature for species, and uninomials for genera and
higher ranks, has led to many advantages but also problems with homonyms (the same name being used for
multiple taxa, either inadvertently or legitimately across multiple kingdoms), synonyms (multiple names for
the same taxon), as well as variant representations of the same name due to orthographic differences, minor
spelling errors, variation in the manner of citation of author names and dates, and more. In addition, names
can change through time on account of changing taxonomic opinions (for example, the correct generic
placement of a species, or the elevation of a subspecies to species rank or vice versa), and also the
circumscription of a taxon can change according to different authors' taxonomic concepts. One proposed
solution to this problem is the use of Life Science Identifiers (LSIDs) for machine-to-machine
communication, although this approach has both proponents and opponents.
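The problem of variant representations of the same name can be illustrated with a minimal Python sketch. It is not the method of any particular project: the function names `canonical_name` and `likely_same_name`, the parsing rules and the 0.85 similarity threshold are all assumptions chosen for illustration. The sketch only addresses orthographic variation and differing author citations; it cannot detect homonyms or synonyms, which require taxonomic knowledge rather than string comparison.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def canonical_name(raw: str) -> str:
    """Reduce a name string to a lower-case 'genus epithet' form.

    Simplifying assumptions (this is not a full name parser): anything in
    parentheses is dropped, and the name ends at the first token that is
    not written entirely in lower-case letters (author, year, "var.", ...).
    """
    # Strip diacritics so accented and unaccented spellings compare equal.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Drop parenthesised parts such as "(Linnaeus, 1758)".
    text = re.sub(r"\([^)]*\)", " ", text)
    tokens = text.split()
    if not tokens:
        return ""
    parts = [tokens[0].lower()]
    for token in tokens[1:]:
        if not re.fullmatch(r"[a-z-]+", token):
            break  # first non-epithet token: treat the rest as authorship
        parts.append(token)
    return " ".join(parts)


def likely_same_name(a: str, b: str, threshold: float = 0.85) -> bool:
    """Guess whether two strings are variants of one name (hypothetical threshold)."""
    ca, cb = canonical_name(a), canonical_name(b)
    if not ca or not cb:
        return False
    return SequenceMatcher(None, ca, cb).ratio() >= threshold


# Different author citations, and a minor spelling error, for the same name.
print(likely_same_name("Homo sapiens Linnaeus, 1758", "Homo sapiens L."))
print(likely_same_name("Homo sapiens", "Homo sapeins"))
print(likely_same_name("Homo sapiens", "Pan troglodytes"))
```

A real system would also need to handle subgenera, infraspecific ranks and author names that begin with a lower-case particle, all of which this sketch treats incorrectly.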
Organisms can be classified in a multitude of ways (see main page Biological classification), which can
create design problems for biodiversity informatics systems that aim either to incorporate a single
classification or several to suit the needs of users, or to guide users towards a single "preferred" system.
Whether a single consensus classification system can ever be achieved remains an open question;
however, the Catalogue of Life commissioned activity in this area,[10] which has been succeeded by a
published system proposed in 2015 by M. Ruggiero and co-workers.[11]
Biodiversity Maps
Biodiversity maps provide a cartographic representation of spatial biodiversity data.[12] These data can be
used in conjunction with species checklists to support biodiversity conservation efforts. Biodiversity
maps can help reveal patterns of species distribution and range changes, which may reflect biodiversity loss,
habitat degradation, or changes in species composition. Combined with urban development data, maps can
inform land management by modeling scenarios that might affect biodiversity.
Biodiversity maps can be produced in a variety of ways. Traditionally, range maps were hand-drawn based
on literature reports, but increasingly large-scale data are used, e.g. from citizen science projects (such as iNaturalist) and
digitized museum collections (such as VertNet). GIS tools such as ArcGIS or R packages such as
dismo can specifically aid in species distribution modeling (ecological niche modeling) and even
predict impacts of ecological change on biodiversity.[13] GBIF, OBIS, and IUCN are
large web-based repositories of species spatiotemporal data that are the source of many existing
biodiversity maps.
Biodiversity Maps
Biodiversity Maps (National Biodiversity Data Centre): An overview of the state of knowledge on the distribution of Ireland's biodiversity. https://maps.biodiversityireland.ie/
Saving Nature: Biodiversity Maps that depict patterns to guide conservation efforts. https://savingnature.com/our-biodiversity-maps/
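One simple way to turn occurrence records into a map layer is to count distinct species per grid cell. The Python sketch below shows that idea under stated assumptions: the function name `richness_grid`, the record layout `(species, latitude, longitude)` and the one-degree cell size are illustrative choices, and the sample records are invented placeholders rather than data from GBIF, OBIS or any other repository. It is a plain tally, not species distribution modeling.

```python
import math
from collections import defaultdict


def richness_grid(records, cell_deg=1.0):
    """Count distinct species per latitude/longitude grid cell.

    records: iterable of (species, latitude, longitude) tuples.
    Returns a dict mapping (row, column) cell indices to species counts.
    """
    cells = defaultdict(set)
    for species, lat, lon in records:
        # Skip records with impossible coordinates, a common data-quality issue.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].add(species)
    return {key: len(species_set) for key, species_set in cells.items()}


# Invented placeholder records, for illustration only.
sample = [
    ("species A", 53.3, -6.2),
    ("species B", 53.7, -6.9),
    ("species A", 53.9, -6.1),   # repeat sighting, counted once per cell
    ("species C", 51.9, -8.5),
    ("species X", 123.0, 0.0),   # invalid latitude, ignored
]
print(richness_grid(sample))
```

The resulting counts could then be passed to a GIS or plotting tool for display. Equal-angle cells shrink in area towards the poles, so an equal-area grid would be preferable for serious analysis.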
As a secondary source of biodiversity data, relevant scientific literature can be parsed either by humans or
(potentially) by specialized information retrieval algorithms to extract the primary biodiversity
information reported therein, sometimes in aggregated or summary form but frequently as primary
observations in narrative or tabular form. Elements of such activity (such as extracting key taxonomic
identifiers, keywords and index terms) have been practiced for many years at a higher level by selected
academic databases and search engines. However, for maximum biodiversity informatics value, the
actual primary occurrence data should ideally be retrieved and then made available in a standardized form
or forms. For example, both the Plazi and INOTAXA projects are transforming taxonomic literature into
XML formats that can then be read by client applications, the former using TaxonX-XML[15] and the latter
using the taXMLit format. The Biodiversity Heritage Library is also making significant progress in its aim
to digitize substantial portions of the out-of-copyright taxonomic literature, which is then subjected to
optical character recognition (OCR) so as to be amenable to further processing using biodiversity
informatics tools.
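The idea of extracting taxonomic identifiers from digitized text and exposing them in a machine-readable form can be sketched in a few lines of Python. This is a deliberately naive illustration, not the approach of Plazi, INOTAXA or the Biodiversity Heritage Library: the regular expression, the function names `candidate_names` and `to_xml`, the `known_genera` filter and the XML element names are all assumptions, and the output does not follow TaxonX or taXMLit.

```python
import re
import xml.etree.ElementTree as ET

# Capitalised word followed by a lower-case word: a crude binomial pattern.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{3,})\b")


def candidate_names(text, known_genera):
    """Return binomials found in text whose genus is in known_genera."""
    found = []
    for genus, epithet in BINOMIAL.findall(text):
        # The pattern alone also matches ordinary phrases such as
        # "The canopy", so candidates are checked against a genus list.
        if genus not in known_genera:
            continue
        name = f"{genus} {epithet}"
        if name not in found:
            found.append(name)
    return found


def to_xml(names):
    """Serialise names into a small ad hoc XML document (hypothetical schema)."""
    root = ET.Element("names")
    for name in names:
        genus, epithet = name.split(" ", 1)
        element = ET.SubElement(root, "name", genus=genus, epithet=epithet)
        element.text = name
    return ET.tostring(root, encoding="unicode")


ocr_text = (
    "The canopy is dominated by Quercus robur and Fagus sylvatica. "
    "Quercus robur was also recorded nearby."
)
names = candidate_names(ocr_text, known_genera={"Quercus", "Fagus"})
print(names)
print(to_xml(names))
```

In practice the genus list would come from a checklist or name index, and OCR errors would call for approximate rather than exact matching.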
Current activities
At the 2009 e-Biosphere conference in the U.K.,[20] a set of themes was adopted that indicates
the broad range of current biodiversity informatics activities and how they might be categorized.
A post-conference workshop of key persons with current significant Biodiversity Informatics roles also
resulted in a Workshop Resolution that stressed, among other aspects, the need to create durable, global
registries for the resources that are basic to biodiversity informatics (e.g., repositories, collections); complete
the construction of a solid taxonomic infrastructure; and create ontologies for biodiversity data.[21]
Example projects
Global:
The Global Biodiversity Information Facility (GBIF), and the Ocean Biogeographic
Information System (OBIS) (for marine species)
The Species 2000, ITIS (Integrated Taxonomic Information System), and Catalogue of Life
projects
Global Names
EOL, The Encyclopedia of Life project
The Consortium for the Barcode of Life project
The Map of Life project
The Reptile Database project
The AmphibiaWeb project
The uBio Universal Biological Indexer and Organizer, from the Woods Hole Marine
Biological Laboratory
The Index to Organism Names (ION) from Clarivate Analytics, providing access to scientific
names of taxa from numerous journals as indexed in the Zoological Record
The Interim Register of Marine and Nonmarine Genera (IRMNG)
ZooBank, the registry for nomenclatural acts and relevant systematic literature in zoology
The Index Nominum Genericorum, compilation of generic names published for organisms
covered by the International Code of Botanical Nomenclature, maintained at the
Smithsonian Institution in the U.S.A.
The International Plant Names Index
MycoBank, documenting new names and combinations for fungi
The List of Prokaryotic names with Standing in Nomenclature (LPSN) - Official register of
valid names for bacteria and archaea, as governed by the International Code of
Nomenclature of Bacteria
The Biodiversity Heritage Library project - digitising biodiversity literature
Wikispecies, open source (community-editable) compilation of taxonomic information,
companion project to Wikipedia
TaxonConcept.org, a Linked Data project that connects disparate species databases
Instituto de Ciencias Naturales. Universidad Nacional de Colombia. Virtual Collections and
Biodiversity Informatics Unit
ANTABIF. The Antarctic Biodiversity Information Facility gives free and open access to
Antarctic Biodiversity data, in the spirit of the Antarctic Treaty.
Genesys, database of plant genetic resources maintained in national, regional and
international gene banks
VertNet, providing access to vertebrate primary occurrence data from data sets worldwide
Regional / national:
Fauna Europaea
Atlas of Living Australia
Pan-European Species directories Infrastructure (PESI)
Symbiota
iDigBio, Integrated Digitized Biocollections (USA)
i4Life project
Sistema de Información sobre Biodiversidad de Colombia
India Biodiversity Portal (IBP)
Bhutan Biodiversity Portal (BBP)
Weed Identification and Knowledge in the Western Indian Ocean (WIKWIO)
LifeWatch is proposed by ESFRI as a pan-European research (e-)infrastructure to support
Biodiversity research and policy-making.
A listing of over 600 current biodiversity informatics related activities can be found at the TDWG
"Biodiversity Information Projects of the World" database.[22]
See also
Web-based taxonomy
List of biodiversity databases
References
1. Soberón, J., & Peterson, A. T. (2004). Biodiversity informatics: Managing and applying
primary biodiversity data. Philosophical Transactions of the Royal Society B: Biological
Sciences, 359(1444), 689–698.
2. Krishtalka L, Humphrey PS (2000). "Can Natural History Museums Capture the Future?"
(https://doi.org/10.1641%2F0006-3568%282000%29050%5B0611%3ACNHMCT%5D2.0.CO%3B2).
BioScience. 50 (7): 611–617. doi:10.1641/0006-3568(2000)050[0611:CNHMCT]2.0.CO;2
(https://doi.org/10.1641%2F0006-3568%282000%29050%5B0611%3ACNHMCT%5D2.0.CO%3B2).
3. Peterson AT, Vieglais D (2001). "Predicting Species Invasions Using Ecological Niche
Modeling: New Approaches from Bioinformatics Attack a Pressing Problem"
(http://www.cria.org.br/eventos/mfmpe/19_20jun2002_docs/BioScience%202001.pdf) (PDF).
BioScience. 51 (5): 363–371. doi:10.1641/0006-3568(2001)051[0363:PSIUEN]2.0.CO;2
(https://doi.org/10.1641%2F0006-3568%282001%29051%5B0363%3APSIUEN%5D2.0.CO%3B2).
4. "Bioinformatics for Biodiversity?"
(http://www.sciencemag.org/content/vol289/issue5488/index.dtl). Science. 289: 2229–2440. 2000.
5. "Biodiversity Informatics"
(https://web.archive.org/web/20100127183756/http://www.biomedcentral.com/1471-2105/10?issue=S14).
BMC Bioinformatics. 10 Suppl 14. 2009. Archived from the original
(http://www.biomedcentral.com/1471-2105/10?issue=S14) on 2010-01-27. Retrieved 2009-11-15.
6. "'Biodiversity Informatics', The Term" (http://www.bgbm.org/BioDivInf/TheTerm.htm).
Retrieved 2009-08-06.
7. Bisby FA; et al. (2000). "The Quiet Revolution: Biodiversity Informatics and the Internet".
Science. 289 (5488): 2309–2312. Bibcode:2000Sci...289.2309B
(https://ui.adsabs.harvard.edu/abs/2000Sci...289.2309B). doi:10.1126/science.289.5488.2309
(https://doi.org/10.1126%2Fscience.289.5488.2309). PMID 11009408
(https://pubmed.ncbi.nlm.nih.gov/11009408). S2CID 31852825
(https://api.semanticscholar.org/CorpusID:31852825).
8. "Catalogue of Life - 2016 Annual Checklist : The 2016 Annual Checklist"
(http://www.catalogueoflife.org/annual-checklist/2016/info/ac). www.catalogueoflife.org.
Retrieved 2021-09-08.
9. "the Paleobiology Database" (http://paleodb.org/). Retrieved 2009-08-06.
10. "Towards a management hierarchy (classification) for the Catalogue of Life. Draft Discussion
Document by Dr. Dennis P. Gordon, May 2009"
(https://web.archive.org/web/20090808201530/http://www.catalogueoflife.org/info_hierarchy.php).
Archived from the original (http://www.catalogueoflife.org/info_hierarchy.php) on 2009-08-08.
Retrieved 2009-08-06.
11. Ruggiero, M.A.; Gordon, D.P.; Orrell, T.M.; Bailly, N.; Bourgoin, T.; Brusca, R.C.; et al. (2015).
"A higher level classification of all living organisms"
(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4418965). PLOS ONE. 10 (4): e0119248.
Bibcode:2015PLoSO..1019248R (https://ui.adsabs.harvard.edu/abs/2015PLoSO..1019248R).
doi:10.1371/journal.pone.0119248 (https://doi.org/10.1371%2Fjournal.pone.0119248).
PMC 4418965 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4418965).
PMID 25923521 (https://pubmed.ncbi.nlm.nih.gov/25923521).
12. "Biodiversity Maps: Transforming Data into Visual Tools into Meaningful Action for
Biodiversity Conservation"
(https://biodiversityphilippines.org/biodiversity-map-transforming-data-into-visual-tools-into-meaningful-action-for-biodiversity-conservation/).
2016-11-30. Retrieved 2022-05-05.
13. Elith, Jane; Franklin, Janet (2013), "Species Distribution Modeling"
(https://linkinghub.elsevier.com/retrieve/pii/B978012384719500318X), Encyclopedia of
Biodiversity, Elsevier, pp. 692–705, doi:10.1016/b978-0-12-384719-5.00318-x
(https://doi.org/10.1016%2Fb978-0-12-384719-5.00318-x), ISBN 978-0-12-384720-1,
S2CID 82987545 (https://api.semanticscholar.org/CorpusID:82987545), retrieved 2022-05-05
14. Jetz, Walter; McPherson, Jana M.; Guralnick, Robert P. (2012). "Integrating biodiversity
distribution knowledge: toward a global map of life"
(https://doi.org/10.1016%2Fj.tree.2011.09.007). Trends in Ecology & Evolution. 27 (3): 151–159.
doi:10.1016/j.tree.2011.09.007 (https://doi.org/10.1016%2Fj.tree.2011.09.007).
PMID 22019413 (https://pubmed.ncbi.nlm.nih.gov/22019413).
15. "TaxonX" (https://sourceforge.net/projects/taxonx/). SourceForge. Retrieved 2021-09-08.
16. "Taxonomic Concept Transfer Schema (TCS)" (http://www.tdwg.org/standards/117/).
Biodiversity Information Standards (TDWG).
17. "Structured Descriptive Data" (http://www.tdwg.org/standards/116/). Biodiversity Information
Standards (TDWG).
18. "Access to Biological Collection Data (ABCD)" (http://www.tdwg.org/standards/115/).
Biodiversity Information Standards (TDWG).
19. "GitHub - tdwg/tapir: TDWG Access Protocol for Information Retrieval (TAPIR)"
(https://github.com/tdwg/tapir). GitHub. 16 June 2020. Retrieved 2021-09-08.
20. "Home" (http://www.e-biosphere09.org/). e-biosphere09.org.
21. "Archived copy"
(https://web.archive.org/web/20120226041400/http://www.e-biosphere09.org/assets/files/workshop/Resolution.pdf)
(PDF). www.e-biosphere09.org. Archived from the original
(http://www.e-biosphere09.org/assets/files/workshop/Resolution.pdf) (PDF) on 26
February 2012. Retrieved 12 January 2022.
22. "TDWG: Biodiversity Information Projects of the World"
(https://web.archive.org/web/20090714155531/http://www.tdwg.org/biodiv-projects/).
www.tdwg.org. Archived from the original (http://www.tdwg.org/biodiv-projects/) on 14 July 2009.
Retrieved 12 January 2022.
Further reading
OECD Megascience Forum Working Group on Biological Informatics (1999). Final Report of
the OECD Megascience Forum Working Group on Biological Informatics, January 1999
(https://web.archive.org/web/20090305004424/http://www.gbif.org/GBIF_org/facility/BIrepfin).
pp. 1–74. Archived from the original (https://www.gbif.org/GBIF_org/facility/BIrepfin) on
2009-03-05. Retrieved 2018-03-21.
Canhos, V.P.; Souza, S.; Giovanni, R. & Canhos, D.A.L. (2004). "Global biodiversity
informatics: setting the scene for a "new world" of ecological modeling"
(https://doi.org/10.17161%2Fbi.v1i0.3). Biodiversity Informatics. 1: 1–13.
doi:10.17161/bi.v1i0.3 (https://doi.org/10.17161%2Fbi.v1i0.3).
Soberón, J. & Peterson, A.T. (2004). "Biodiversity informatics: managing and applying
primary biodiversity data" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1693343). Phil.
Trans. R. Soc. Lond. B. 359 (1444): 689–698. doi:10.1098/rstb.2003.1439
(https://doi.org/10.1098%2Frstb.2003.1439). PMC 1693343
(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1693343). PMID 15253354
(https://pubmed.ncbi.nlm.nih.gov/15253354).
Chapman, A.D. (2005). Uses of Primary Species-Occurrence Data
(https://web.archive.org/web/20100216210830/http://www2.gbif.org/UsesPrimaryData.pdf) (PDF).
Copenhagen: Global Biodiversity Information Facility. pp. 1–106. Archived from the original
(http://www2.gbif.org/UsesPrimaryData.pdf) (PDF) on 2010-02-16. Retrieved 2009-08-12.
Johnson, N.F. (2007). "Biodiversity informatics". Annual Review of Entomology. 52: 421–438.
doi:10.1146/annurev.ento.52.110405.091259
(https://doi.org/10.1146%2Fannurev.ento.52.110405.091259). PMID 16956323
(https://pubmed.ncbi.nlm.nih.gov/16956323).
Sarkar, I.N. (2007). "Biodiversity informatics: organizing and linking information across the
spectrum of life" (https://doi.org/10.1093%2Fbib%2Fbbm037). Briefings in Bioinformatics. 8
(5): 347–357. doi:10.1093/bib/bbm037 (https://doi.org/10.1093%2Fbib%2Fbbm037).
PMID 17704120 (https://pubmed.ncbi.nlm.nih.gov/17704120).
Guralnick, R.P.; Hill, A (2009). "Biodiversity Informatics: Automated Approaches for
Documenting Global Biodiversity Patterns and Processes"
(https://doi.org/10.1093%2Fbioinformatics%2Fbtn659). Bioinformatics. 25 (4): 421–428.
doi:10.1093/bioinformatics/btn659 (https://doi.org/10.1093%2Fbioinformatics%2Fbtn659).
PMID 19129210 (https://pubmed.ncbi.nlm.nih.gov/19129210).
External links
Biodiversity Informatics (http://journals.ku.edu/index.php/jbi) (journal)