Assignment of Bioinformatics

Submitted to: Mam Hina Naz

Submitted by: Hira Shakeel

Roll no: F20-BS-ZOOL-1030

Class: BS Zoology (M)

Session: 2020-2024

Topic: Major DNA databases around the world.

Contents
DNA Databases
Types of Databases
1. Forensic DNA Databases
Interpol DNA database
CODIS
NDNAD (National DNA Database)
DNA Database of Australia
NDDB (National DNA Data Bank) of Canada
Netherlands Forensic Institute (NFI)
2. Research and Medical DNA Databases
Medical
The Cancer Genome Atlas (TCGA)
GISAID
GenBank
1000 Genomes Project
International HapMap Project
China National GeneBank
3. Ancestry and Consumer DNA Databases
23andMe
MyHeritage
FamilyTreeDNA
AncestryDNA
Privacy issues
Conclusion
References

DNA Databases

DNA databases are collections of DNA profiles that are stored and used for various purposes,
including scientific research, medical diagnostics, forensic investigations, and genealogy. These
databases contain genetic information that can be used to identify individuals, determine
biological relationships, and study genetic traits and disorders.

Here are some common types of DNA databases and their uses:

Types of Databases

1. Forensic DNA Databases


A forensic database is a centralized DNA database that stores the DNA profiles of individuals so
that DNA samples collected from a crime scene can be searched and compared against the stored
profiles. Its most important function is to produce matches between a suspect and biological
markers from the crime scene. Such matches provide evidence to support criminal investigations
and can also help to identify potential suspects. The majority of national DNA databases are
used for forensic purposes.
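
The search-and-compare step described above can be illustrated with a minimal Python sketch. It assumes that a profile is a set of STR loci, each with a pair of allele repeat counts, and that a match means agreement at every locus the two profiles share. The profile IDs and allele values are invented for illustration, and real forensic software applies more elaborate matching rules than this.

```python
from typing import Dict, List, Tuple

# A profile maps an STR locus name to the pair of allele repeat counts seen there.
Profile = Dict[str, Tuple[int, int]]


def normalize(profile: Profile) -> Profile:
    """Sort each allele pair so that (9, 7) and (7, 9) compare as equal."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}


def matching_loci(sample: Profile, stored: Profile) -> Tuple[int, int]:
    """Return (loci that agree, loci compared) over the loci both profiles share."""
    a, b = normalize(sample), normalize(stored)
    shared = set(a) & set(b)
    agree = sum(1 for locus in shared if a[locus] == b[locus])
    return agree, len(shared)


def search(sample: Profile, database: Dict[str, Profile]) -> List[str]:
    """List IDs of stored profiles that agree with the sample at every shared locus."""
    hits = []
    for profile_id, stored in database.items():
        agree, compared = matching_loci(sample, stored)
        if compared > 0 and agree == compared:
            hits.append(profile_id)
    return sorted(hits)


# Hypothetical data: invented allele values and profile IDs.
crime_scene = {"D8S1179": (13, 14), "TH01": (9, 7), "vWA": (16, 18)}
database = {
    "profile-001": {"D8S1179": (13, 14), "TH01": (7, 9), "vWA": (16, 18)},
    "profile-002": {"D8S1179": (12, 14), "TH01": (6, 9), "vWA": (17, 18)},
}

print(search(crime_scene, database))  # ['profile-001']
```

Only profile IDs are returned, so the link to a named person would have to be made outside the search itself.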

Interpol DNA database

The Interpol DNA database is used in criminal investigations. Interpol maintains an automated
DNA database called DNA Gateway, which contains DNA profiles submitted by member countries
from crime scenes, missing persons, and unidentified bodies.

The DNA Gateway was established in 2002, and at the end of 2013 it held more than 140,000
DNA profiles from 69 member countries. Unlike other DNA databases, DNA Gateway is used only
for information sharing and comparison. It does not link a DNA profile to any individual, and
it does not record an individual's physical or psychological condition.

CODIS
The United States national DNA database is called the Combined DNA Index System (CODIS).
CODIS consists of three levels of information: Local DNA Index Systems (LDIS), where DNA
profiles originate; State DNA Index Systems (SDIS), which allow laboratories within a state
to share information; and the National DNA Index System (NDIS), which allows states to
compare DNA information with one another.

The CODIS software contains several databases, and the one searched depends on the type of
information being sought. Examples include missing persons, convicted offenders, and forensic
samples collected from crime scenes. Each state, and the federal system, has different laws
for the collection, upload, and analysis of the information in its database. For privacy
reasons, however, the CODIS database does not contain any personal identifying information,
such as the name associated with a DNA profile. The uploading agency is notified of any hits
to its samples and is tasked with disseminating personal information in accordance with its
laws.
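
The three-level layout and the rule that no names are stored can be pictured with a small Python sketch. This is not the CODIS software: the class names, fields, and specimen IDs are hypothetical, the profile is reduced to a plain tuple, and the sketch assumes every profile is passed up to the higher levels, which real upload rules do not guarantee.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IndexedProfile:
    specimen_id: str   # laboratory specimen number, not a person's name
    agency: str        # uploading agency, which keeps the case file
    category: str      # e.g. "forensic", "convicted_offender", "missing_person"
    alleles: tuple     # simplified stand-in for the DNA profile itself


@dataclass
class DnaIndex:
    level: str                               # "LDIS", "SDIS" or "NDIS"
    parent: Optional["DnaIndex"] = None
    profiles: List[IndexedProfile] = field(default_factory=list)

    def upload(self, profile: IndexedProfile) -> None:
        """Store the profile here and pass it up to every higher level."""
        self.profiles.append(profile)
        if self.parent is not None:
            self.parent.upload(profile)

    def hits(self, query: IndexedProfile) -> Dict[str, str]:
        """Map each matching specimen ID to the agency that must be notified."""
        return {
            p.specimen_id: p.agency
            for p in self.profiles
            if p.alleles == query.alleles and p.specimen_id != query.specimen_id
        }


# Hypothetical hierarchy: two local labs in two different states.
ndis = DnaIndex("NDIS")
sdis_a = DnaIndex("SDIS", parent=ndis)
sdis_b = DnaIndex("SDIS", parent=ndis)
ldis_a = DnaIndex("LDIS", parent=sdis_a)
ldis_b = DnaIndex("LDIS", parent=sdis_b)

offender = IndexedProfile("SPEC-0001", "Lab A", "convicted_offender", (13, 14, 7, 9))
scene = IndexedProfile("SPEC-0002", "Lab B", "forensic", (13, 14, 7, 9))
ldis_a.upload(offender)
ldis_b.upload(scene)

print(ldis_b.hits(scene))  # {} : no match at the local level
print(ndis.hits(scene))    # {'SPEC-0001': 'Lab A'} : match found nationally
```

The hit names only a specimen and an agency; identifying the person is left to the uploading agency, as the text describes.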

NDNAD (National DNA Database)


The first national DNA database in the United Kingdom, the National DNA Database (NDNAD),
was established in April 1995. By 2006, it contained 2.7 million DNA profiles (about 5.2% of
the UK population), as well as other information from individuals and crime scenes. In 2020,
it held 6.6 million profiles (5.6 million individuals excluding duplicates).

In the UK, police have wide-ranging powers to take DNA samples and to retain them if the
subject is convicted of a recordable offence. Because NDNAD stores such a large number of DNA
profiles, "cold hits" may occur during DNA matching: an unexpected match between an
individual's DNA profile and the DNA profile from an unsolved crime scene. A cold hit can
introduce a new suspect into the investigation and so help to solve old cases.

In England and Wales, anyone arrested on suspicion of a recordable offence must submit a DNA
sample, the profile of which is then stored on the DNA database. Those not charged or not found
guilty have their DNA data deleted within a specified period of time. In Scotland, the law
similarly requires the DNA profiles of most people who are acquitted be removed from the
database.

DNA Database of Australia
The Australian national DNA database is called the National Criminal Investigation DNA
Database (NCIDD). By July 2018, it contained more than 837,000 DNA profiles. The database
originally used nine STR loci and a sex gene for analysis; this was increased to 18 core
markers in 2013. NCIDD combines all forensic data, including DNA profiles, advanced
biometrics, and cold cases.

NDDB (National DNA Data Bank) of Canada


The Canadian national DNA database is called the National DNA Data Bank (NDDB) which
was established in 1998 but first used in 2000. The legislation that Parliament enacted to govern
the use of this technology within the criminal justice system has been found by Canadian courts
to be respectful of the constitutional and privacy rights of suspects, and of persons found guilty
of designated offences.

Applications:

Criminal investigation: The NDDB assists law enforcement agencies in solving crimes by
matching DNA from crime scenes with profiles stored in the database.

Linking crimes: Matches between DNA evidence from different crime scenes help to identify
serial offenders.

Exonerating the innocent: DNA evidence can clear individuals wrongfully accused or
convicted of crimes.

Identifying missing persons: DNA from missing persons or their relatives is compared with
unidentified human remains.
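
The crime-linking application above amounts to grouping crime-scene profiles that are identical. The following Python sketch shows that idea only; it is not the NDDB's software. The case IDs and allele values are invented, and each allele pair is assumed to be already sorted.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Profile = Dict[str, Tuple[int, int]]


def link_crime_scenes(scene_profiles: Dict[str, Profile]) -> List[List[str]]:
    """Group case IDs whose crime-scene profiles are identical."""
    by_profile = defaultdict(list)
    for case_id, profile in scene_profiles.items():
        # Turn the profile into a hashable key; allele pairs are assumed pre-sorted.
        key = tuple(sorted(profile.items()))
        by_profile[key].append(case_id)
    # Only groups with more than one case represent a link between crimes.
    return sorted(sorted(cases) for cases in by_profile.values() if len(cases) > 1)


# Hypothetical data: invented case IDs and allele values.
scene_profiles = {
    "case-A": {"TH01": (7, 9), "vWA": (16, 18)},
    "case-B": {"TH01": (6, 8), "vWA": (15, 17)},
    "case-C": {"TH01": (7, 9), "vWA": (16, 18)},
}

print(link_crime_scenes(scene_profiles))  # [['case-A', 'case-C']]
```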

Netherlands Forensic Institute (NFI)


The NFI (Netherlands Forensic Institute), formerly known as the Gerechtelijk Laboratorium,
has a long tradition in the Netherlands as the main institute providing forensic services to
the criminal justice chain.

Forensic Services

CBRN: The NFI provides governments with expertise to help them prepare for, and prevent
terrorism-related incidents with chemical, biological, radiological and nuclear agents, possibly in
combination with explosives.

Forensics in Nuclear Security: The NFI has taken the lead in coordinating an integrated
international forensic response to nuclear terrorism.

Wildlife Forensics: The NFI offers wildlife forensic services to governments and government
agencies in order to combat illegal trading in endangered, protected or dangerous species of flora
and fauna.

2. Research and Medical DNA Databases

Medical
A medical DNA database is a DNA database of medically relevant genetic variations. It
collects individuals' DNA, which can be related to their medical records and lifestyle
details. By recording DNA profiles, scientists may discover interactions between the genetic
environment and the occurrence of certain diseases (such as cardiovascular disease or
cancer), and so find new drugs or effective treatments to control these diseases. Such a
database is often run in collaboration with the National Health Service.

The Cancer Genome Atlas (TCGA)


The Cancer Genome Atlas (TCGA) is a project to catalogue the genomic alterations responsible
for cancer using genome sequencing and bioinformatics. The overarching goal was to apply
high-throughput genome analysis techniques to improve the ability to diagnose, treat, and
prevent cancer through a better understanding of the genetic basis of the disease.

TCGA was supervised by the National Cancer Institute's Center for Cancer Genomics and the
National Human Genome Research Institute funded by the US government. A three-year pilot
project, begun in 2006, focused on characterization of three types of human cancers:
glioblastoma multiforme, lung squamous carcinoma, and ovarian serous adenocarcinoma.
In 2009, it expanded into phase II, which planned to complete the genomic characterization and
sequence analysis of 20–25 different tumor types by 2014. Ultimately, TCGA surpassed that
goal, characterizing 33 cancer types including 10 rare cancers.

Goals

The goal of TCGA's pilot project was to establish an infrastructure to collect, molecularly
characterize, and analyze 500 cancers and matched controls. The work required extensive
cooperation among a team of scientists from various institutions and assessment of multiple
burgeoning high-throughput technologies. TCGA wanted to not only generate high-quality and
biologically meaningful genomic data, but also make that data freely available to the cancer
research community.

GISAID
GISAID (the Global Initiative on Sharing All Influenza Data) is a global science initiative
established in 2008 to provide access to genomic data of influenza viruses. The database was
expanded to include the coronavirus responsible for the COVID-19 pandemic, as well as other
pathogens. The database has been described as "the world's largest repository of COVID-19
sequences". GISAID facilitates genomic epidemiology and real-time surveillance to monitor the
emergence of new COVID-19 viral strains across the planet.

Database for SARS-CoV-2 genomes

GISAID's collection has also been described as "by far the world's largest database of
SARS-CoV-2 sequences". By mid-April 2021, it had received over 1,200,000 submissions from
researchers in over 170 countries. Only three months later, the number of uploaded
SARS-CoV-2 sequences had doubled, to over 2.4 million. By late 2021, the database contained
over 5 million genome sequences; by December 2021, over 6 million sequences had been
submitted; by April 2022, 10 million sequences had accumulated; and in January 2023 the
number reached 14.4 million.

In January 2020, the SARS-CoV-2 genetic sequence data was shared through GISAID.
Throughout the first year of the COVID-19 pandemic, most of the SARS-CoV-2 whole-genome
sequences that were generated and shared globally were submitted through GISAID. When the
SARS-CoV-2 Omicron variant was detected in South Africa, by quickly uploading the sequence
to GISAID, the National Institute for Communicable Diseases there was able to learn that
Botswana and Hong Kong had also reported cases possessing the same gene sequence.

National and forensic DNA databases are not available for non-police purposes. Because DNA
profiles can also be used for genealogical purposes, separate genetic genealogy databases
have been created to store the DNA profiles that result from genealogical DNA tests.

GenBank
GenBank is the NIH genetic sequence database, an annotated and publicly accessible
collection of all publicly available DNA sequences.

GenBank is part of the International Nucleotide Sequence Database Collaboration, which
comprises the DNA Data Bank of Japan (DDBJ), the European Nucleotide Archive (ENA), and
GenBank at NCBI. These three organizations exchange data on a daily basis.

GenBank contains a large number of DNA sequences from more than 140,000 registered
organizations and is updated every day to ensure a uniform and comprehensive collection of
sequence information. The sequences are mainly obtained from individual laboratories and
large-scale sequencing projects.

The files stored in GenBank are divided into groups, such as BCT (bacterial), VRL
(viruses), and PRI (primates). Users can access GenBank through NCBI's retrieval system and
then use the BLAST function to identify a given sequence within GenBank or to find
similarities between two sequences.
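
Both ideas in this paragraph can be sketched in a few lines of Python. The first function reads the group code from a record's LOCUS line, assuming the code is the second-to-last whitespace-separated field (followed by the date). The second function is a deliberately naive position-by-position comparison of two pre-aligned sequences; it is not BLAST, which searches for local alignments and scores them statistically. The LOCUS line and the sequences are invented for illustration.

```python
# Group codes named in the text; other codes are reported as "other".
DIVISIONS = {"BCT": "bacterial", "VRL": "viruses", "PRI": "primates"}


def division_of(locus_line: str) -> str:
    """Return the group code from a whitespace-separated LOCUS line.

    Assumes the code is the second-to-last field, followed by the date.
    """
    fields = locus_line.split()
    if len(fields) < 3 or fields[0] != "LOCUS":
        raise ValueError("not a LOCUS line")
    return fields[-2]


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions that agree in two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and the same length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


# Hypothetical record header and sequences.
line = "LOCUS       EXAMPLE01    1200 bp    DNA     linear   VRL 01-JAN-2000"
code = division_of(line)
print(code, DIVISIONS.get(code, "other"))         # VRL viruses
print(percent_identity("ACGTACGT", "ACGTTCGT"))   # 87.5
```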

1000 Genomes Project


The 1000 Genomes Project (1KGP), which ran from January 2008 to 2015, was an international
research effort to establish the most detailed catalogue of human genetic variation at the
time. Scientists planned to sequence the genomes of at least one thousand anonymous healthy
participants from a number of different ethnic groups within the following three years,
using newly developed technologies.

In 2010, the project finished its pilot phase, which was described in detail in a publication in the
journal Nature. In 2012, the sequencing of 1092 genomes was announced in a Nature
publication. In 2015, two papers in Nature reported results and the completion of the project and
opportunities for future research.

The 1000 Genomes Project was designed to bridge the gap in knowledge between rare genetic
variants that have a severe effect, predominantly on simple traits (e.g. cystic fibrosis,
Huntington disease), and common genetic variants that have a mild effect and are implicated
in complex traits (e.g. cognition, diabetes, heart disease).

The primary goal of this project was to create a complete and detailed catalogue of human
genetic variations, which can be used for association studies relating genetic variation to disease.

International HapMap Project

The International HapMap Project aimed to develop a haplotype map (HapMap) of the human
genome that describes the common patterns of human genetic variation. The HapMap is used to
find genetic variants affecting health, disease, and responses to drugs and environmental
factors. The information produced by the project is freely available for research. The
project was a collaboration among researchers at academic centers, non-profit biomedical
research groups, and private companies in Canada, China (including Hong Kong), Japan,
Nigeria, the United Kingdom, and the United States.

The project officially started with a meeting held from 27 to 29 October 2002 and was
expected to take about three years. It comprised two phases: the complete Phase I data were
published on 27 October 2005, and the analysis of the Phase II dataset was published in
October 2007.

Objectives

1. Identify SNPs: Systematically identify common genetic variations (SNPs) across different
populations.

2. Map Haplotypes: Determine how these SNPs are organized into blocks, called haplotypes,
which are inherited together.

3. Facilitate Research: Provide researchers with tools to find genes associated with diseases
and responses to pharmaceuticals and environmental factors.
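
The first two objectives can be illustrated with a toy Python sketch. It assumes a handful of short, already aligned chromosome sequences (invented here), treats any position where they differ as a SNP, and counts the combinations of SNP alleles that occur together. The project itself worked from genotype data at far larger scale with statistical methods, so this shows only the underlying idea.

```python
from collections import Counter
from typing import List


def snp_positions(sequences: List[str]) -> List[int]:
    """Positions where aligned, equal-length sequences do not all carry the same base."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in sequences}) > 1]


def haplotypes(sequences: List[str]) -> Counter:
    """Count the combinations of alleles observed across the SNP positions."""
    positions = snp_positions(sequences)
    return Counter("".join(s[i] for i in positions) for s in sequences)


# Hypothetical aligned chromosome segments from five chromosomes.
chromosomes = [
    "ACGTTAGC",
    "ACGTTAGC",
    "ATGTCAGC",
    "ATGTCAGC",
    "ACGTTAGA",
]

print(snp_positions(chromosomes))  # [1, 4, 7]
print(haplotypes(chromosomes))
```

In this toy data the alleles at positions 1 and 4 always occur in the same pairs (C with T, or T with C), which is the kind of block structure a haplotype map records: knowing one SNP in a block predicts the others.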

China National GeneBank


China National GeneBank or CNGB (Chinese: 国家基因库) is China's first national-level gene
storage bank, approved and funded by the Chinese government. Based in the Dapeng Peninsula
of Shenzhen, CNGB aims to support public welfare, life science research and innovation, and
industry incubation through effective bioresource conservation, digitalization, and
utilization.

Application

Based on research projects supported by and resources stored in CNGB, CNGBdb has developed
multiple scientific databases to advance scientific discoveries in major research areas such
as plants, animals, micro-organisms, and health and diseases. The databases provide not only
high-quality datasets but also specialized analysis tools.

3. Ancestry and Consumer DNA Databases

23andMe
23andMe Holding Co. is a publicly traded personal genomics and biotechnology company based
in South San Francisco, California. It is best known for providing a direct-to-consumer genetic
testing service in which customers provide a saliva sample that is laboratory analysed, using
single nucleotide polymorphism genotyping, to generate reports relating to the customer's
ancestry and genetic predispositions to health-related topics. The company's name is derived
from the 23 pairs of chromosomes in a diploid human cell.

Founded in 2006, 23andMe soon became the first company to begin offering autosomal DNA
testing for ancestry, which all other major companies now use. Its saliva-based direct-to-
consumer genetic testing business was named "Invention of the Year" by Time in 2008.

As of the latest data, 23andMe's database includes genetic information from over 10 million
people, making it one of the largest databases of its kind.

The company uses anonymized data for research purposes, contributing to studies in genetics,
medicine, and other fields. Customers can choose whether to participate in research or not.

23andMe offers educational resources to help users understand their genetic information and its
implications.

MyHeritage
MyHeritage is an online genealogy platform with web, mobile, and software products and
services, introduced by the Israeli company MyHeritage in 2003. Users of the platform can
obtain their family trees, upload and browse through photos, and search through over 19.9
billion historical records, among other features.

As of 2023, the service supports 42 languages. In 2016, it launched a genetic testing service
called MyHeritage DNA, with more than 6.5 million DNA kits in the company's database by
March 2023.

MyHeritage DNA results are obtained from home test kits, which let users collect samples
with cheek swabs. The results provide DNA matching and ethnicity estimates.

MyHeritage also offers a series of image-editing tools on its website, ranging from
restoration to colorization and animation of images. Some of them use artificial
intelligence, such as the Photo Enhancer and MyHeritage In Color, both launched in 2020, and
Photo Repair, launched in 2021.

FamilyTreeDNA


FamilyTreeDNA is a division of Gene by Gene, a commercial genetic testing company based in
Houston, Texas. FamilyTreeDNA offers analysis of autosomal DNA, Y-DNA, and mitochondrial
DNA to individuals for genealogical purposes. With a database of more than two million
records, it is the most popular company worldwide for Y-DNA and mitochondrial DNA testing,
and the fourth most popular for autosomal DNA testing. In Europe, it is also the most
commonly used company for autosomal DNA. Gene by Gene, including its FamilyTreeDNA
division, was acquired by myDNA, Inc., an Australian company, in January 2021.

FamilyTreeDNA was founded based on an idea conceived by Bennett Greenspan, a lifelong
entrepreneur and genealogy enthusiast.

In May 2010, FamilyTreeDNA launched an autosomal DNA test based on a microarray chip,
called Family Finder.

Family Finder allows customers to match relatives as distant as about fifth cousins. It also
includes a component called myOrigins, which reports the percentages of a customer's DNA
associated with general regions or specific ethnic groups (e.g. Western Europe, Asia,
Jewish, Native American).
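
A rough calculation suggests why matching becomes difficult around fifth cousins. The Python sketch below uses the standard pedigree expectation that full cousins of degree n share on average 1/2^(2n+1) of their autosomal DNA. This formula is a textbook average brought in for illustration, not a figure from FamilyTreeDNA, and actual sharing between two relatives varies widely around it.

```python
def expected_shared_fraction(degree: int) -> float:
    """Average autosomal fraction shared by full cousins (degree 1 = first cousins)."""
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return 0.5 ** (2 * degree + 1)


for degree in range(1, 6):
    # Print the expected share as a percentage for first to fifth cousins.
    print(degree, f"{100 * expected_shared_fraction(degree):.3f}%")
```

Under this simple model first cousins share 12.5% on average, while fifth cousins share only about 0.05%, so the signal left to detect is very small.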

In December 2018, FamilyTreeDNA changed its terms of service to allow law enforcement to
use their service to identify suspects of "a violent crime" (defined as child abduction, sexual
assault or homicide) or identify the remains of victims. The company confirmed it was working
with the FBI on at least a handful of cases.

AncestryDNA
AncestryDNA is the genetic genealogy database service of myfamily.com (the owner of
Ancestry.com). AncestryDNA offers an autosomal DNA test, which was first launched in the US
in 2012. It became available in the United Kingdom, Ireland, Australia, New Zealand, and
Canada in 2015, and was launched in a further 29 countries in February 2016.

AncestryDNA is a subsidiary of Ancestry LLC. It offers a direct-to-consumer genealogical DNA
test: consumers provide a sample of their DNA to the company for analysis. AncestryDNA then
uses DNA sequences to infer family relationships with other AncestryDNA users and to provide
what it calls an "ethnicity estimate". This estimate uses 700,000 markers, which is only
about 0.02% of all genetic markers that could be tested.
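
The 0.02% figure can be checked with a short calculation. The Python sketch below assumes that "all genetic markers that could be tested" means every base pair of the human genome, taken here as roughly 3.2 billion; that denominator is an assumption made for this check and is not stated in the source.

```python
MARKERS_TESTED = 700_000            # figure given in the text
GENOME_BASE_PAIRS = 3_200_000_000   # assumption: approximate human genome size

# Share of the genome covered by the tested markers, as a percentage.
percent = 100 * MARKERS_TESTED / GENOME_BASE_PAIRS
print(f"{percent:.3f}%")  # 0.022%
```

The result rounds to about 0.02%, consistent with the figure quoted above.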

AncestryDNA is commonly used by donor-conceived people to find their biological siblings
and, in some cases, their sperm or egg donor.

People who activate the DNA test can choose to participate in the Human Diversity Project,
a "scientific research project aimed at helping scientists better understand population
history, human migration, and human health".

Privacy issues
Critics of DNA databases warn that the various uses of the technology can pose a threat to
individual civil liberties. Personal information included in genetic material, such as markers that
identify various genetic diseases, physical and behavioral traits, could be used for discriminatory
profiling and its collection may constitute an invasion of privacy.

The privacy and security of DNA databases have attracted considerable attention. Some people
fear that their personal DNA information could easily be leaked. Others feel that having
their DNA profile recorded in a database marks them as a "criminal", and that being falsely
accused of a crime could leave them with a "criminal" record for the rest of their lives.

European countries that have established a DNA database use measures to protect the privacy
of individuals, in particular criteria for removing DNA profiles from the database. Among
the 22 European countries that have been analyzed, most record the DNA profiles of suspects
or of those who have committed serious crimes, and most delete a suspect's profile after an
acquittal. All of these countries have legislation in place intended to avoid most of the
privacy issues that may arise from the use of DNA databases. Public discussion around the
introduction of advanced forensic techniques (such as genetic genealogy using public
genealogy databases and DNA phenotyping approaches) has been limited, disjointed, and
unfocused. These techniques raise issues of privacy and consent that may warrant additional
legal protections.

Furthermore, DNA databases could fall into the wrong hands due to data breaches or data
sharing.

Conclusion
DNA databases offer significant advantages and opportunities for scientific discovery, medical
advancement, and personal insight. However, these benefits must be balanced with robust
privacy protections and ethical considerations to ensure that the use of genetic data is
responsible, secure, and respectful of individuals' rights. As technology and policies evolve,
DNA databases will continue to play a crucial role in shaping our understanding of genetics and
its applications in society.

References
https://isogg.org/wiki/AncestryDNA

https://en.wikipedia.org/wiki/DNA_database

https://isogg.org/wiki/DNA_databases

https://en.wikipedia.org/wiki/Combined_DNA_Index_System

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388101/

https://www.britannica.com/event/International-HapMap-Project

https://isogg.org/wiki/Family_Tree_DNA

https://en.wikipedia.org/wiki/MyHeritage
