Ogee v3
Received September 11, 2020; Revised September 24, 2020; Editorial Decision September 25, 2020; Accepted September 28, 2020
* To whom correspondence should be addressed. Tel: +86 158 2735 4263; Fax: +86 27 8779 2072; Email: [email protected]
Correspondence may also be addressed to Li-Jie He. Email: [email protected]
Correspondence may also be addressed to Xing-Ming Zhao. Email: [email protected]
† The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.
© The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work
is properly cited. For commercial re-use, please contact [email protected]
Nucleic Acids Research, 2021, Vol. 49, Database issue D999
some large and complex organisms, such as human (14), using various technologies.

The essentiality of a gene is highly dependent on various factors, including the genetic context, the genetic background of the host and the environment (15). Hence, gene essentiality is not a static, but rather a context-dependent property of a gene. Since version 1, OGEE has been promoting the context-dependent concept and its importance for understanding gene essentiality (16). In our last update (17), we focused on the importance of ‘conditionally essential genes’ (CEGs) or ‘differentially essential genes’ (DEGs) in cancer cell lines. The current update provides: (a) an increased coverage of essentiality-tested genes and species, (b) up-to-date essentiality status of existing genes, (c) an in-

the same cell line, thus the combined data from two distinct methods could offer more robust results and coverage of human essential genes, as a previous study suggested (34).

In summary, the current version (v3) of OGEE includes 16 datasets from 16 non-human eukaryotes and 111 datasets from 75 prokaryotes. Of the 213,608 collected genes, 31,177 were tested in multiple datasets, accounting for 14.6% of all collected genes. Of the genes tested in multiple datasets, 15,440 were conditionally essential genes (CEGs), i.e. context-dependent, representing around 49.52% of genes covered by multiple datasets. In addition, OGEE v3 also includes experimental essentiality results for 581 human cancer cell lines.
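
The tallies of multiply tested and conditionally essential genes can be illustrated with a minimal Python sketch. It assumes a hypothetical record layout of (gene, dataset, essential?) tuples and treats a gene as a CEG when it was tested in more than one dataset and the calls disagree; both the layout and this rule are assumptions for illustration, not a description of the OGEE pipeline.

```python
from collections import defaultdict


def summarize_ceg(records):
    """Summarize context dependence from (gene, dataset, is_essential) tuples.

    The tuple layout and the 'calls disagree' rule are illustrative assumptions.
    """
    calls = defaultdict(dict)
    for gene, dataset, is_essential in records:
        calls[gene][dataset] = bool(is_essential)

    # Genes tested in more than one dataset.
    multi = {g for g, by_dataset in calls.items() if len(by_dataset) > 1}
    # Among those, genes whose essentiality call differs between datasets.
    cegs = {g for g in multi if len(set(calls[g].values())) > 1}

    return {
        "n_genes": len(calls),
        "n_multi": len(multi),
        "n_ceg": len(cegs),
        "pct_multi": 100.0 * len(multi) / len(calls) if calls else 0.0,
        "pct_ceg_of_multi": 100.0 * len(cegs) / len(multi) if multi else 0.0,
    }


# Toy records with made-up gene and dataset names.
toy_records = [
    ("g1", "d1", True), ("g1", "d2", False),   # calls disagree -> CEG
    ("g2", "d1", True), ("g2", "d2", True),    # consistent across datasets
    ("g3", "d1", False),                       # tested once only
]
summary = summarize_ceg(toy_records)
```

Applied to the counts reported above, the same arithmetic gives 31,177 / 213,608 ≈ 14.6% and 15,440 / 31,177 ≈ 49.52%.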
signaling pathways; and they are less likely to tolerate missense variation, and more prone to pathogenicity (41).

While effort has been put into identifying targets for cancer therapeutics, several research groups (32,33) have identified cancer dependency genes and lineage-specific EGs in human cell lines using RNAi screening or CRISPR–Cas9 technology. Since RNAi and CRISPR are promising complementary methods for identifying EGs, we annotated human EGs extensively by considering both of these techniques as well as factors influencing gene essentiality.

As shown in Figure 1A, the 581 human cell lines collected in OGEE v3 were assigned to 25 groups according to their tissue of origin. The largest number of cell lines was for lung (n = 123), followed by central nervous system (n = 59), haematopoietic and lymphoid tissue (n = 53), ovary (n = 40) and breast (n = 38).

To better appraise the pattern of essentiality across human cell lines and identify potentially meaningful biological relationships between genes (47), we selected data for those tissues with at least ten cell lines from a conditioned experiment, and assessed gene-gene essentiality relationships among individual tissues by Spearman and Pearson correlation. Raw fitness scores, instead of binary essentiality assignments (i.e. essential or non-essential genes), were used for such calculations. After filtering the genes with essentiality in at least 50% of the cell lines of a specific tissue, we then selected gene pairs with significant correlation results (P-value < 0.001). The results of this analysis are available via ‘Correlation co-essentiality’ in OGEE v3 for each human gene tested that met the cut-off (see Figure 2B for an example).
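
The gene-gene correlation step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the input is a hypothetical dictionary mapping each gene to its raw fitness scores across the cell lines of one tissue (same cell-line order for every gene), and the function names are invented here. The P-value < 0.001 filter is not reproduced, since it would normally come from a statistics package.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def coessentiality_pairs(scores, min_cell_lines=10):
    """Return (gene1, gene2, pearson_r, spearman_rho) for each gene pair.

    `scores` maps gene -> list of raw fitness scores (hypothetical layout).
    """
    pairs = []
    for g1, g2 in combinations(sorted(scores), 2):
        x, y = scores[g1], scores[g2]
        if len(x) != len(y) or len(x) < min_cell_lines:
            continue
        pairs.append((g1, g2, pearson(x, y), spearman(x, y)))
    return pairs


# Toy data: made-up gene names and scores for ten cell lines.
toy_scores = {
    "GENE_A": [-1.2, -0.8, -0.1, 0.3, -1.5, -0.6, 0.1, -0.9, -1.1, 0.2],
    "GENE_B": [-2.4, -1.6, -0.2, 0.6, -3.0, -1.2, 0.2, -1.8, -2.2, 0.4],
}
toy_pairs = coessentiality_pairs(toy_scores)
```

Working on raw scores rather than binary calls keeps the magnitude of each dependency in the comparison; the rank-based coefficient is less sensitive to outlying scores than the Pearson coefficient.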
We selected the RKO cell line set to exemplify the annotation of essential genes in OGEE v3 (https://v3.ogee.info/#/cellline/large%20intestine/RKO/summary). The cell line RKO is inferred to have 672 essential genes based on RNAi screening, and 1251 essential genes based on the CRISPR–Cas9 dataset (Figure 2A). To understand more about the characteristics of essential genes and their relationship with the influencing factors in human cell lines, we compared the tested genes of RKO against the data on known influencing factors that we had collected (Figure 2C–E). Mutations, methylation and gene expression all contributed significantly to gene essentiality in this cell line. For example, genes harboring pathogenic mutations are more likely to be essential regardless of the experimental method used (Figure 2C); genes with less methylation at their transcription start site (TSS200) are also more likely to be essential (Figure 2D); in general, lowly expressed genes are less likely to be essential (Figure 2E).

Orthology and comparisons with mouse essential genes

To date, mouse is the only mammalian species for which gene essentiality has been tested at the organismal level. About 30% of tested mouse genes are essential for survival, which is markedly higher than the proportion in human cell lines (21). Thus, genes uniquely essential in mouse experiments may be important during growth and development, while those uniquely essential in human cell lines may represent suitable treatment and/or drug targets with putatively fewer side effects.

To compare gene essentiality between mouse and human cells, we first selected tissues for which at least six cell lines were available, and defined 1607 core essential human genes as those essential in >50% of the cell lines (n ≥ 3). We downloaded the mouse knockout and associated phenotype data from the IMPC (21) and MGI (22) databases to understand more deeply the functions of essential genes in mice; these databases contained information for 5799 and 9693 knockout genes, respectively.
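
The core-essential-gene criterion can be sketched as below. This is only an illustration: the input layout (gene -> cell line -> essential?) is hypothetical, and reading "(n ≥ 3)" as "essential in at least three cell lines" is an assumption, since the text does not spell out what n counts.

```python
def core_essential_genes(calls, min_fraction=0.5, min_cell_lines=3):
    """Return genes essential in more than `min_fraction` of tested cell lines.

    `calls` maps gene -> {cell_line: bool} (hypothetical layout).
    `min_cell_lines` encodes an assumed reading of the 'n >= 3' condition:
    the gene must be essential in at least that many cell lines.
    """
    core = set()
    for gene, by_cell_line in calls.items():
        n_tested = len(by_cell_line)
        if n_tested == 0:
            continue
        n_essential = sum(by_cell_line.values())
        if n_essential >= min_cell_lines and n_essential / n_tested > min_fraction:
            core.add(gene)
    return core


# Toy calls with made-up gene and cell-line names.
toy_calls = {
    "G1": {"c1": True, "c2": True, "c3": True, "c4": False},    # 3/4 -> core
    "G2": {"c1": True, "c2": False, "c3": False, "c4": False},  # 1/4
    "G3": {"c1": True, "c2": True, "c3": False, "c4": False},   # 2/4, not > 50%
}
core = core_essential_genes(toy_calls)

# Overlap with a hypothetical set of human orthologs of mouse essential genes.
mouse_essential_orthologs = {"G1", "G2"}
shared = core & mouse_essential_orthologs
human_only = core - mouse_essential_orthologs
```

Genes in `shared` would be essential in both settings, whereas `human_only` corresponds to the cell-line-specific group discussed above.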
Figure 2. Extensive annotations on human essential genes provided by OGEE v3. Shown here are graphical visualizations taken from OGEE v3 (https://v3.ogee.info/#/cellline/large%20intestine/RKO/summary). (A) Gene essentiality determined using different experimental methods. (B) Correlational co-essentiality chart of the AAMP gene from central nervous system tissue tested by the CRISPR–Cas9 technique. The orange edges denote negative correlation while the blue edges denote positive correlation among the connected genes. (C) Gene essentiality as a function of mutation: mutation outcomes predicted using FATHMM were used to group all tested genes into three categories, and the percentage of essential genes in each group was then calculated (please consult the ‘Help’ page for more details); left panel: gene essentiality tested using RNAi, right panel: gene essentiality tested using CRISPR–Cas9. (D) Gene essentiality as a function of methylation: genes are grouped based on methylation sites and the percentage of essential genes is calculated for each site; a red color gradient is used to denote the methylation score, increasing gradually from 0 to 1; users can select any methylation site from the drop-down menu. (E) Gene essentiality as a function of gene expression: all genes with 0 RPKM are placed in the first bin and the remaining genes are assigned to nine equal-sized bins; lastly, the percentage of essential genes is calculated for each bin.
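
The binning described for panel (E) can be sketched as follows. This is a minimal illustration, assuming a hypothetical input of (RPKM, essential?) pairs; how OGEE handles a remainder when the expressed genes do not divide evenly into nine bins is not stated, so spreading the extra genes over the first bins is an assumption.

```python
def essentiality_by_expression(genes, n_bins=9):
    """Percentage of essential genes per expression bin.

    `genes` is a list of (rpkm, is_essential) pairs (hypothetical layout).
    Genes with 0 RPKM form the first bin; the rest are sorted by RPKM and
    split into `n_bins` bins of (nearly) equal size.
    Returns a list of (label, n_genes, pct_essential) tuples.
    """
    zero = [ess for rpkm, ess in genes if rpkm == 0]
    expressed = sorted(((rpkm, ess) for rpkm, ess in genes if rpkm > 0),
                       key=lambda pair: pair[0])

    bins = [("0 RPKM", zero)]
    size, extra = divmod(len(expressed), n_bins)
    start = 0
    for i in range(n_bins):
        # Assumption: any remainder is spread over the first `extra` bins.
        stop = start + size + (1 if i < extra else 0)
        bins.append((f"bin {i + 1}", [ess for _, ess in expressed[start:stop]]))
        start = stop

    return [
        (label, len(members),
         100.0 * sum(members) / len(members) if members else 0.0)
        for label, members in bins
    ]


# Toy data: two unexpressed genes plus 18 expressed genes (RPKM 1..18),
# where genes with RPKM above 9 are marked essential.
toy_genes = [(0, False), (0, False)] + [(r, r > 9) for r in range(1, 19)]
table = essentiality_by_expression(toy_genes)
```

In this toy input the percentage of essential genes rises from the low-expression bins to the high-expression bins, which is the qualitative trend the panel is meant to show.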
DATA AVAILABILITY
All data are freely accessible to all academic users. This
work is licensed under a Creative Commons Attribution
3.0 Unported License (CC BY 3.0). Users can download
datasets from the ‘Download’ page. Individual datasets or
combined datasets of individual species can be downloaded
via the ‘Browse’ page.
10. Cheung,H.W., Cowley,G.S., Weir,B.A., Boehm,J.S., Rusin,S., Scott,J.A., East,A., Ali,L.D., Lizotte,P.H., Wong,T.C. et al. (2011) Systematic investigation of genetic vulnerabilities across cancer cell lines reveals lineage-specific dependencies in ovarian cancer. Proc. Natl. Acad. Sci. U.S.A., 108, 12372–12377.
11. Luo,B., Hiu,W.C., Subramanian,A., Sharifnia,T., Okamoto,M., Yang,X., Hinkle,G., Boehm,J.S., Beroukhim,R., Weir,B.A. et al. (2008) Highly parallel identification of essential genes in cancer cells. Proc. Natl. Acad. Sci. U.S.A., 105, 20380–20385.
12. Giaever,G., Chu,A.M., Ni,L., Connelly,C., Riles,L., Véronneau,S., Dow,S., Lucau-Danila,A., Anderson,K., André,B. et al. (2002) Functional profiling of the Saccharomyces cerevisiae genome. Nature, 418, 387–391.
13. Kamath,R.S., Fraser,A.G., Dong,Y., Poulin,G., Durbin,R., Gotta,M., Kanapin,A., Le Bot,N., Moreno,S., Sohrmann,M. et al. (2003) Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature, 421, 231–237.
28. Aurrecoechea,C., Brestelli,J., Brunk,B.P., Dommer,J., Fischer,S., Gajria,B., Gao,X., Gingle,A., Grant,G., Harb,O.S. et al. (2009) PlasmoDB: a functional genomic database for malaria parasites. Nucleic Acids Res., 37, D539–D543.
29. Harb,O.S. and Roos,D.S. (2020) ToxoDB: functional genomics resource for toxoplasma and related organisms. Methods in Molecular Biology, 2071, 27–47.
30. Aslett,M., Aurrecoechea,C., Berriman,M., Brestelli,J., Brunk,B.P., Carrington,M., Depledge,D.P., Fischer,S., Gajria,B., Gao,X. et al. (2009) TriTrypDB: a functional genomic resource for the Trypanosomatidae. Nucleic Acids Res., 38, D457–D462.
31. Patel,S.J., Sanjana,N.E., Kishton,R.J., Eidizadeh,A., Vodnala,S.K., Cam,M., Gartner,J.J., Jia,L., Steinberg,S.M., Yamamoto,T.N. et al. (2017) Identification of essential genes for cancer immunotherapy. Nature, 548, 537–542.
32. Behan,F.M., Iorio,F., Picco,G., Gonçalves,E., Beaver,C.M., Migliardi,G., Santos,R., Rao,Y., Sassi,F., Pinnelli,M. et al. (2019)