lncRNome A Comprehensive Knowledgebase of Human Long Noncoding RNAs
Original article
lncRNome: a comprehensive knowledgebase
of human long noncoding RNAs
Deeksha Bhartiya1,†, Koustav Pal2,†, Sourav Ghosh3, Shruti Kapoor3, Saakshi Jalali1,
Bharat Panwar4, Sakshi Jain2, Satish Sati3, Shantanu Sengupta3, Chetana Sachidanandan3,
Gajendra Pal Singh Raghava4, Sridhar Sivasubbu3 and Vinod Scaria1,*
1GN Ramachandran Knowledge Center for Genome Informatics, CSIR Institute of Genomics and Integrative Biology, Mall Road, Delhi 110007, India,
2CSIR Open Source Drug Discovery Unit, Council of Scientific and Industrial Research, Anusandhan Bhavan, Delhi 110001, India, 3Genomics and
*Corresponding author: Tel: +91 9650466002; Fax: +91 11 27667471; Email: [email protected]
†These authors contributed equally to this work.
Citation details: Bhartiya,D., Pal,K., Ghosh,S., et al. lncRNome: a comprehensive knowledgebase of human long noncoding RNAs. Database (2013)
Vol. 2013: article ID bat034; doi:10.1093/database/bat034.
The advent of high-throughput genome-scale technologies has enabled us to unravel a large number of previously unknown transcriptionally active regions of the genome. Recent genome-wide studies have provided annotations of a large repertoire of various classes of noncoding transcripts. Long noncoding RNAs (lncRNAs) form a major proportion of these novel annotated noncoding transcripts, and are presently known to be involved in a number of functionally distinct biological processes. Over 18 000 transcripts are presently annotated as lncRNAs; these encompass previously annotated classes of noncoding transcripts including large intergenic noncoding RNA, antisense RNA and processed pseudogenes. There is a significant gap in the resources providing stable annotation, cross-referencing and biologically relevant information. lncRNome has been envisioned with the aim of filling this gap by integrating annotations on a wide variety of biologically significant information into a comprehensive knowledgebase. To the best of our knowledge, lncRNome is one of the largest and most comprehensive resources for lncRNAs.
Database URL: http://genome.igib.res.in/lncRNome
© The Author(s) 2013. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Original article Database, Vol. 2013, Article ID bat034, doi:10.1093/database/bat034
not just in humans, but also in other model systems like mouse (9) and zebrafish (10, 11). Although noncoding transcripts with >200 nucleotide lengths have been clubbed together in a general classification of lncRNAs, the members of this class have significant differences in their biological function, genomic loci and regulation. This class includes previously known classes of ncRNAs including the large intergenic noncoding RNA, transcribed pseudogenes, antisense transcripts and several others, including the annotated classes of functionally distinct transcripts such as Xist, which is involved in X inactivation (12) and Hotair (13), involved in epigenetic regulation.
Functionally, the lncRNA class encompasses a wide variety of distinct functions like X-chromosome inactivation, modulation of chromatin structure, regulation of transcriptional and posttranscriptional processes and epigenetic modifications (14). The biological function of lncRNAs is
per the needs of a user. To this end, the structure was designed following consultation with a number of experimental and computational biologists. We created the database to serve as a comprehensive, user-friendly and biologically relevant knowledgebase on human lncRNAs built on MySQL 5.6 and having a PHP-based web interface.
In brief, each lncRNA gene has a single page with basic linkouts to other relevant databases, annotation sets and relevant categories of information linked in tabs. Five categories of information are presently available linked with each lncRNA, which include (i) General Information, (ii) Sequence and Structure, (iii) Interactions and Processing, (iv) Variations and Conservation and (v) Epigenetic Modifications. These categories are connected to the genome browser along with the conservation scores of all lncRNA transcripts (Supplementary File S1).
The category ‘General Information’ hosts information
database of single nucleotide polymorphisms (dbSNP) SNPs were downloaded from UCSC genome browser and mapped to lncRNAs. Conservation scores of 66 573 sites within lncRNAs have been provided in this category. The fifth category provides 11 790 epigenetic marks in the promoters of lncRNAs. The datasets were downloaded from the NIH Human Epigenome Roadmap project and mapped to lncRNA promoters. The detailed methods are available as Supplementary methods.
The database also features a comprehensive search option, which enables users to search through lncRNome using different keywords, such as lncRNA names, Ensembl IDs, known targets, SNPs, diseases, etc. In addition, a separate browse option allows users to browse the database by either chromosome number or lncRNA biotype. The database also features a genome browser, which can be used to browse through the
transcripts (36). RNA structures were computed using RNAfold with default parameters, which is part of the Vienna RNA package version 1.8.5. Our group has previously suggested the presence of G-quadruplex motifs in lncRNAs that could have potential regulatory functions (39). To enable researchers to further take up experiments in this area, predictions of potential G-quadruplex forming motifs across entire lncRNA transcripts, made using Quadfinder, have been included (37), and potential hairpin structures in the lncRNAs have been identified using HairpinFetcher.
lncRNA processing
A recent study conducted by our lab has pointed to a subset of lncRNAs, which could be potentially processed to small RNAs having downstream regulatory functions by having a dual transcriptional output (40). The same analysis was
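As an illustration of the kind of G-quadruplex motif scan described above, the following minimal Python sketch searches a transcript for the widely used canonical pattern: four runs of three or more guanines separated by loops of one to seven nucleotides. It is not the Quadfinder algorithm used to build lncRNome, whose parameters are not given here. The pattern, the function name find_g4_motifs and the example sequences are assumptions made purely for illustration.

```python
import re

# Canonical G-quadruplex pattern (an assumption for illustration, not
# Quadfinder's rule set): four G-runs of >= 3 guanines, separated by
# three loops of 1-7 nucleotides each.
G4_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end, motif) tuples for non-overlapping matches.

    Coordinates are 0-based and half-open. DNA input is accepted and
    converted to the RNA alphabet (T -> U) before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.end(), m.group(0)) for m in G4_PATTERN.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical transcript fragment containing one candidate motif.
    for start, end, motif in find_g4_motifs("AAGGGAGGGUGGGCGGGAA"):
        print(start, end, motif)
```

Because the scan is a plain regular expression, it reports only non-overlapping candidates on the given strand; a production pipeline would also need to decide how to treat overlapping motifs and ambiguous bases.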
distribution of epigenetic marks like DNA methylation and histone modifications across transcription start site (TSS) of lncRNAs might help in evaluating the effect of chromatin modifications on gene expression (Supplementary Figure S1).
Because the field is emerging and many more lncRNAs are being discovered and annotated, thanks to the
Supplementary Data
Supplementary data are available at Database Online.
8. Mercer,T.R., Dinger,M.E. and Mattick,J.S. (2009) Long non-coding RNAs: insights into functions. Nat. Rev. Genet., 10, 155–159.
9. Guttman,M., Amit,I., Garber,M. et al. (2009) Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature, 458, 223–227.
10. Pauli,A., Valen,E., Lin,M.F. et al. (2012) Systematic identification of long noncoding RNAs expressed during zebrafish embryogenesis. Genome Res., 22, 577–591.
11. Ulitsky,I., Shkumatava,A., Jan,C.H. et al. (2011) Conserved function of lincRNAs in vertebrate embryonic development despite rapid sequence evolution. Cell, 147, 1537–1550.
12. Brockdorff,N. (2011) Chromosome silencing mechanisms in X-chromosome inactivation: unknown unknowns. Development, 138, 5057–5065.
13. Gutschner,T. and Diederichs,S. (2012) The Hallmarks of Cancer: a long non-coding RNA point of view. RNA. Biol., 9, 703–719.
14. Derrien,T., Guigo,R. and Johnson,R. (2011) The long non-coding RNAs: a new (P)layer in the ‘Dark Matter’. Front Genet., 2, 107.
15. Wilusz,J.E., Sunwoo,H. and Spector,D.L. (2009) Long noncoding RNAs: functional surprises from the RNA world. Genes Dev., 23, 1494–1504.
16. Wang,K.C. and Chang,H.Y. (2011) Molecular mechanisms of long noncoding RNAs. Mol. Cell, 43, 904–914.
17. Yan,B. and Wang,Z. (2012) Long Noncoding RNA: its physiological and pathological roles. DNA Cell Biol., 31 (Suppl. 1), S34–S41.
18. Moran,V.A., Perera,R.J. and Khalil,A.M. (2012) Emerging functional and mechanistic paradigms of mammalian long non-coding RNAs.
29. Kin,T., Yamada,K., Terai,G. et al. (2007) fRNAdb: a platform for mining/annotating functional RNA candidates from non-coding RNA sequences. Nucleic Acids Res., 35(Database issue), D145–D148.
30. Wang,X. (2008) miRDB: a microRNA target prediction and functional annotation database with a wiki interface. RNA, 14, 1012–1017.
31. Ellis,J.C., Brown,D.D. and Brown,J.W. (2010) The small nucleolar ribonucleoprotein (snoRNP) database. RNA, 16, 664–666.
32. Amaral,P.P., Clark,M.B., Gascoigne,D.K. et al. (2011) lncRNAdb: a reference database for long noncoding RNAs. Nucleic Acids Res., 39(Database issue), D146–D151.
33. Bu,D., Yu,K., Sun,S. et al. (2012) NONCODE v3.0: integrative annotation of long noncoding RNAs. Nucleic Acids Res., 40(Database issue), D210–D215.
34. Harrow,J., Denoeud,F., Frankish,A. et al. (2006) GENCODE: producing a reference annotation for ENCODE. Genome Biol., 7 (Suppl. 1), S4.1–S4.9.
35. Povey,S., Lovering,R., Bruford,E. et al. (2001) The HUGO gene nomenclature committee (HGNC). Hum. Genet., 109, 678–680.