Understanding the Functions of Long Non-Coding RNAs Through Their Higher Order Structures
Molecular Sciences
Review
Rui Li, Hongliang Zhu * and Yunbo Luo
Department of Food Biotechnology, College of Food Science and Nutritional Engineering,
China Agricultural University, Beijing 100083, China; [email protected] (R.L.); [email protected] (Y.L.)
* Correspondence: [email protected]; Tel./Fax: +86-10-6273-7571
Abstract: Although thousands of long non-coding RNAs (lncRNAs) have been discovered in
eukaryotes, very few molecular mechanisms have been characterized due to an insufficient
understanding of lncRNA structure. Therefore, investigations of lncRNA structure and subsequent
elucidation of the regulatory mechanisms are urgently needed. However, because lncRNAs are
high-molecular-weight molecules that are difficult to crystallize, obtaining information
about their structure is extremely challenging, and the structures of only a few lncRNAs have
been determined so far. Here, we review the structure–function relationships of the widely
studied lncRNAs found in the animal and plant kingdoms, focusing on the principles and
applications of both in vitro and in vivo technologies for the study of RNA structures, including
dimethyl sulfate-sequencing (DMS-seq), selective 2′-hydroxyl acylation analyzed by primer
extension-sequencing (SHAPE-seq), parallel analysis of RNA structure (PARS), and fragmentation
sequencing (FragSeq). The aim of this review is to provide a better understanding of lncRNA
biological functions by studying them at the structural level.
1. Introduction
Two types of RNA molecules exist [1]: messenger RNA (mRNA) molecules, which possess the
ability to encode the amino acid sequence of proteins, and non-coding RNAs (ncRNAs), which lack or
have very little protein-coding potential [2]. mRNAs, an essential component of the central dogma of
molecular biology, are known for their crucial roles as intermediaries conveying genetic information
from DNA to the ribosomes and mediating protein synthesis [3]. With the rapid development and
application of high-throughput deep sequencing, it was shown that although ~90% of the eukaryotic
genome is transcribed, mRNAs account for only 1%–2% of total RNAs, suggesting that a large number
of RNA molecules are ncRNAs [4]. NcRNAs can be further classified as “housekeeping” ncRNAs
and “regulatory” ncRNAs, based on their functions [5]. The former includes ribosomal RNA (rRNA),
transfer RNA (tRNA), small nuclear RNA (snRNA), and small nucleolar RNA (snoRNA), while the
latter usually refers to small ncRNA (sncRNA) and long non-coding RNA (lncRNA) [5]. SncRNAs
have been the focus of molecular biology research over the last decade, and it was demonstrated that
they are involved in the regulation of their target genes at both transcriptional and post-transcriptional
levels [6]. However, lncRNA investigations have begun only recently.
It is generally believed that lncRNAs, RNA molecules longer than 200 nucleotides, belong
to a group of RNAs with broad biogenesis, and that these molecules are always capped and
polyadenylated [7]. Initially, lncRNAs were considered “transcriptional noise” without any biological
function. However, thousands of reports in recent years have demonstrated that lncRNAs, which
interact with DNA, RNA molecules, and transcription factors, participate in various biological
Figure 1. Differences in the structure and sequence between mRNA and lncRNA. The mRNA primary
coding sequence (CDS) plays a significant role in the translation, while lncRNAs regulate target gene
expression through the interactions between their higher-order structures and major partner proteins.
Initially, methods such as nuclear magnetic resonance (NMR) and X-ray crystallography were used
for the investigations of RNA structures [15]. However, since RNA molecules have high degeneration
rates and are difficult to crystallize, these methods cannot accurately identify RNA functional regions.
Currently, researchers mainly use chemical and enzymatic strategies to study highly conserved
structures of lncRNAs [16]. The rapid development of lncRNA structure probing methods helps
researchers gain a deeper understanding of lncRNA structure–function relationships. In this short
review, we will focus on the relationships between lncRNA structures and their functions. Furthermore,
some tools widely used for the investigations of highly ordered RNA structures will be systematically
discussed as well, and the indications for future development will be given.
2. lncRNA Structure and Biological Function Relationships
lncRNAs, which are frequently involved in transcriptional, post-transcriptional, and epigenetic
processes, are currently the focus of genetic research [8,17]. Previous studies have shown that the
secondary and tertiary structures of lncRNAs are highly conserved and that these highly conserved
structures are strongly related to lncRNA biological functions [11,18]. Although thousands of
lncRNAs have been discovered in recent years, many of their functional sites remain unknown [19].
In the following sections, we will discuss the structure–function relationships of lncRNAs found in
animals and plants that have been extensively studied.
2.1. lncRNAs in Animals
2.1.1. Xist: Repetitive Elements Involved in Protein Complex Recruitment
During the early stages of embryonic development, genes on one of the two X chromosomes in female
mammals are inactivated in order to match the expression levels of X-chromosomal genes in
male mammals [20,21]. This widespread phenomenon is called X-chromosome inactivation (XCI),
and the regulatory genes involved in XCI are located at the X-inactivation center [22]. Among these
genes, the Xist (X-inactive specific transcript) gene plays an essential role in XCI. The lncRNA Xist,
17 kb in length, is the transcript of the Xist gene. It initiates XCI by coating, in cis, the X chromosome
that becomes the inactive X (Xi), and by recruiting modifying complexes, such as polycomb repressive
complex 2 (PRC2), to specific sites on Xi, resulting in histone H3 lysine 27 trimethylation (H3K27me3) and
X-linked gene silencing [21,23]. Another lncRNA involved in this process, termed Tsix, is an
antisense transcript of Xist, which has the opposite effect and can prevent Xist from coating the
X chromosome [24,25]. Maenner et al. found a repeated element in Xist, which contains eight repeats,
termed the A-repeat; this region represents the most conserved Xist region [26]. Its 2D structure shows
two long stem-loop structures in the A-repeat, and each stem-loop contains four repeats, which were
shown to be associated with PRC2 recruitment [26]. Individual segments of the A-repeat were shown to
assist with the recruitment of particular PRC2 components, but binding to the entire complex was
most efficient when the entire A-repeat was involved, suggesting that the A-repeat plays a significant
role in XCI by regulating the rate of PRC2 recruitment [26]. Additionally, a novel, highly stable
tetraloop motif, the AUCG loop, was found in the 5’ region of the human A-repeat; the integrity of this
structure was closely related to Xist silencing [27]. It was reported that the 3’ region of the A-repeat
plays a significant role in intermolecular duplex formation and that any mutations that disrupt the
structure of this region, as observed in vitro, can compromise the biological functions of the
A-repeat in vivo [27].
In addition to the A-repeat, a C-repeat, which binds the YY1 transcription factor and contains four
recurring hairpins, was found to be involved in the localization and tethering of the Xist–PRC2
complex to specific sites on the X chromosome, inducing X-linked gene silencing (Figure 2) [28].
Although C-repeat structure probing showed only a moderate rate of conservation between different
species, a 441-nucleotide subdomain containing 55 nucleotides downstream of the last C-repeat is
highly structured and conserved in many species [29]. The disruption of this subdomain leads to Xist
dissociation from Xi, indicating the importance of this conserved structure for Xist functions [29].
Figure 2. Xist repetitive element functions during X-chromosome inactivation. The A-repeat, which
contains two long stem-loop structures, is involved in PRC2 binding, while the C-repeat binds YY1,
assisting the Xist–PRC2 complex in targeting the specific sites on Xi, and inducing histone H3 lysine 27
trimethylation (H3K27me3) and X-linked gene silencing.
Int. J. Mol. Sci. 2016, 17, 702
Recently, Lv et al. confirmed the significance of Xist D-repeat in XCI using CRISPR/Cas9
(clustered regularly interspaced short palindromic repeats (CRISPR)-associated endonuclease 9) [30].
The D-repeat knockout directly led to a significant decrease of Xist levels, leading to the upregulation
of X-linked genes [30]. The abundance and wide distribution of repetitive elements in lncRNAs
suggest that they may play significant roles in exerting biological functions of lncRNA.
2.1.2. RoX: Tandem Stem-Loops Direct MSL Complex Assembly
Another widely discussed dosage compensation effect regulated by an lncRNA is
X-chromosome dosage compensation in Drosophila. Unlike the X-chromosome inactivation
discussed above, genes on the single X chromosome in Drosophila males must be
upregulated in order to match the expression levels of the genes on the two X chromosomes in
females [31]. Initial research revealed that this upregulation is mediated by the male-specific lethal
(MSL) complex, which includes two lncRNAs (roX1 and roX2) and five proteins (MSL1, MSL2,
MSL3, MOF (males absent on the first), and MLE (maleless)) [32,33]. This complex is able to bind to
the high-affinity sites (HAS) on the X chromosome and direct histone H4 lysine 16 acetylation
(H4K16ac), while the two lncRNAs involved in the formation of this complex, RNA on the X1 (roX1) and
RNA on the X2 (roX2), serve as scaffolds essential for X-chromosome targeting [33].
In order to unveil the specific mechanisms underlying MSL interactions with roX1 and roX2,
Ilik et al., who suggested that MLE (maleless) RNA helicase and MSL2 (male-specific lethal 2
homolog) ubiquitin ligase are required for the association of roX lncRNAs with the complex, showed
that the tandem stem-loop structures in roX1 (D1–D3) and roX2 exon 3 were involved in the
interactions with MLE and MSL2 [34]. The roX1 D3 region showed the highest MLE-binding capacity,
and the binding of MLE to different domains of roX2 showed different ATP requirements:
MLE binds to the first half of roX2 in an ATP-independent manner, while its binding to
the second half of this molecule is ATP-dependent [34]. Additionally, dosage compensation was lost
only when combinatorial mutations were introduced into the tandem stem-loops of roX2,
indicating the existence of structural redundancy in lncRNAs (Figure 3). These results show that the
functions of roX during the recruitment of MSL complex assemblies are determined by the specific
tandem stem-loop domains.
Figure 3. RoX2 tandem stem-loops are involved in MSL complex assembly. RoX2 tandem stem-loops
are highly conserved. MLE binding to the different parts of tandem stem-loops has different ATP
requirements. MLE binding to the first half of roX2 does not require ATP, while binding to the
second half is ATP-dependent. Only when combinatorial mutations occur in stem loops, roX2 is no
longer able to recruit MSL, which results in the loss of dosage compensation and male lethality.
2.1.3. minHOTAIR Binds PRC2, while D4 Domain Recruits the LSD1 Complex
HOTAIR (HOX antisense intergenic RNA), which contains 2158 nucleotides, is an antisense
transcript of HOXC [35]. It is a trans-acting factor that regulates HOXD gene expression by recruiting
PRC2 and lysine-specific demethylase 1 (LSD1) to the specific sites [36]. The PRC2 complex is
composed of three core protein subunits, EZH2, EED, and SUZ12, which are involved in the
regulation of H3K27me3, while LSD1 mediates the demethylation of histone H3 lysine 4, a residue
whose methylation is crucial for transcriptional activation [37]; its overexpression may lead to
tumorigenesis [38,39].
Sophisticated biological functions are often determined by highly conserved structures, and this
is the case with HOTAIR as well. More than 50% of HOTAIR nucleotides are base-paired, and this
highly structured lncRNA contains 56 helical segments, 38 terminal loops, 34 internal loops, and
19 junction regions [40]. Previous studies showed that the 300-mer domain at the 5’ terminus of
HOTAIR is involved in PRC2 binding [37]. However, a much shorter section was determined by
Wu et al. to contain the minimal binding motif of HOTAIR (minHOTAIR), and its 2D structure was
established by nuclease digestion experiments [41]. This 89-mer domain at the 5’ end of HOTAIR
includes two duplex regions connected by a 10-nucleotide single-stranded (ss)
RNA linker. The disruption of this highly conserved structure affects PRC2 binding to HOTAIR,
which demonstrates a close relationship between lncRNA biological functions and structural
conservation [41].
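The structural statistics quoted above (the fraction of base-paired nucleotides, the number of terminal loops, and so on) are summaries that can be computed from any secondary-structure model. The following is a minimal Python sketch, assuming the model is available in pseudoknot-free dot-bracket notation. The toy string and the function names are illustrative assumptions and are not taken from the HOTAIR model in [40].

```python
def base_pairs(dot_bracket):
    """Return sorted (i, j) index pairs for a pseudoknot-free dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '('")
    return sorted(pairs)


def paired_fraction(dot_bracket):
    """Fraction of nucleotides that take part in a base pair."""
    if not dot_bracket:
        raise ValueError("empty structure")
    return 2 * len(base_pairs(dot_bracket)) / len(dot_bracket)


def count_terminal_loops(dot_bracket):
    """Count hairpin (terminal) loops: pairs that enclose only unpaired nucleotides."""
    return sum(
        1
        for i, j in base_pairs(dot_bracket)
        if j - i > 1 and set(dot_bracket[i + 1:j]) == {"."}
    )


# Hypothetical toy structure: two hairpins separated by a short linker.
toy = "((((...))))..((...))."
print("paired fraction:", round(paired_fraction(toy), 3))
print("terminal loops:", count_terminal_loops(toy))
```

Counting helical segments, internal loops, and junctions would need a fuller loop decomposition; this sketch covers only the two simplest statistics.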
In contrast, the LSD1 complex is recruited using the motif on the 3’ end of HOTAIR [37].
This motif is a 646-mer domain very different from the PRC2 recruitment domain, and nucleotides
between the positions 1500 and 2148 contribute to the formation of this functional domain [37].
Somarowthu et al. determined that the nucleotide sequence involved in the LSD1 complex binding
motif is very similar to the sequence of a conserved domain, D4, which contains 20 helices,
13 terminal loops, and seven junctions (Figure 4) [42]. Their findings show that the functions of
HOTAIR in the recruitment of different histone modification complexes are achieved mainly by the
intricate and modular nature of its secondary structures.
Figure 4. minHOTAIR and D4 regions of HOTAIR recruit PRC2 and LSD1, respectively, in order to
regulate HOXD expression.
2.1.4. MALAT1: Triple Helix Structure Explains the High Stability of Long
Nuclear-Retained Transcripts
MALAT1 (metastasis associated lung adenocarcinoma transcript 1), also called NEAT2 (nuclear
enriched abundant transcript 2), is a type of long nuclear-retained transcript that was shown to be
associated with cancer cell metastases. It is widely expressed in both human and mouse tissues, and
it is overexpressed in many human carcinomas [43,44]. Aberrant expression of MALAT1 is associated
with decreased patient survival [45]. This lncRNA is able to regulate alternative splicing by modulating
the cellular levels of serine/arginine (SR) factors [46].
Unlike the 3’ or 5’ ends of other RNAs that are produced by canonical cleavage, RNase P is
responsible for the generation of the 3’ end of MALAT1 and the 5’ end of a tRNA-like cytoplasmic
RNA designated as MALAT1-associated small cytoplasmic RNA (mascRNA) [47]. Wilusz et al.
investigated the structure of MALAT1 and other nuclear-retained transcripts, and they suggested
that the short poly(A)-rich tract at the 3’ ends of these transcripts may exist in all long
nuclear-retained transcripts [48]. Considering that the poly(A) tail of mRNA increases its stability,
and the long half-life of MALAT1, it has been suggested that the short poly(A) tail-like moieties may
correlate with the stability of MALAT1 and its resistance to exonucleases [49]. A recently published
study performed by this group showed that the highly conserved poly(A)-rich tract and its neighboring
U-rich motifs act together in order to protect the 3’ end of MALAT1 from the activity of exonucleases
through base pairing [48]. However, it was found that base pairing between U-rich motif 2 and the
poly(A)-rich tract only partially contributes to MALAT1 stability. Further analysis revealed that a
triple helix U•A-U (where • and - represent Hoogsteen and Watson-Crick faces, respectively),
formed by U-rich motif 1 interacting with the A-U duplex through Hoogsteen hydrogen bonding, is
involved in the maintenance of the transcript stability (Figure 5) [49,50]. A similar triple helix
structure has been found in multiple endocrine neoplasia-β (MENβ) RNA, which is another lncRNA
with nuclear localization and a long half-life [50]. Therefore, it appears that the formation of
triple helices at 3’ ends is a common way for long nuclear-retained transcripts to avoid exonuclease
degradation, which enhances their biological functions.
Figure 5. Triple helix structure of MALAT1 explains its high stability. RNase P is involved in the
generation of the 3’ end of MALAT1 and the 5’ end of tRNA-like cytoplasmic RNA designated as
mascRNA. A triple helix (U•A-U) formed by the conserved poly(A)- and its flanking U-rich motifs
prevents the degradation of MALAT1 by exonucleases.
2.1.5. Gas5 Acts as a Decoy for the Glucocorticoid Receptor through Structure Transformation
Growth arrest-specific transcript 5 (Gas5) was shown to be downregulated in many cancer
tissues, and therefore it has long been considered a cancer-related lncRNA [51]. Recently, Kino et al.
showed that it also acts as a decoy for glucocorticoid receptor (GR), regulating target gene
expression [52]. When Gas5 is not present in the glucocorticoid signaling pathway, glucocorticoid
(GC) first binds to GR in cytoplasm, forming a GC–GR complex, which is transported into the
nucleus, where it binds glucocorticoid response elements (GREs) via its DNA binding domains,
leading to the activation of gene expression [11]. Gas5 is able to mimic GREs through changes in its
secondary structure and competitively binds to GR, effectively blocking glucocorticoid signal
transduction by removing GR molecules from the signaling pathway (Figure 6) [52]. By comparing
human and mouse Gas5 structures, researchers found that even though the nucleotide sequences of
Gas5 transcripts are not highly homologous, the functional motif able to bind GR is conserved across
species [52]. Therefore, it was suggested that the mechanism of Gas5 interactions with the
transcription factor through a structural transformation may exist in other lncRNAs with similar
domains, but this requires further validation [18,53].
Figure 6. The role of Gas5 secondary structure transformation in glucocorticoid signal transduction.
Gas5 serves as a decoy for GR and removes it from the signaling pathway by changing its secondary
structure. POL: Polymerase.
Figure 7. IPS1 functions as an endogenous target mimic through a 23-nucleotide (nt)-long conserved
motif. The conserved 23-nt-long motif of IPS1, which shows imperfect complementarity with
miR399, ensures binding with miR399. This leads to an increased expression of miR399 target genes
and changes in phosphate content, since miR399 can no longer affect its targets.
2.2.2. Functional Domains of COOLAIR and COLDAIR Are Involved in the Repression of Flowering
Locus C (FLC)
Flowering transition is a crucial step for plant reproductive development, and FLC has long
been known as a regulator of flowering in plants [61]. Recent studies showed that two
vernalization-induced lncRNAs, COOLAIR (Cold Induced Long Antisense Intergenic noncoding
RNA) and COLDAIR (Cold Assisted Intronic noncoding RNA), could regulate A. thaliana flowering
time through FLC repression [62]. COOLAIR, transcribed from the 3’ end of FLC, represents a group
of long non-coding antisense RNAs [62,63]. Even though it is not indispensable for the direct
epigenetic silencing of FLC, it significantly promotes FLC transcriptional repression [64]. Recently,
COOLAIR transcription was found to be correlated with the R-loop structure, formed by an
RNA–DNA hybrid, together with a displaced ssDNA strand [65]. R-loops were initially considered
transcriptional byproducts without any biological functions. However, Sun et al. showed that the
R-loop, covering the COOLAIR promoter, is able to promote FLC expression by repressing
COOLAIR transcription (Figure 8) [65]. The R-loop structure has been shown to have multiple roles,
and these structures may play crucial roles in the regulation of gene expression in many organisms.
COLDAIR, originating from the first intron of FLC, has the characteristics of transcripts
transcribed by Pol IV and Pol V, including 5’ capped structure, but no poly(A) tail [66]. The
knockdown of COLDAIR by RNA interference (RNAi) compromises the vernalization response,
indicating its role in FLC epigenetic silencing [5]. It acts in the same way as Xist and HOTAIR, which
serve as scaffolds for the recruitment of PRC2 complexes to specific loci and induce epigenetic
silencing [5]. This indicates that the epigenetic silencing mediated by PRC2 recruitment through
lncRNAs is an evolutionarily conserved mechanism in both animals and plants [67]. Recent studies
show that the double stem-and-loop structures formed by fewer than 100 nts in lncRNAs are
involved in PRC2 recruitment in vitro, demonstrating the significance of lncRNA structures for the
determination of their functional roles [68].
Figure 8. R-loop structures covering the COOLAIR promoter repress COOLAIR transcription. FLC:
Flowering Locus C.
Int. J. Mol. Sci. 2016, 17, 702 9 of 21
2.2.3. LDMAR: lncRNA Structural Integrity Is Required in Order to Exert Biological Functions
Photoperiod is known to be very important in the regulation of plant growth and development.
Recently, Ding et al. found that a 1236-nt-long lncRNA, termed long-day-specific male-fertility-associated
RNA (LDMAR), plays a significant role in the regulation of photoperiod-sensitive male sterility (PSMS)
in rice Nongken 58S (NK 58S), a spontaneous mutant of Nongken 58N (NK 58N) [69]. Under long-day
conditions, the reproductive development of both NK 58S and NK 58N requires a high expression of
LDMAR. Several studies showed that the methylation level of LDMAR promoter regions in NK 58S
was considerably higher than the level in NK 58N, leading to a much lower LDMAR expression in NK
58S, and finally resulting in PSMS [69]. Further analyses showed that this phenomenon was directly
caused by LDMAR structural changes. Compared with the structure of LDMAR in NK 58N, the
secondary structure of LDMAR in NK 58S was altered by spontaneous mutations, generating several
small RNAs, which are involved in an RNA-dependent DNA methylation (RdDM) pathway, thereby
increasing the methylation in the promoter region of LDMAR [70]. As a result, the transcription
level of LDMAR is reduced under long-day conditions, and PSMS appears because of the
decrease in LDMAR levels [71]. Although the specific structure associated with LDMAR expression
and the underlying biochemical mechanisms remain unknown, LDMAR functional studies showed
that structural integrity is crucial for lncRNA biological function.
2.2.4. ENOD40 Highly Structured Motif Is Involved in MtRBP1 Binding and Trafficking
The ENOD40 (early nodulin 40) gene was initially found to play a significant role in the root
nodule organogenesis of leguminous plants [72,73]. It was also suggested that ENOD40 participates
in other non-symbiotic plant developmental processes, including the differentiation of vascular
bundles [73]. The abundance and degree of conservation of ENOD40 in plants suggest that this
gene may have conserved biological functions. Its transcript, ENOD40 RNA, which contains a short
open reading frame mRNA (sORF-mRNA), was shown to have a bi-functional role in the process of
nodule organogenesis [72,74]. Rohrig et al. found that the conserved domains at the 5’ end of ENOD40
in soybeans encode two peptides of 12 and 24 amino acids in vitro [75]. Both of these peptides are able
to affect sucrose synthase activity by binding to a component of sucrose synthase named nodulin 100,
following its translation [75].
Comparing ENOD40 structure in different leguminous species, Girard et al. showed that five
domains in ENOD40 were highly conserved, and that uridine residues were numerous in most of
these conserved terminals and loops [76]. However, ENOD40 is not restricted to symbiotic plant
development [73], and new studies have shown that it can function as a guide, directing the relocation
of NSR (nuclear speckle RNA-binding proteins). A novel NSR, MtRBP1 (Medicago truncatula RNA
Binding Protein 1), can be transported by ENOD40 into cytoplasmic granules during nodulation.
Mutations that impair the translation of the two peptides do not influence the trafficking activity of
ENOD40, suggesting that ENOD40 has different functional roles, supported by different motifs [77].
Though ENOD40 functions as both a protein-coding and non-coding gene, the highly conserved RNA
structures imply that ENOD40 belongs to the group of lncRNAs [72]. Furthermore, it has recently
been reported that some ncRNAs have the potential to encode small peptides as well, indicating
that ENOD40 should be categorized as an lncRNA [78]. Later, in A. thaliana, Bardou et al. found a
similar lncRNA, ASCO-RNA (Alternative Splicing Competitor RNA), previously named lnc351, that
could modulate alternative splicing through binding with NSR in vivo [79]. Although structures of
ASCO for NSR binding have not been revealed yet, we could infer that ASCO might also be highly
structured. ENOD40 studies show that highly structured lncRNAs can simultaneously determine
multiple biological functions.
3. Technologies Used in the Structural Studies of RNAs
There is no doubt that tools used for the investigation of RNA structures significantly contribute
to a rapid increase in our understanding of RNA function. Currently, the technologies for the structural
characterization of RNAs encompass in vitro and in vivo methods [16]. In vitro methods mainly use
different RNases to digest the RNA molecules of interest, while chemical reagents with cell penetration
abilities are often applied for in vivo RNA structure probing [80]. In the following sections, we will
discuss the basic principles and applications of these technologies that could potentially be applied to
investigate lncRNA structures, together with the description of several lncRNA purification methods
for motif determination.
lncRNA structural or biochemical studies often require pure and homogeneous samples [81].
Therefore, lncRNA purification methods, which directly determine the quality of downstream
analysis, are important for structure probing [81]. Initially, RNA purification protocols used
denaturing polyacrylamide gel electrophoresis to isolate the target RNA in vitro. However, the
application of these methods is limited, since denatured RNAs are often misfolded. Additionally,
lncRNAs, unlike mRNAs, show little structural constraint and often form alternative conformations
in vivo, making them even harder to analyze [82,83]. Therefore, several different approaches that
avoid RNA denaturation have been developed to overcome these issues in recent years. Most of
those approaches utilize an affinity tag to immobilize the target RNAs and a ribozyme to elute them
specifically [82]. Although this has been successfully applied for the investigation of guanine
riboswitch structure, the idiosyncrasy of these methods hinders their further application. Batey and
Kieft increased the applicability and reliability of this method through the introduction of MS2 coat
protein for the immobilization and glmS ribozyme for the target RNA elution [82]. Subsequently,
Chillón et al. introduced a more convenient and robust approach for lncRNA purification. Compared
with the previously described approaches, this method, which does not involve RNA denaturation
and affinity tag design, not only preserves lncRNA functional elements but also simplifies cloning
design [84]. This newly published lncRNA protocol includes the following steps [84]: T7 RNA
polymerase system is used for RNA synthesis, followed by the addition of DNase enzyme, for the
digestion of DNA template, and by the addition of proteinase K, which is responsible for the
proteolysis of enzymes. The desired RNA is obtained by ultrafiltration and purified using
size-exclusion chromatography (Figure 9).
Figure 9. Enzymatic synthesis and purification of lncRNA. T7 RNA polymerase system is used for
RNA synthesis, followed by the addition of DNase enzyme for the digestion of DNA template, and by
the addition of proteinase K, which is responsible for the proteolysis of enzymes. The desired RNA
is obtained by ultrafiltration and purified using size-exclusion chromatography. FPLC: Fast Protein
Liquid Chromatography.
functional domains of lncRNAs, Ilik et al. demonstrated the accuracy of this method by comparing the
datasets to the results obtained by SHAPE, and their results were concordant [34]. PARS is the first
high-throughput approach for the genome-wide elucidation of RNA structural properties [94], and it
will undoubtedly play a significant role in further structural analyses of lncRNA.
Another nuclease-based approach is fragmentation sequencing (FragSeq), which utilizes
P1 endonuclease to digest single-stranded RNA, followed by high-throughput sequencing and
bioinformatic analyses of the generated fragments (Table 1) [95]. Although only single-stranded RNA
regions can be directly identified using this approach, its biggest advantage lies in its endogenous
controls, which make it possible to recognize 5’ phosphate and 5’ hydroxyl ends that were not generated
by nuclease digestion, significantly increasing the accuracy of this method [94]. The feasibility and
reproducibility of this method have been validated by its application to the entire mouse nuclear
transcriptome, leading to the discovery of novel conserved structures of ncRNAs [95].
random hexamer (N6) reverse transcription for the first strand cDNA synthesis and the addition of
a part of NGS adapter on one side. Additionally, cDNA ligation differs between all three methods.
Structure-seq uses linear DNA ligation, while intramolecular circular DNA ligation is used in
DMS-seq and Mod-seq. Furthermore, DMS-seq and Structure-seq are used for the investigations of
polyadenylated transcripts, while Mod-seq can be used to study total RNA [16].
These techniques have been used to determine the secondary structures of coding and non-coding
RNAs. Rouskin et al. used DMS-seq to probe mRNA structures in yeast and mammalian cells, showing
an excellent agreement with the previously determined mRNA structures [104]. Ding et al. investigated
RNA structures of A. thaliana in vivo by Structure-seq, and found a three-nucleotide periodic repeat
pattern in the coding regions, which was closely associated with translational efficiency [105]. The
structural information of four rRNAs and 32 additional RNAs in yeast was determined by Mod-seq.
Furthermore, Mod-seq has been proven to be a robust method for the investigations of the structures
of long RNAs and complex RNA mixtures, because of its correct detection of structural changes in
5.8S and 25S rRNAs in the ribosomal protein L26 deletion mutant [102]. Although these methods have
been widely used in RNA structural studies, several disadvantages remain. For example, DMS reagent
has a limited shelf life, and the use of a reagent that is not fresh can lead to poor target modification
and high error rates [100]. The selection of primers should be carefully considered, because the use
of primers with poor specificity and labeling efficiency can result in multiple unwanted disruptions
of the process [100]. Furthermore, the ability of DMS to differentiate between dsRNA and ssRNA
is hindered when ssRNA interacts with RNA binding protein (RBP) in vivo [106]. Therefore, a more
suitable chemical reagent needs to be developed in order for these issues to be resolved.
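As a rough illustration of how a DMS-based read-out can be turned into a per-nucleotide signal, the sketch below compares reverse-transcription stop counts from a treated and an untreated sample, reports only A and C positions (the two bases that DMS modifies), and scales the background-subtracted signal to the most reactive base. The function name `dms_signal`, the input layout, and the normalization are assumptions made for illustration; they are not the published pipelines of DMS-seq, Structure-seq, or Mod-seq.

```python
def dms_signal(seq, treated_stops, control_stops):
    """Return a per-base DMS signal in [0, 1], or None at G/U positions.

    seq            -- RNA sequence (A/C/G/U)
    treated_stops  -- RT-stop counts per position, DMS-treated sample
    control_stops  -- RT-stop counts per position, untreated sample
    """
    if not (len(seq) == len(treated_stops) == len(control_stops)):
        raise ValueError("sequence and count vectors must have equal length")

    # Convert counts to frequencies so that libraries of different depth
    # can be compared; fall back to 1 to avoid division by zero.
    treated_total = sum(treated_stops) or 1
    control_total = sum(control_stops) or 1

    raw = []
    for base, t, c in zip(seq.upper(), treated_stops, control_stops):
        if base not in "AC":
            raw.append(None)  # DMS is uninformative at G and U
            continue
        # Background-subtracted stop frequency, floored at zero.
        raw.append(max(t / treated_total - c / control_total, 0.0))

    # Scale to the most reactive informative base (illustrative choice).
    peak = max((v for v in raw if v is not None), default=0.0)
    if peak == 0.0:
        return raw
    return [None if v is None else v / peak for v in raw]
```

In such a profile, values close to 1 would point to accessible (likely unpaired) A or C residues, while values close to 0 would be consistent with base pairing or protein protection, which is the ambiguity noted above for RBP-bound ssRNA.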
3.3.2. icSHAPE
A traditional SHAPE reagent can probe all four nucleotides, allowing highly accurate studies of RNA
structures [107]. However, the high background signal obtained by the traditional SHAPE
probing methods increases false positive rates. Additionally, RNA structural information obtained
in vitro greatly differs from its dynamic structure in vivo. In contrast to this, DMS allows for RNA
structure probing in vivo, but only two of the four nucleotides can be modified, which often leads to
incorrect results [107]. Because of this, a new method termed In vivo Click SHAPE (icSHAPE), using
an improved SHAPE reagent for genome-wide investigations of RNA structure, has been created
(Table 1) [108]. The existing SHAPE probe 2-methylnicotinic acid imidazolide (NAI) is changed into
NAI-N3 by adding an azide group, making in vivo RNA structure probing possible [108]. This
azide group enables the subsequent “click” attachment of a biotin moiety to the SHAPE reagent,
which allows NAI-N3-modified RNA to be purified on streptavidin beads; enriching the modified
RNAs greatly increases the signal-to-noise ratio of the sequencing results [108]. The
accuracy and reproducibility of icSHAPE have been validated by studying the known structures of
18S and 28S rRNAs in mouse embryonic stem cells (mESC) [107]. Furthermore, icSHAPE showed that
3’ UTR structures tend to be more single-stranded than CDS or 5’ UTR. ncRNAs, such as pseudogenes,
lncRNAs, and primary miRNA precursors, tend to be more folded in vivo, suggesting that mRNA and
ncRNA structures differ greatly in vivo [108].
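The region-level comparison described above (3’ UTR versus CDS versus 5’ UTR) can be sketched as a simple aggregation over a per-nucleotide reactivity profile. The sketch assumes scores scaled to 0-1, with higher values indicating more flexible, single-stranded nucleotides; the function name, the region coordinates, and the toy scores are hypothetical and are not data from the icSHAPE study.

```python
from statistics import mean


def mean_reactivity_by_region(scores, regions):
    """Average per-nucleotide reactivity within each annotated region.

    scores  -- list of reactivities (float), or None where no data exist
    regions -- dict mapping region name -> (start, end), 0-based, end-exclusive
    """
    summary = {}
    for name, (start, end) in regions.items():
        # Ignore positions without a score instead of treating them as zero.
        values = [s for s in scores[start:end] if s is not None]
        summary[name] = mean(values) if values else None
    return summary


# Hypothetical 12-nt transcript: 5' UTR (0-3), CDS (3-8), 3' UTR (8-12).
toy_scores = [0.2, 0.1, None, 0.3, 0.1, 0.2, 0.0, 0.4, 0.6, 0.8, 0.7, 0.9]
toy_regions = {"5UTR": (0, 3), "CDS": (3, 8), "3UTR": (8, 12)}
toy_summary = mean_reactivity_by_region(toy_scores, toy_regions)
```

With these toy values the 3’ UTR has the highest mean score, which is the kind of pattern that would be read as a more single-stranded region; a real analysis would additionally need coverage filters and a statistical test across many transcripts.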
in agreement with the previous ones [109]. Additionally, multiple base paired regions between U3
snoRNA and pre-rRNA strongly facilitate pre-rRNA folding and its subsequent processing, suggesting
the significant contribution of intermolecular interactions to the maintenance of RNA secondary
structure [110]. CLASH was applied for the mapping of the human miRNA interactome, and Helwak et al.
found that the majority of miRNAs interact with mRNAs through the 5’ seed region [109]. Furthermore,
nearly 60% of miRNA-mRNA interactions are achieved by non-canonical base pairing, containing
bulges, loops, and hairpins, which may affect the response of RNA-induced silencing complex (RISC)
to miRNA-target binding [109].
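To make the distinction between canonical seed pairing and non-canonical pairing concrete, the sketch below tests whether a target site contains a perfect Watson-Crick match to the miRNA seed. The seed definition used here (miRNA nucleotides 2-8), the function names, and the example sequences are assumptions for illustration only; CLASH itself identifies interactions from ligated chimeric reads rather than from a sequence rule of this kind.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def has_canonical_seed_site(mirna, target, seed_start=2, seed_end=8):
    """True if `target` contains a perfect match to the miRNA seed.

    seed_start and seed_end are 1-based, inclusive miRNA positions;
    the default 2-8 window is an illustrative assumption.
    """
    seed = mirna.upper()[seed_start - 1:seed_end]
    return reverse_complement(seed) in target.upper()


# Example miRNA sequence and two hypothetical target sites.
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
canonical_site = "AAGCUACCUCAGG"      # contains CUACCUC, pairing with GAGGUAG
mismatched_site = "AAGCUACGUCAGG"     # one substitution inside the seed match
```

Here `has_canonical_seed_site(example_mirna, canonical_site)` is true and the call with `mismatched_site` is false. Interactions of the second kind, which lack a perfect seed match, fall into the non-canonical class that the text reports for nearly 60% of miRNA-mRNA interactions.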
Another probing method, with a similar approach to the previous one, is hiCLIP (RNA hybrid and
individual-nucleotide resolution UV cross-linking and immunoprecipitation) (Table 1) [111]. Compared
with CLASH, hiCLIP shows a greater control over the ligation of two RNA strands. Sugimoto et al.
applied hiCLIP in the studies of duplex structures bound by a dsRBP, termed Staufen 1 (STAU1),
which is involved in mRNA localization, stability, and translation. The results showed that almost
70% of duplexes can be found in the 3’ UTR, and duplexes in the CDS tend to have shorter loops than
those in the UTRs [111]. In addition, hiCLIP identified an 858-nt-long duplex region in the 3’ UTR of
XBP1, an mRNA negatively regulated by STAU1. This duplex was found to play a central role in the
regulation of XBP1 stability. A decrease in this stability was observed when the structure of the
duplex was disrupted by AA dinucleotide insertion, while its stability returned to the original
levels when a complementary TT dinucleotide was inserted, demonstrating a close structure–function
relationship [111]. Nevertheless, hiCLIP shows severe limitations in the probing of other RNA
secondary structures that are not involved in RBP interactions.
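The logic of the insertion experiment, in which a disruptive insertion in one arm of a duplex is rescued by a complementary insertion in the other arm, can be illustrated with a toy duplex. The arm sequences, the insertion positions, and the ungapped pairing model below are hypothetical simplifications and do not represent the XBP1 3’ UTR; the TT insertion described in the text corresponds to UU at the RNA level.

```python
# Watson-Crick pairs only; G-U wobble pairs are ignored in this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_fraction(arm5, arm3):
    """Fraction of positions that pair when arm5 (5'->3') is aligned,
    without gaps, against arm3 read 3'->5'."""
    longest = max(len(arm5), len(arm3))
    if longest == 0:
        return 0.0
    matches = sum((a, b) in PAIRS for a, b in zip(arm5, arm3[::-1]))
    return matches / longest


def insert(seq, pos, bases):
    """Insert `bases` into `seq` before 0-based index `pos`."""
    return seq[:pos] + bases + seq[pos:]


# Hypothetical 8-bp duplex: arm3 is the reverse complement of arm5.
wild_arm5, wild_arm3 = "GGACUCAG", "CUGAGUCC"
mutant_arm5 = insert(wild_arm5, 4, "AA")      # disruptive insertion
rescued_arm3 = insert(wild_arm3, 4, "UU")     # compensatory insertion

wild_type = paired_fraction(wild_arm5, wild_arm3)
disrupted = paired_fraction(mutant_arm5, wild_arm3)
rescued = paired_fraction(mutant_arm5, rescued_arm3)
```

In this toy model the wild-type and rescued duplexes are fully paired, whereas the AA insertion alone shifts the register and leaves fewer than half of the positions paired. Restoring function by a compensatory change of this kind is what supports a structural, rather than sequence-specific, role for the duplex.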
Acknowledgments: We wish to thank Yong-Fang Yang and Tian Wang for stimulating discussions and critical
review of the manuscript. This work was supported by grants from the National Natural Science Foundation of
China (91540118 and 31471921) and the Chinese Universities Scientific Fund (2016QC037) and Great Northern
Agriculture Education Fund (1061-2415003) to Hongliang Zhu.
Author Contributions: Rui Li and Hongliang Zhu planned the manuscript outline. Rui Li wrote the draft and
generated the figures; Hongliang Zhu and Yunbo Luo revised and did proofreading. All authors read and
approved the final manuscript.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
ncRNA: non-coding RNA; snRNA: small nuclear RNA; snoRNA: small nucleolar RNA; SRA: steroid
receptor RNA activator; NMR: nuclear magnetic resonance; XCI: X-chromosome inactivation; PRC2:
polycomb repressive complex 2; H3K27me3: histone H3 lysine 27 trimethylation; CRISPR/Cas9: Clustered
regularly interspaced short palindromic repeats (CRISPR)-associated endonuclease 9; roX1: RNA on the
X 1; roX2: RNA on the X 2; HAS: high-affinity sites; H4K16ac: histone H4 lysine 16 acetylation; MSL:
male-specific lethal; MLE: maleless; MSL2: male-specific lethal 2 homolog; LSD1: lysine-specific
demethylase 1; HOTAIR: HOX
antisense intergenic RNA; minHOTAIR: minimal binding motif of HOTAIR; MALAT1: Metastasis associated
lung adenocarcinoma transcript 1; mascRNA: MALAT1-associated small cytoplasmic RNA; NEAT2: nuclear
enriched abundant transcript 2; Gas5: growth arrest-specific 5; GR: glucocorticoid receptor; GC: glucocorticoid;
IPS1: Induced by Phosphate Starvation 1; eTM: endogenous target mimicry mechanism; FLC: Flowering locus
C; COOLAIR: Cold Induced Long Antisense Intergenic noncoding RNA; COLDAIR: Cold Assisted Intronic
noncoding RNA; LDMAR: long-day-specific male-fertility-associated lincRNA; PSMS: photoperiod-sensitive
male sterility; NK 58S: Nongken 58S; NK 58N: Nongken 58N; RdDM: RNA-dependent DNA methylation;
RBP: RNA binding proteins; MtRBP1: Medicago truncatula RNA Binding Protein 1; DMS: dimethyl sulfate;
SHAPE: selective 2’-hydroxyl acylation analyzed by primer extension; CMCT: 1-cyclohexyl-(2-morpholinoethyl)
carbodiimide metho-p-toluene sulfonate; RNase: ribonuclease; PARS: Parallel analysis of RNA structure; FragSeq:
fragmentation sequencing; icSHAPE: In Vivo Click SHAPE; CLASH: Crosslinking Ligation and Sequencing
Hybrids; hiCLIP: RNA hybrid and individual-nucleotide resolution UV cross-linking and immunoprecipitation;
RPL: RNA Proximity Ligation.
References
1. Djebali, S.; Davis, C.A.; Merkel, A.; Dobin, A.; Lassmann, T.; Mortazavi, A.; Tanzer, A.; Lagarde, J.; Lin, W.;
Schlesinger, F.; et al. Landscape of transcription in human cells. Nature 2012, 489, 101–108. [CrossRef]
[PubMed]
2. Quinn, J.J.; Chang, H.Y. Unique features of long non-coding RNA biogenesis and function. Nat. Rev. Genet.
2016, 17, 47–62. [CrossRef] [PubMed]
3. Shabalina, S.A.; Ogurtsov, A.Y.; Spiridonov, N.A. A periodic pattern of mRNA secondary structure created
by the genetic code. Nucleic Acids Res. 2006, 34, 2428–2437. [CrossRef] [PubMed]
4. Ponting, C.P.; Belgard, T.G. Transcribed dark matter: Meaning or myth? Hum. Mol. Genet. 2010, 19,
R162–R168. [CrossRef] [PubMed]
5. Kim, E.D.; Sung, S. Long noncoding RNA: Unveiling hidden layer of gene regulatory networks.
Trends Plant Sci. 2012, 17, 16–21. [CrossRef] [PubMed]
6. Simon, S.A.; Meyers, B.C. Small RNA-mediated epigenetic modifications in plants. Curr. Opin. Plant Biol.
2011, 14, 148–155. [CrossRef] [PubMed]
7. Yang, Y.; Wen, L.; Zhu, H. Unveiling the hidden function of long non-coding RNA by identifying its major
partner-protein. Cell Biosci. 2015, 5. [CrossRef] [PubMed]
8. Chen, L.L.; Carmichael, G.G. Decoding the function of nuclear long non-coding RNAs. Curr. Opin. Cell Biol.
2010, 22, 357–364. [CrossRef] [PubMed]
9. Wang, K.C.; Chang, H.Y. Molecular mechanisms of long noncoding RNAs. Mol. Cell 2011, 43, 904–914.
[CrossRef] [PubMed]
10. Mercer, T.R.; Mattick, J.S. Structure and function of long noncoding RNAs in epigenetic regulation.
Nat. Struct. Mol. Biol. 2013, 20, 300–307. [CrossRef] [PubMed]
11. Johnsson, P.; Lipovich, L.; Grander, D.; Morris, K.V. Evolutionary conservation of long non-coding RNAs;
sequence, structure, function. Biochim. Biophys. Acta 2014, 1840, 1063–1071. [CrossRef] [PubMed]
12. Novikova, I.V.; Hennelly, S.P.; Sanbonmatsu, K.Y. Structural architecture of the human long non-coding
RNA, steroid receptor RNA activator. Nucleic Acids Res. 2012, 40, 5034–5051. [CrossRef] [PubMed]
13. Zhang, H.; Chen, X.; Wang, C.; Xu, Z.; Wang, Y.; Liu, X.; Kang, Z.; Ji, W. Long non-coding genes implicated
in response to stripe rust pathogen stress in wheat (Triticum aestivum L.). Mol. Biol. Rep. 2013, 40, 6245–6253.
[CrossRef] [PubMed]
14. Di, C.; Yuan, J.; Wu, Y.; Li, J.; Lin, H.; Hu, L.; Zhang, T.; Qi, Y.; Gerstein, M.B.; Guo, Y.; et al. Characterization
of stress-responsive lncRNAs in Arabidopsis thaliana by integrating expression, epigenetic and structural
features. Plant J. 2014, 80, 848–861. [CrossRef] [PubMed]
15. Dominguez, C.; Schubert, M.; Duss, O.; Ravindranathan, S.; Allain, F.H. Structure determination and
dynamics of protein-RNA complexes by NMR spectroscopy. Prog. Nucl. Magn. Reson. Spectrosc. 2011, 58,
1–61. [CrossRef] [PubMed]
16. Kwok, C.K.; Tang, Y.; Assmann, S.M.; Bevilacqua, P.C. The RNA structurome: Transcriptome-wide structure
probing with next-generation sequencing. Trends Biochem. Sci. 2015, 40, 221–232. [CrossRef] [PubMed]
17. Guttman, M.; Amit, I.; Garber, M.; French, C.; Lin, M.F.; Feldser, D.; Huarte, M.; Zuk, O.; Carey, B.W.;
Cassady, J.P.; et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in
mammals. Nature 2009, 458, 223–227. [CrossRef] [PubMed]
18. Novikova, I.V.; Hennelly, S.P.; Tung, C.S.; Sanbonmatsu, K.Y. Rise of the RNA machines: Exploring the
structure of long non-coding RNAs. J. Mol. Biol. 2013, 425, 3731–3746. [CrossRef] [PubMed]
19. Derrien, T.; Johnson, R.; Bussotti, G.; Tanzer, A.; Djebali, S.; Tilgner, H.; Guernec, G.; Martin, D.; Merkel, A.;
Knowles, D.G.; et al. The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene
structure, evolution, and expression. Genome Res. 2012, 22, 1775–1789. [CrossRef] [PubMed]
20. Simon, M.D.; Pinter, S.F.; Fang, R.; Sarma, K.; Rutenberg-Schoenberg, M.; Bowman, S.K.; Kesner, B.A.;
Maier, V.K.; Kingston, R.E.; Lee, J.T. High-resolution Xist binding maps reveal two-step spreading during
X-chromosome inactivation. Nature 2013, 504, 465–469. [CrossRef] [PubMed]
21. Pontier, D.B.; Gribnau, J. Xist regulation and function explored. Hum. Genet. 2011, 130, 223–236. [CrossRef]
[PubMed]
22. Froberg, J.E.; Yang, L.; Lee, J.T. Guided by RNAs: X-inactivation as a model for lncRNA function. J. Mol. Biol.
2013, 425, 3698–3706. [CrossRef] [PubMed]
23. Yang, C.; Chapman, A.G.; Kelsey, A.D.; Minks, J.; Cotton, A.M.; Brown, C.J. X-chromosome inactivation:
Molecular mechanisms from the human perspective. Hum. Genet. 2011, 130, 175–185. [CrossRef] [PubMed]
24. Morey, C.; Arnaud, D.; Avner, P.; Clerc, P. Tsix-mediated repression of Xist accumulation is not sufficient for
normal random X inactivation. Hum. Mol. Genet. 2001, 10, 1403–1411. [CrossRef] [PubMed]
25. Migeon, B.R.; Lee, C.H.; Chowdhury, A.K.; Carpenter, H. Species differences in TSIX/Tsix reveal the roles of
these genes in X-chromosome inactivation. Am. J. Hum. Genet. 2002, 71, 286–293. [CrossRef] [PubMed]
26. Maenner, S.; Blaud, M.; Fouillen, L.; Savoye, A.; Marchand, V.; Dubois, A.; Sanglier-Cianferani, S.;
van Dorsselaer, A.; Clerc, P.; Avner, P.; et al. 2-D structure of the A region of Xist RNA and its implication for
PRC2 association. PLoS Biol. 2010, 8, e1000276. [CrossRef] [PubMed]
27. Duszczyk, M.M.; Wutz, A.; Rybin, V.; Sattler, M. The Xist RNA A-repeat comprises a novel AUCG tetraloop
fold and a platform for multimerization. RNA 2011, 17, 1973–1982. [CrossRef] [PubMed]
28. Jeon, Y.; Lee, J.T. YY1 tethers Xist RNA to the inactive X nucleation center. Cell 2011, 146, 119–133. [CrossRef]
[PubMed]
29. Fang, R.; Moss, W.N.; Rutenberg-Schoenberg, M.; Simon, M.D. Probing Xist RNA structure in cells using
targeted structure-seq. PLoS Genet. 2015, 11, e1005668. [CrossRef] [PubMed]
30. Lv, Q.; Yuan, L.; Song, Y.; Sui, T.; Li, Z.; Lai, L. D-repeat in the Xist gene is required for X chromosome
inactivation. RNA Biol. 2016, 13, 172–176. [CrossRef] [PubMed]
31. Flintoft, L. Non-coding RNA: Structure and function for lncRNAs. Nat. Rev. Genet. 2013, 14, 598. [CrossRef]
[PubMed]
32. Wutz, A. Noncoding RoX RNA remodeling triggers fly dosage compensation complex assembly. Mol. Cell
2013, 51, 131–132. [CrossRef] [PubMed]
33. Maenner, S.; Muller, M.; Frohlich, J.; Langer, D.; Becker, P.B. ATP-dependent RoX RNA remodeling by
the helicase maleless enables specific association of MSL proteins. Mol. Cell 2013, 51, 174–184. [CrossRef]
[PubMed]
34. Ilik, I.A.; Quinn, J.J.; Georgiev, P.; Tavares-Cadete, F.; Maticzka, D.; Toscano, S.; Wan, Y.; Spitale, R.C.;
Luscombe, N.; Backofen, R.; et al. Tandem stem-loops in roX RNAs act together to mediate X chromosome
dosage compensation in Drosophila. Mol. Cell 2013, 51, 156–173. [CrossRef] [PubMed]
35. Gupta, R.A.; Shah, N.; Wang, K.C.; Kim, J.; Horlings, H.M.; Wong, D.J.; Tsai, M.C.; Hung, T.; Argani, P.;
Rinn, J.L.; et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis.
Nature 2010, 464, 1071–1076. [CrossRef] [PubMed]
36. Yan, K.; Arfat, Y.; Li, D.; Zhao, F.; Chen, Z.; Yin, C.; Sun, Y.; Hu, L.; Yang, T.; Qian, A. Structure prediction:
New insights into decrypting long noncoding RNAs. Int. J. Mol. Sci. 2016, 17. [CrossRef] [PubMed]
37. Tsai, M.C.; Manor, O.; Wan, Y.; Mosammaparast, N.; Wang, J.K.; Lan, F.; Shi, Y.; Segal, E.; Chang, H.Y.
Long noncoding RNA as modular scaffold of histone modification complexes. Science 2010, 329, 689–693.
[CrossRef] [PubMed]
38. Loewen, G.; Jayawickramarajah, J.; Zhuo, Y.; Shan, B. Functions of lncRNA HOTAIR in lung cancer.
J. Hematol. Oncol. 2014, 7. [CrossRef] [PubMed]
39. Wang, B.; Su, Y.; Yang, Q.; Lv, D.; Zhang, W.; Tang, K.; Wang, H.; Zhang, R.; Liu, Y. Overexpression of long
non-coding RNA HOTAIR promotes tumor growth and metastasis in human osteosarcoma. Mol. Cells 2015,
38, 432–440. [CrossRef] [PubMed]
40. He, S.; Liu, S.; Zhu, H. The sequence, structure and evolutionary features of HOTAIR in mammals.
BMC Evol. Biol. 2011, 11. [CrossRef] [PubMed]
41. Wu, L.; Murat, P.; Matak-Vinkovic, D.; Murrell, A.; Balasubramanian, S. Binding interactions between long
noncoding RNA HOTAIR and PRC2 proteins. Biochemistry 2013, 52, 9519–9527. [CrossRef] [PubMed]
42. Somarowthu, S.; Legiewicz, M.; Chillon, I.; Marcia, M.; Liu, F.; Pyle, A.M. HOTAIR forms an intricate and
modular secondary structure. Mol. Cell 2015, 58, 353–361. [CrossRef] [PubMed]
43. Yoshimoto, R.; Mayeda, A.; Yoshida, M.; Nakagawa, S. MALAT1 long non-coding RNA in cancer.
Biochim. Biophys. Acta 2016, 1859, 192–199. [CrossRef] [PubMed]
44. West, J.A.; Davis, C.P.; Sunwoo, H.; Simon, M.D.; Sadreyev, R.I.; Wang, P.I.; Tolstorukov, M.Y.; Kingston, R.E.
The long noncoding RNAs NEAT1 and MALAT1 bind active chromatin sites. Mol. Cell 2014, 55, 791–802.
[CrossRef] [PubMed]
45. Zhang, J.; Zhang, B.; Wang, T.; Wang, H. LncRNA MALAT1 overexpression is an unfavorable prognostic factor
in human cancer: Evidence from a meta-analysis. Int. J. Clin. Exp. Med. 2015, 8, 5499–5505. [PubMed]
46. Tripathi, V.; Ellis, J.D.; Shen, Z.; Song, D.Y.; Pan, Q.; Watt, A.T.; Freier, S.M.; Bennett, C.F.; Sharma, A.;
Bubulya, P.A.; et al. The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by
modulating SR splicing factor phosphorylation. Mol. Cell 2010, 39, 925–938. [CrossRef] [PubMed]
47. Wilusz, J.E.; Freier, S.M.; Spector, D.L. 3′ end processing of a long nuclear-retained noncoding RNA yields a
tRNA-like cytoplasmic RNA. Cell 2008, 135, 919–932. [CrossRef] [PubMed]
48. Wilusz, J.E.; JnBaptiste, C.K.; Lu, L.Y.; Kuhn, C.D.; Joshua-Tor, L.; Sharp, P.A. A triple helix stabilizes the 3′
ends of long noncoding RNAs that lack poly(A) tails. Genes Dev. 2012, 26, 2392–2407. [CrossRef] [PubMed]
49. Brown, J.A.; Valenstein, M.L.; Yario, T.A.; Tycowski, K.T.; Steitz, J.A. Formation of triple-helical structures
by the 3′-end sequences of MALAT1 and MENβ noncoding RNAs. Proc. Natl. Acad. Sci. USA 2012, 109,
19202–19207. [CrossRef] [PubMed]
50. Brown, J.A.; Bulkley, D.; Wang, J.; Valenstein, M.L.; Yario, T.A.; Steitz, T.A.; Steitz, J.A. Structural insights
into the stabilization of MALAT1 noncoding RNA by a bipartite triple helix. Nat. Struct. Mol. Biol. 2014, 21,
633–640. [CrossRef] [PubMed]
51. Pickard, M.R.; Williams, G.T. Molecular and cellular mechanisms of action of tumour suppressor Gas5
lncRNA. Genes (Basel) 2015, 6, 484–499. [CrossRef] [PubMed]
52. Kino, T.; Hurt, D.E.; Ichijo, T.; Nader, N.; Chrousos, G.P. Noncoding RNA Gas5 is a growth arrest- and
starvation-associated repressor of the glucocorticoid receptor. Sci. Signal. 2010, 3. [CrossRef] [PubMed]
53. Rinn, J.L.; Chang, H.Y. Genome regulation by long noncoding RNAs. Annu. Rev. Biochem. 2012, 81, 145–166.
[CrossRef] [PubMed]
54. Liu, J.; Wang, H.; Chua, N.H. Long noncoding RNA transcriptome of plants. Plant Biotechnol. J. 2015, 13,
319–328. [CrossRef] [PubMed]
55. Li, X.; Wu, Z.; Fu, X.; Han, W. LncRNAs: Insights into their function and mechanics in underlying disorders.
Mutat. Res. Rev. Mutat. Res. 2014, 762, 1–21. [CrossRef] [PubMed]
56. Wierzbicki, A.T. The role of long non-coding RNA in transcriptional gene silencing. Curr. Opin. Plant Biol.
2012, 15, 517–522. [CrossRef] [PubMed]
57. Zhang, Y.C.; Chen, Y.Q. Long noncoding RNAs: New regulators in plant development. Biochem. Biophys.
Res. Commun. 2013, 436, 111–114. [CrossRef] [PubMed]
58. Franco-Zorrilla, J.M.; Valli, A.; Todesco, M.; Mateos, I.; Puga, M.I.; Rubio-Somoza, I.; Leyva, A.; Weigel, D.;
Garcia, J.A.; Paz-Ares, J. Target mimicry provides a new mechanism for regulation of microRNA activity.
Nat. Genet. 2007, 39, 1033–1037. [CrossRef] [PubMed]
59. Heo, J.B.; Lee, Y.S.; Sung, S. Epigenetic regulation by long noncoding RNAs in plants. Chromosome Res. 2013,
21, 685–693. [CrossRef] [PubMed]
60. Guil, S.; Esteller, M. RNA–RNA interactions in gene regulation: The coding and noncoding players.
Trends Biochem. Sci. 2015, 40, 248–256. [CrossRef] [PubMed]
61. Yamaguchi, A.; Abe, M. Regulation of reproductive development by non-coding RNA in Arabidopsis: To
flower or not to flower. J. Plant Res. 2012, 125, 693–704. [CrossRef] [PubMed]
62. Csorba, T.; Questa, J.I.; Sun, Q.; Dean, C. Antisense COOLAIR mediates the coordinated switching of chromatin
states at FLC during vernalization. Proc. Natl. Acad. Sci. USA 2014, 111, 16160–16165. [CrossRef] [PubMed]
63. Kim, D.H.; Sung, S. Environmentally coordinated epigenetic silencing of FLC by protein and long noncoding
RNA components. Curr. Opin. Plant Biol. 2012, 15, 51–56. [CrossRef] [PubMed]
64. Wang, Z.W.; Wu, Z.; Raitskin, O.; Sun, Q.; Dean, C. Antisense-mediated FLC transcriptional repression
requires the P-TEFb transcription elongation factor. Proc. Natl. Acad. Sci. USA 2014, 111, 7468–7473.
[CrossRef] [PubMed]
65. Sun, Q.; Csorba, T.; Skourti-Stathaki, K.; Proudfoot, N.J.; Dean, C. R-loop stabilization represses antisense
transcription at the Arabidopsis FLC locus. Science 2013, 340, 619–621. [CrossRef] [PubMed]
66. Chekanova, J.A. Long non-coding RNAs and their functions in plants. Curr. Opin. Plant Biol. 2015, 27,
207–216. [CrossRef] [PubMed]
67. Lee, J.T. Lessons from X-chromosome inactivation: Long ncRNA as guides and tethers to the epigenome.
Genes Dev. 2009, 23, 1831–1842. [CrossRef] [PubMed]
68. Zhao, J.; Ohsumi, T.K.; Kung, J.T.; Ogawa, Y.; Grau, D.J.; Sarma, K.; Song, J.J.; Kingston, R.E.; Borowsky, M.;
Lee, J.T. Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol. Cell 2010, 40, 939–953.
[CrossRef] [PubMed]
69. Ding, J.; Lu, Q.; Ouyang, Y.; Mao, H.; Zhang, P.; Yao, J.; Xu, C.; Li, X.; Xiao, J.; Zhang, Q. A long noncoding
RNA regulates photoperiod-sensitive male sterility, an essential component of hybrid rice. Proc. Natl. Acad.
Sci. USA 2012, 109, 2654–2659. [CrossRef] [PubMed]
70. Ding, J.; Shen, J.; Mao, H.; Xie, W.; Li, X.; Zhang, Q. RNA-directed DNA methylation is involved in regulating
photoperiod-sensitive male sterility in rice. Mol. Plant 2012, 5, 1210–1216. [CrossRef] [PubMed]
71. Zhang, J.; Mujahid, H.; Hou, Y.; Nallamilli, B.R.; Peng, Z. Plant long ncRNAs: A new frontier for gene
regulatory control. Am. J. Plant Sci. 2013, 4, 1038–1045. [CrossRef]
72. Bardou, F.; Merchan, F.; Ariel, F.; Crespi, M. Dual RNAs in plants. Biochimie 2011, 93, 1950–1954. [CrossRef]
[PubMed]
73. Gultyaev, A.P.; Roussis, A. Identification of conserved secondary structures and expansion segments in
ENOD40 RNAs reveals new ENOD40 homologues in plants. Nucleic Acids Res. 2007, 35, 3144–3152.
[CrossRef] [PubMed]
74. Ariel, F.; Romero-Barrios, N.; Jegu, T.; Benhamed, M.; Crespi, M. Battles and hijacks: Noncoding transcription
in plants. Trends Plant Sci. 2015, 20, 362–371. [CrossRef] [PubMed]
75. Rohrig, H.; Schmidt, J.; Miklashevichs, E.; Schell, J.; John, M. Soybean ENOD40 encodes two peptides that
bind to sucrose synthase. Proc. Natl. Acad. Sci. USA 2002, 99, 1915–1920. [CrossRef] [PubMed]
76. Girard, G.; Roussis, A.; Gultyaev, A.P.; Pleij, C.W.; Spaink, H.P. Structural motifs in the RNA encoded by the
early nodulation gene enod40 of soybean. Nucleic Acids Res. 2003, 31, 5003–5015. [CrossRef] [PubMed]
77. Campalans, A.; Kondorosi, A.; Crespi, M. Enod40, a short open reading frame-containing mRNA, induces
cytoplasmic localization of a nuclear RNA binding protein in Medicago truncatula. Plant Cell 2004, 16,
1047–1059. [CrossRef] [PubMed]
78. Anderson, D.M.; Anderson, K.M.; Chang, C.L.; Makarewich, C.A.; Nelson, B.R.; McAnally, J.R.; Kasaragod, P.;
Shelton, J.M.; Liou, J.; Bassel-Duby, R.; et al. A micropeptide encoded by a putative long noncoding RNA
regulates muscle performance. Cell 2015, 160, 595–606. [CrossRef] [PubMed]
79. Bardou, F.; Ariel, F.; Simpson, C.G.; Romero-Barrios, N.; Laporte, P.; Balzergue, S.; Brown, J.W.; Crespi, M.
Long noncoding RNA modulates alternative splicing regulators in Arabidopsis. Dev. Cell 2014, 30, 166–176.
[CrossRef] [PubMed]
80. Ziehler, W.A.; Engelke, D.R. Probing RNA structure with chemical reagents and enzymes. Curr. Protoc.
Nucleic Acid Chem. 2001, 6. [CrossRef]
81. Cheong, H.K.; Hwang, E.; Lee, C.; Choi, B.S.; Cheong, C. Rapid preparation of RNA samples for NMR
spectroscopy and X-ray crystallography. Nucleic Acids Res. 2004, 32, e84. [CrossRef] [PubMed]
82. Batey, R.T.; Kieft, J.S. Improved native affinity purification of RNA. RNA 2007, 13, 1384–1389. [CrossRef]
[PubMed]
83. Said, N.; Rieder, R.; Hurwitz, R.; Deckert, J.; Urlaub, H.; Vogel, J. In vivo expression and purification of
aptamer-tagged small RNA regulators. Nucleic Acids Res. 2009, 37, e133. [CrossRef] [PubMed]
84. Chillon, I.; Marcia, M.; Legiewicz, M.; Liu, F.; Somarowthu, S.; Pyle, A.M. Native purification and analysis of
long RNAs. Methods Enzymol. 2015, 558, 3–37. [PubMed]
85. Poulsen, L.D.; Kielpinski, L.J.; Salama, S.R.; Krogh, A.; Vinther, J. SHAPE selection (SHAPES) enrich for RNA
structure signal in SHAPE sequencing-based probing data. RNA 2015, 21, 1042–1052. [CrossRef] [PubMed]
86. Spitale, R.C.; Crisalli, P.; Flynn, R.A.; Torre, E.A.; Kool, E.T.; Chang, H.Y. RNA SHAPE analysis in living cells.
Nat. Chem. Biol. 2013, 9, 18–20. [CrossRef] [PubMed]
87. Lucks, J.B.; Mortimer, S.A.; Trapnell, C.; Luo, S.; Aviran, S.; Schroth, G.P.; Pachter, L.; Doudna, J.A.;
Arkin, A.P. Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed
by primer extension sequencing (SHAPE-seq). Proc. Natl. Acad. Sci. USA 2011, 108, 11063–11068. [CrossRef]
[PubMed]
88. Siegfried, N.A.; Busan, S.; Rice, G.M.; Nelson, J.A.; Weeks, K.M. RNA motif discovery by SHAPE and
mutational profiling (SHAPE-MaP). Nat. Methods 2014, 11, 959–965. [CrossRef] [PubMed]
89. Homan, P.J.; Favorov, O.V.; Lavender, C.A.; Kursun, O.; Ge, X.; Busan, S.; Dokholyan, N.V.; Weeks, K.M.
Single-molecule correlated chemical probing of RNA. Proc. Natl. Acad. Sci. USA 2014, 111, 13858–13863.
[CrossRef] [PubMed]
90. Foley, S.W.; Vandivier, L.E.; Kuksa, P.P.; Gregory, B.D. Transcriptome-wide measurement of plant RNA
secondary structure. Curr. Opin. Plant Biol. 2015, 27, 36–43. [CrossRef] [PubMed]
91. Kertesz, M.; Wan, Y.; Mazor, E.; Rinn, J.L.; Nutter, R.C.; Chang, H.Y.; Segal, E. Genome-wide measurement of
RNA secondary structure in yeast. Nature 2010, 467, 103–107. [CrossRef] [PubMed]
92. Wan, Y.; Qu, K.; Ouyang, Z.; Chang, H.Y. Genome-wide mapping of RNA structure using nuclease digestion
and high-throughput sequencing. Nat. Protoc. 2013, 8, 849–869. [CrossRef] [PubMed]
93. Wan, Y.; Qu, K.; Zhang, Q.C.; Flynn, R.A.; Manor, O.; Ouyang, Z.; Zhang, J.; Spitale, R.C.; Snyder, M.P.;
Segal, E.; et al. Landscape and variation of RNA secondary structure across the human transcriptome. Nature
2014, 505, 706–709. [CrossRef] [PubMed]
94. Novikova, I.V.; Hennelly, S.P.; Sanbonmatsu, K.Y. Tackling structures of long noncoding RNAs. Int. J.
Mol. Sci. 2013, 14, 23672–23684. [CrossRef] [PubMed]
95. Underwood, J.G.; Uzilov, A.V.; Katzman, S.; Onodera, C.S.; Mainzer, J.E.; Mathews, D.H.; Lowe, T.M.;
Salama, S.R.; Haussler, D. FragSeq: Transcriptome-wide RNA structure probing using high-throughput
sequencing. Nat. Methods 2010, 7, 995–1001. [CrossRef] [PubMed]
96. Kashi, K.; Henderson, L.; Bonetti, A.; Carninci, P. Discovery and functional analysis of lncRNAs:
Methodologies to investigate an uncharacterized transcriptome. Biochim. Biophys. Acta 2016, 1859, 3–15.
[CrossRef] [PubMed]
97. Zheng, Q.; Ryvkin, P.; Li, F.; Dragomir, I.; Valladares, O.; Yang, J.; Cao, K.; Wang, L.S.; Gregory, B.D.
Genome-wide double-stranded RNA sequencing reveals the functional significance of base-paired RNAs in
Arabidopsis. PLoS Genet. 2010, 6, e1001141. [CrossRef] [PubMed]
98. Li, F.; Zheng, Q.; Ryvkin, P.; Dragomir, I.; Desai, Y.; Aiyer, S.; Valladares, O.; Yang, J.; Bambina, S.; Sabin, L.R.;
et al. Global analysis of RNA secondary structure in two metazoans. Cell Rep. 2012, 1, 69–82. [CrossRef]
[PubMed]
99. Li, F.; Zheng, Q.; Vandivier, L.E.; Willmann, M.R.; Chen, Y.; Gregory, B.D. Regulatory impact of RNA
secondary structure across the Arabidopsis transcriptome. Plant Cell 2012, 24, 4346–4359. [CrossRef] [PubMed]
100. Cordero, P.; Kladwang, W.; VanLang, C.C.; Das, R. Quantitative dimethyl sulfate mapping for automated
RNA secondary structure inference. Biochemistry 2012, 51, 7037–7039. [CrossRef] [PubMed]
101. Kubota, M.; Tran, C.; Spitale, R.C. Progress and challenges for chemical probing of RNA structure inside
living cells. Nat. Chem. Biol. 2015, 11, 933–941. [CrossRef] [PubMed]
102. Talkish, J.; May, G.; Lin, Y.; Woolford, J.L., Jr.; McManus, C.J. Mod-seq: High-throughput sequencing for
chemical probing of RNA structure. RNA 2014, 20, 713–720. [CrossRef] [PubMed]
103. Lin, Y.; May, G.E.; McManus, C.J. Mod-seq: A high-throughput method for probing RNA secondary
structure. Methods Enzymol. 2015, 558, 125–152. [PubMed]
104. Rouskin, S.; Zubradt, M.; Washietl, S.; Kellis, M.; Weissman, J.S. Genome-wide probing of RNA structure
reveals active unfolding of mRNA structures in vivo. Nature 2014, 505, 701–705. [CrossRef] [PubMed]
105. Ding, Y.; Tang, Y.; Kwok, C.K.; Zhang, Y.; Bevilacqua, P.C.; Assmann, S.M. In vivo genome-wide profiling of
RNA secondary structure reveals novel regulatory features. Nature 2014, 505, 696–700. [CrossRef] [PubMed]
106. Lu, Z.; Chang, H.Y. Decoding the RNA structurome. Curr. Opin. Struct. Biol. 2016, 36, 142–148. [CrossRef]
[PubMed]
107. Flynn, R.A.; Zhang, Q.C.; Spitale, R.C.; Lee, B.; Mumbach, M.R.; Chang, H.Y. Transcriptome-wide
interrogation of RNA secondary structure in living cells with icSHAPE. Nat. Protoc. 2016, 11, 273–290.
[CrossRef] [PubMed]
108. Spitale, R.C.; Flynn, R.A.; Zhang, Q.C.; Crisalli, P.; Lee, B.; Jung, J.W.; Kuchelmeister, H.Y.; Batista, P.J.;
Torre, E.A.; Kool, E.T.; et al. Structural imprints in vivo decode RNA regulatory mechanisms. Nature 2015,
519, 486–490. [CrossRef] [PubMed]
109. Helwak, A.; Kudla, G.; Dudnakova, T.; Tollervey, D. Mapping the human miRNA interactome by CLASH
reveals frequent noncanonical binding. Cell 2013, 153, 654–665. [CrossRef] [PubMed]
110. Kudla, G.; Granneman, S.; Hahn, D.; Beggs, J.D.; Tollervey, D. Cross-linking, ligation, and sequencing
of hybrids reveals RNA–RNA interactions in yeast. Proc. Natl. Acad. Sci. USA 2011, 108, 10010–10015.
[CrossRef] [PubMed]
111. Sugimoto, Y.; Vigilante, A.; Darbo, E.; Zirra, A.; Militti, C.; D'Ambrogio, A.; Luscombe, N.M.; Ule, J. hiCLIP
reveals the in vivo atlas of mRNA secondary structures recognized by Staufen 1. Nature 2015, 519, 491–494.
[CrossRef] [PubMed]
112. Ramani, V.; Qiu, R.; Shendure, J. High-throughput determination of RNA structure by proximity ligation.
Nat. Biotechnol. 2015, 33, 980–984. [CrossRef] [PubMed]
113. Cao, J. The functional role of long non-coding RNAs and epigenetics. Biol. Proced. Online 2014, 16, 11.
[CrossRef] [PubMed]
114. Yoon, J.H.; Abdelmohsen, K.; Gorospe, M. Posttranscriptional gene regulation by long noncoding RNA.
J. Mol. Biol. 2013, 425, 3723–3730. [CrossRef] [PubMed]
115. Novikova, I.V.; Hennelly, S.P.; Sanbonmatsu, K.Y. Sizing up long non-coding RNAs: Do lncRNAs have
secondary and tertiary structure? Bioarchitecture 2012, 2, 189–199. [CrossRef] [PubMed]
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC-BY) license (http://creativecommons.org/licenses/by/4.0/).