
RNA Structure Prediction Software and Analysis

There are three main strategies for RNA structure prediction: 1) Energy minimization methods use dynamic programming to find the lowest free energy secondary structure based on nearest neighbor thermodynamic parameters. 2) Comparative sequence analysis methods detect co-varying mutations to predict base pairing, achieving 97% accuracy for conserved rRNAs. 3) Combined methods that incorporate both thermodynamic and comparative sequence data have improved structure prediction results compared to single methods.

Uploaded by

thamizh555

RNA Structure Prediction

RNA structure prediction strategies


Secondary structure prediction

1) Energy minimization (thermodynamics)

2) Comparative sequence analysis (co-variation)

3) Combined experimental & computational


Secondary structure prediction strategies

1) Energy minimization (thermodynamics)


• Algorithm:
Dynamic programming to find
high probability pairs
(also, some Genetic algorithms)

• Software:
Mfold - Zuker
Vienna RNA Package - Hofacker
RNAstructure - Mathews
Sfold - Ding & Lawrence

R Knight 2005
Secondary structure prediction strategies

2) Comparative sequence analysis (co-variation)


• Algorithm:
Mutual information
Context-free grammars

• Software:
ConStruct
Alifold
Pfold
FOLDALIGN
Dynalign

R Knight 2005
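To make the mutual-information idea concrete, here is a minimal sketch of scoring co-variation between two columns of a sequence alignment. The function name `mutual_information` and the toy alignment are hypothetical illustrations, not taken from any of the packages listed above, and real tools add corrections (gaps, phylogeny, small-sample bias) that are ignored here.

```python
import math
from collections import Counter

def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    f_i = Counter(col_i)               # single-column counts
    f_j = Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))  # joint counts for the column pair
    mi = 0.0
    for (a, b), count in f_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((f_i[a] / n) * (f_j[b] / n)))
    return mi

# Hypothetical toy alignment: columns 0 and 4 co-vary (A-U, G-C, C-G, U-A),
# while columns 1 and 2 are fully conserved.
alignment = ["AGGAU", "GGGAC", "CGGAG", "UGGAA"]
print(mutual_information(alignment, 0, 4))  # high: compensating changes
print(mutual_information(alignment, 1, 2))  # zero: no variation, no signal
```

Note that fully conserved columns score zero even if they are base-paired, which is one reason co-variation analysis needs many, sufficiently diverged sequences.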
Secondary structure prediction strategies

3) Combined experimental & computational

• Experiment:
Map single-stranded vs double-stranded regions in
folded RNA

• How?
Enzymes: S1 nuclease, T1 RNase
Chemicals: kethoxal, DMS

R Knight 2005
Experimental RNA structure determination?

• X-ray crystallography

• NMR spectroscopy

• Enzymatic/chemical mapping
1) Energy minimization method

What are the assumptions?


Native tertiary structure or "fold" of an RNA
molecule is (one of) its "lowest" free energy
configuration(s)
Gibbs free energy = ΔG in kcal/mol at 37 °C
= equilibrium stability of structure
lower values (negative) are more favorable
Is this assumption valid?
in vivo? - this may not hold, but we don't really know
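As a rough illustration of what "equilibrium stability" means, the standard relation ΔG = -RT ln K links a free energy to an equilibrium constant between folded and unfolded states. The sketch below assumes a simple two-state equilibrium; the function name `equilibrium_constant` is hypothetical.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def equilibrium_constant(delta_g, temp_c=37.0):
    """K = exp(-dG / RT) for a two-state equilibrium (an assumption here)."""
    temp_k = temp_c + 273.15
    return math.exp(-delta_g / (R * temp_k))

# A more negative dG shifts the equilibrium further toward the folded state.
for dg in (0.0, -1.2, -4.8):
    print(dg, equilibrium_constant(dg))
```

At 37 °C, RT is about 0.6 kcal/mol, so even a single stacking increment of around -1 kcal/mol changes the folded/unfolded ratio several-fold.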
Free energy minimization

What are the rules?

A=U base pair stacked on A=U:
ΔG = -1.2 kcal/mol

A=U base pair stacked on U=A:
ΔG = -1.6 kcal/mol

What gives here?

C Staben 2005
Energy minimization calculations:
Base-stacking is critical

Stacked pairs (top strand/bottom strand)    ΔG (kcal/mol)
AA/UU                                       -1.2
AU/UA or UA/AU                              -1.6
AG/UC, AC/UG, CA/GU, GA/CU                  -2.1
CC/GG                                       -4.8
CG/GC                                       -3.0
GC/CG                                       -4.3
GU/UG                                       -0.3
XG/YU, GX/UY                                 0

- Tinoco et al.

C Staben 2005
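The stacking values on this slide can be summed along a helix to get a rough stability estimate. The sketch below is an illustration only: it assumes each table entry is read as top strand 5'->3' over bottom strand 3'->5', it covers Watson-Crick stacks only (the G·U and mismatch entries are left out), and it ignores loop and initiation terms. The names `STACK`, `stack_energy` and `helix_energy` are hypothetical.

```python
# Stacking increments (kcal/mol) as listed on the slide.
# Key: (top strand 5'->3', bottom strand 3'->5') -- an assumed reading.
STACK = {
    ("AA", "UU"): -1.2,
    ("AU", "UA"): -1.6, ("UA", "AU"): -1.6,
    ("AG", "UC"): -2.1, ("AC", "UG"): -2.1,
    ("CA", "GU"): -2.1, ("GA", "CU"): -2.1,
    ("CC", "GG"): -4.8,
    ("CG", "GC"): -3.0,
    ("GC", "CG"): -4.3,
}

def stack_energy(top, bottom):
    """Look up one stack, also trying the same stack read from the other strand."""
    if (top, bottom) in STACK:
        return STACK[(top, bottom)]
    return STACK[(bottom[::-1], top[::-1])]

def helix_energy(top, bottom):
    """Sum stacking increments over a perfectly paired helix.

    top is written 5'->3' and bottom 3'->5', so top[k] pairs with bottom[k].
    """
    return sum(stack_energy(top[k:k + 2], bottom[k:k + 2])
               for k in range(len(top) - 1))

# Hypothetical 4-bp helix: 5'-GCAU-3' over 3'-CGUA-5'
# Stacks: GC/CG (-4.3) + CA/GU (-2.1) + AU/UA (-1.6)
print(round(helix_energy("GCAU", "CGUA"), 1))
```

A helix of n base pairs contributes n - 1 stacking terms, which is why the sequence order matters and not just the base-pair composition.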
Nearest-neighbor parameters

Most methods for free energy minimization use nearest-neighbor parameters
(derived from experiment) for predicting stability of an RNA secondary structure
(in terms of ΔG at 37 °C)

& most available software packages use the same set of parameters:
Mathews, Sabina, Zuker & Turner, 1999
Energy minimization - calculations:
Total free energy of a specific
conformation for a specific RNA molecule
= sum of incremental energy terms for:
• helical stacking
(sequence dependent)
• loop initiation
• unpaired stacking

(favorable "increments" are < 0)

Fig 6.3
Baxevanis &
Ouellette 2005
But how many possible conformations for a single RNA molecule?

Huge number:
Zuker estimates 1.8^N possible secondary structures for a
sequence of N nucleotides
for 100 nts (small RNA…) =
3 × 10^25 structures!
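The arithmetic behind that estimate is easy to check. This is a small sketch using the base of 1.8 quoted above; the function name `estimated_structures` is hypothetical.

```python
def estimated_structures(n, base=1.8):
    """Approximate number of secondary structures for n nucleotides."""
    return base ** n

# Exponential growth: each added nucleotide multiplies the count by ~1.8.
for n in (20, 50, 100):
    print(n, f"{estimated_structures(n):.1e}")
```

For N = 100 this gives a value on the order of 10^25, consistent with the figure quoted on the slide.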
Solution? Not exhaustive enumeration…
Dynamic programming
O(N^3) in time
O(N^2) in space/storage
iff pseudoknots excluded, otherwise:
O(N^6) in time
O(N^4) in space
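To show where the O(N^3) time and O(N^2) space come from, here is a minimal sketch of a Nussinov-style dynamic program. It maximizes the number of base pairs rather than minimizing free energy, so it is a simplified stand-in for the energy-based recursions used by the packages named earlier, not their actual algorithm. The names `can_pair`, `max_base_pairs` and `min_loop` are hypothetical, and pseudoknots are excluded by construction.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def can_pair(a, b):
    return (a, b) in PAIRS

def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (non-crossing) base pairs in seq."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; this table is the O(N^2) space.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):       # fill by increasing length
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]            # case 1: j is unpaired
            # case 2: j pairs with some k, leaving >= min_loop bases between.
            for k in range(i, j - min_loop):  # inner loop gives O(N^3) time
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(max_base_pairs("GGGAAAUCCC"))
```

Because a pair (k, j) splits the problem into two independent subintervals, every subproblem is an interval [i, j]; allowing pseudoknots breaks that independence, which is why the cost rises so sharply.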
2) Comparative sequence analysis
(co-variation)

Two basic approaches:


• Algorithms constrained by initial alignment
Much faster, but not as robust as unconstrained
Base-pairing probabilities determined by a partition
function

• Algorithms not constrained by initial alignment


Genetic algorithms often used for finding an alignment & set
of structures
RNA Secondary structure prediction: Performance?

How to evaluate?
• Not many experimentally determined structures
currently, ~ 50% are rRNA structures
so "Gold Standard" (in absence of tertiary structure):
compare the predicted RNA secondary structure with that
determined by comparative sequence analysis (!!??) using Benchmark
Datasets
NOTE: Base-pairs predicted by comparative sequence analysis for large &
small subunit rRNAs are 97% accurate when compared with high resolution
crystal structures! - Gutell, Pace
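One simple way to score a prediction against a reference structure is the fraction of reference base pairs that the prediction recovers. The sketch below assumes both structures are given in dot-bracket notation (a convention not used on the slides) and that "accuracy" means this recovered fraction; the slides do not define the metric, and the function names are hypothetical.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs

def pair_accuracy(predicted, reference):
    """Fraction of reference base pairs that also appear in the prediction."""
    pred = pairs_from_dot_bracket(predicted)
    ref = pairs_from_dot_bracket(reference)
    if not ref:
        return 0.0
    return len(pred & ref) / len(ref)

# Hypothetical example: the prediction misses the innermost reference pair.
reference = "(((...)))"
predicted = "((.....))"
print(pair_accuracy(predicted, reference))  # 2 of 3 reference pairs recovered
```

This measure ignores falsely predicted pairs, so benchmarks often report a second number for the fraction of predicted pairs that are correct.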
RNA Secondary structure prediction: Performance?

1) Energy minimization (via dynamic programming)


73% avg. prediction accuracy - single sequence
2) Comparative sequence analysis
97% avg. prediction accuracy - multiple sequences (e.g., highly
conserved rRNAs)
much lower if sequence conservation is lower &/or fewer sequences
are available for alignment
3) Combined - recent developments:
combine thermodynamics & co-variation
& experimental constraints? IMPROVED RESULTS
RNA structure prediction strategies
Tertiary structure prediction
Requires "craft" & significant user input & insight
1) Extensive comparative sequence analysis to predict tertiary
contacts (co-variation)
e.g., MANIP - Westhof
2) Use experimental data to constrain model building
e.g., MC-SYM - Major
3) Homology modeling using sequence alignment & reference tertiary
structure (not many of these!)
4) Low resolution molecular mechanics
e.g., yammp - Harvey
