
AICHE Presentation On Multistate Reweighting Methods

This document summarizes a presentation about using multistate reweighting and linear mapping techniques to rapidly explore molecular simulation parameter space and calculate thermodynamic properties. The key points are: 1) Multistate reweighting allows thermodynamic estimates calculated from a small set of "benchmark" simulations to be reevaluated and predicted for a large number of other parameter sets, dramatically reducing computational cost compared to running new simulations for each set. 2) Linear mapping between molecular geometries allows configurations generated with one geometry to be used to evaluate potential energies for different geometries, enabling free energy calculations between systems with different structures. 3) Together, these methods were used to search over 5,000 combinations of simulation parameters and

Copyright
© Attribution Non-Commercial (BY-NC)

Using multistate reweighting to rapidly explore molecular parameter space

Himanshu Paliwal and Michael R. Shirts
AIChE Annual Meeting
Session: Recent advances in molecular simulation III
October 29, 2012

Shirts Research Group
Molecular design in chemical and parameter space requires high-throughput thermodynamic calculations
Observables such as free energies of solvation and binding are used for:
Computational drug design
Design of new chromatographic surfaces
Design of new solvents
Parameterizing force fields

Two problems in computing a large number of thermodynamic estimates
The computational cost of generating samples is very high.
Accurate yet inexpensive simulation parameters for calculating the thermodynamic properties of interest are hard to find.

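Every sampled thermodynamic state has some information about the neighboring unsampled thermodynamic states
[Figure: configurations X_A from the sampled state A, with sampled energies U_A(X_A), are reevaluated with the potentials of the unsampled states B, C and D to give the reevaluated energies U_B(X_A), U_C(X_A) and U_D(X_A). Observables O_A, O_B, O_C and O_D follow for all four states.]
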
How do we rapidly scan the vast simulation parameter space of nonbonded interaction parameters?
Different choices of simulation parameters translate to different accuracies in thermodynamic estimates
The accuracy of thermodynamic estimates depends on how accurately the potential energies are estimated. For example:

Thermodynamic observables
$$\langle A \rangle = \frac{\int A(x)\, \exp(-\beta U(x))\, dx}{\int \exp(-\beta U(x))\, dx}$$

Free energy differences
$$\Delta G = G_1 - G_0 = -\beta^{-1} \ln \left\langle \exp\left(-\beta\,(U_1 - U_0)\right) \right\rangle_0$$

We do not know how accurate the potential energy must be to reach a desired accuracy in the thermodynamic estimates.

Coulomb interaction is estimated as a sum of real space and Fourier space contributions
Parameters for Coulomb using PME:
Short range cutoff (r_c,coul)
Gaussian width or Ewald tolerance (Etol)
Fourier spacing (Fsp)
Order of spline (Order)
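
The slide lists the Gaussian width and Etol as alternatives, which suggests that one fixes the other once the cutoff is chosen. A minimal sketch of that relation follows, assuming the common convention erfc(beta * r_c) = Etol for the Ewald splitting parameter beta (the inverse Gaussian width). The convention, the function name `ewald_beta` and the 1.0 nm cutoff in the usage loop are assumptions for illustration and are not taken from the slides.

```python
from math import erfc

def ewald_beta(r_c, etol, hi=100.0, iters=200):
    """Solve erfc(beta * r_c) = etol for beta by bisection.

    Assumed convention: etol is the relative strength of the real-space
    (erfc-screened) Coulomb term at the cutoff r_c. beta is in 1/nm if
    r_c is in nm; a larger beta means a narrower Gaussian.
    """
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if erfc(mid * r_c) > etol:
            lo = mid  # real-space term still too large at r_c: increase beta
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Tolerances taken from the parameter table; the cutoff is an example value.
for etol in (1e-2, 1e-4, 1e-6, 1e-8, 1e-10):
    print(f"Etol = {etol:.0e}  ->  beta = {ewald_beta(1.0, etol):.4f} 1/nm")
```

Under this convention a tighter tolerance at fixed cutoff gives a larger beta, which shifts more of the work to the Fourier-space sum.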

How do we choose the cutoff between short and long range Lennard-Jones contributions?
Parameters for LJ:
Short range cutoff (r_c,LJ)

Choosing a switching distance
Tradeoff between minimizing force discontinuities and error in thermodynamic estimates (only important for MD).
Parameters for Coulomb and LJ switch:
Coulomb switch distance (r_swi,coul)
LJ switch distance (r_swi,LJ)

How accurate does the potential energy need to be to get the desired accuracy in thermodynamic estimates?
[Figure: mapping from parameter space (P) through potential energy space (E) to observables space (O). P_1, P_2 could be the LJ cutoff and the Coulomb cutoff; E_1, E_2 could be the LJ and Coulomb potentials; O_1, O_2 could be the free energy and enthalpy of phase change. Regions are marked as studied or not studied.]
Parameter space for the search is combinatorially large, hence we split the search
Total number of combinations before the split is 2,592,000.
After the split, the number of combinations is 5184.
Parameter | Choices | # of choices
Order of B-spline | [3, 4, 5, 6] | 4
Ewald tolerance | [10^-2, 10^-4, 10^-6, 10^-8, 10^-10] | 5
Fourier spacing (nm) | [0.04, 0.06, 0.08, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20] | 9
Coulomb cutoff (nm) | [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5] | 10
Width of Coulomb switch (nm) | [0.2, 0.18, 0.16, 0.14, 0.12, 0.10, 0.08, 0.06, 0.04, 0.02, 0.01, 0.001] | 12
LJ cutoff (nm) | [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5] | 10
Width of LJ switch (nm) | [0.2, 0.18, 0.16, 0.14, 0.12, 0.10, 0.08, 0.06, 0.04, 0.02, 0.01, 0.001] | 12
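
The size of the full grid can be checked directly from the table. The sketch below copies the listed choices and multiplies the number of options per parameter. The dictionary keys are illustrative names, not identifiers from any simulation package, and the sketch only covers the count before the split, since the slides do not say how the split is made.

```python
from math import prod

# Choices copied from the parameter table; key names are illustrative only.
cutoffs_nm = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
switch_widths_nm = [0.2, 0.18, 0.16, 0.14, 0.12, 0.10,
                    0.08, 0.06, 0.04, 0.02, 0.01, 0.001]

grid = {
    "spline_order": [3, 4, 5, 6],
    "ewald_tolerance": [1e-2, 1e-4, 1e-6, 1e-8, 1e-10],
    "fourier_spacing_nm": [0.04, 0.06, 0.08, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20],
    "coulomb_cutoff_nm": cutoffs_nm,
    "coulomb_switch_width_nm": switch_widths_nm,
    "lj_cutoff_nm": cutoffs_nm,
    "lj_switch_width_nm": switch_widths_nm,
}

# Number of parameter sets in the full Cartesian product.
total_combinations = prod(len(choices) for choices in grid.values())
print(total_combinations)
```

The product reproduces the 2,592,000 combinations quoted for the unsplit search.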
We search for converged parameters for three benchmark systems
The three systems of the benchmark set previously used to validate free energy methods are also used in this study:
Methane solvation
Dipole inversion
Anthracene solvation

We calculate the following observables:
Free energy of transformation
Enthalpy of transformation
Heat of vaporization of TIP3P water

We want a computationally inexpensive parameter set which gives accurate thermodynamic estimates
Parameter set:
$$\Gamma = [\mathrm{Order},\ \mathrm{Etol},\ \mathrm{Fsp},\ r_{c,\mathrm{coul}},\ r_{swi,\mathrm{coul}},\ r_{c,\mathrm{LJ}},\ r_{swi,\mathrm{LJ}}]$$

Definitions:
Expensive parameters give converged potential energies and hence converged thermodynamic estimates.
Optimized parameters give potential energies and thermodynamic estimates that are statistically indistinguishable from those of the expensive parameters, but are computationally cheap.
Benchmark parameters: the set we started with.
Different parameter sets represent different
thermodynamic states
Multistate Bennett Acceptance Ratio (MBAR) can reweight the data from
sampled states to predict thermodynamic estimates for the unsampled
states.
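
As an illustration of reweighting to an unsampled state, here is a minimal NumPy sketch of the MBAR self-consistent equations on a toy problem. Two one-dimensional harmonic states are sampled and the free energy of a third, unsampled state is predicted from the same samples. The harmonic model, the function names and the reduced units (beta = 1) are assumptions for illustration. This is not the authors' code and it omits the uncertainty estimates a production implementation would provide.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def mbar_free_energies(u_kn, N_k, tol=1e-10, max_iter=10000):
    """Solve f_i = -ln sum_n exp(-u_i(x_n)) / sum_k N_k exp(f_k - u_k(x_n)).

    u_kn[k, n] is the reduced potential of pooled sample n evaluated in
    state k; N_k[k] is the number of samples drawn from state k (0 for an
    unsampled state). Returns reduced free energies relative to state 0.
    """
    u_kn = np.asarray(u_kn, dtype=float)
    N_k = np.asarray(N_k)
    sampled = N_k > 0
    log_N = np.log(N_k[sampled])
    f = np.zeros(u_kn.shape[0])
    for _ in range(max_iter):
        # Log of the mixture denominator, built from sampled states only.
        log_den = _logsumexp(
            log_N[:, None] + f[sampled][:, None] - u_kn[sampled], axis=0)
        f_new = -_logsumexp(-u_kn - log_den[None, :], axis=1)
        f_new -= f_new[0]
        if np.max(np.abs(f_new - f)) < tol:
            return f_new
        f = f_new
    return f

# Toy system: u_k(x) = 0.5 * k * x**2; the third state is never sampled.
rng = np.random.default_rng(0)
k_force = np.array([1.0, 2.0, 1.5])
N_k = np.array([5000, 5000, 0])
x = np.concatenate([rng.normal(0.0, 1.0 / np.sqrt(k), n)
                    for k, n in zip(k_force, N_k)])
u_kn = 0.5 * k_force[:, None] * x[None, :] ** 2

f = mbar_free_energies(u_kn, N_k)
exact = 0.5 * np.log(k_force / k_force[0])  # analytical result for this model
print("MBAR :", f)
print("exact:", exact)
```

Only the energy reevaluation `u_kn` depends on the unsampled state, so adding further unsampled states costs one more row of reevaluated energies and no new sampling.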


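[Figure: the benchmark simulations sample configurations X_B,0 and X_B,1 at the two end states 0 and 1 with the benchmark potential U_B, giving U_B,0(X_B,0), U_B,1(X_B,0), U_B,0(X_B,1) and U_B,1(X_B,1), and hence ΔG_B. The same configurations are reevaluated with the expensive parameters, U_E(X_B), and with each trial parameter set i, U_i(X_B), giving ΔG_E and ΔG_i. The differences between parameter sets are labeled ΔG_BE, ΔG_Ei and ΔG_Bi.]
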
Samples generated only using the benchmark set are used to predict ΔG_Ei for 5184 parameter sets
It takes only a minute to reevaluate one set of parameters.
Reduction in time consumption: 540 CPU years → 1 CPU month
Our search results in optimized parameters that are statistically indistinguishable from the expensive parameters
[Figure: differences predicted using only the benchmark set compared with direct differences calculated using both sets.]
Efficiency achieved using the reweighting formalism is very promising.
We search through about 5200 parameter combinations, which involves the calculation of 3 million observables for 60,000 thermodynamic states.
Using MD for all the sets, the same analysis would have taken 540 CPU years.
We used MD for a single set and re-evaluated the energies for the rest of the sets; the whole exercise took one CPU month.
The re-evaluation formalism is faster than raw sample generation by roughly a factor of 4000.

How do we deal with changes in molecular geometry?
Parameter scans and molecular transformations involve not only alchemical transformations in LJ and Coulomb but also changes in geometry
Alchemical changes (growing and disappearing of atoms): changes in sigma, epsilon and charges
Geometry changes: changes in bond length, bond angle and dihedral angle
Present free energy methods lack the capacity to do free energy analysis for transformations involving changes in geometry.
A perturbed molecular geometry is never seen in the simulation of the original geometry.
In the FEP calculation for TIP4P → TIP3P,
$$\Delta G = G(\mathrm{TIP3P}) - G(\mathrm{TIP4P}) = -\beta^{-1} \ln \left\langle \exp\left(-\beta\,(U_{\mathrm{TIP3P}} - U_{\mathrm{TIP4P}})\right) \right\rangle_{\mathrm{TIP4P}}$$
we need to calculate U_TIP3P(X_TIP4P).
But we will never see a TIP3P water geometry in a TIP4P water simulation.
We can introduce a linear map T_ij which maps the TIP4P geometry to the TIP3P geometry, and then we can evaluate U_TIP3P(T_ij(X_TIP4P)).
We use linear maps between molecular geometries to estimate thermodynamic property differences
The linear map T_ij introduces a Jacobian term J_ij in the partition function integral.
J_ij can be calculated analytically and included in the free energy estimation algorithm.
This way we do not have to change anything in the MD code.

The MBAR equation is unchanged:
$$f_i = -\ln \sum_{j=1}^{K} \sum_{n=1}^{N_j} \frac{\exp\left(-u_i^{w}(x_{jn})\right)}{\sum_{k=1}^{K} N_k \exp\left(f_k - u_k^{w}(x_{jn})\right)}$$
J_ij is included in the effective potential:
$$u_i^{w}(x_{jn}) = \beta \left[ U_i\left(T_{ij}(x_{jn})\right) - \beta^{-1} \ln(J_{ij}) \right]$$
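
The effect of the map and its Jacobian can be seen in a small sketch with two one-dimensional harmonic oscillators that differ in force constant and equilibrium distance. Several things here are assumptions for illustration: the oscillators are untruncated (unlike the truncated set tested in the talk), the map T(x) = x0_i + sqrt(k_j/k_i) (x - x0_j) with Jacobian sqrt(k_j/k_i) is one convenient choice, reduced units (beta = 1) are used, and a single sampled state replaces the full MBAR sum.

```python
import numpy as np

def reduced_fep(du):
    """Return -ln <exp(-du)> with a numerically stable log-mean-exp."""
    m = np.min(du)
    return m - np.log(np.mean(np.exp(-(du - m))))

def u_harmonic(x, k, x0):
    """Reduced harmonic potential 0.5 * k * (x - x0)**2 (beta = 1)."""
    return 0.5 * k * (x - x0) ** 2

# Sampled state j and target state i (hypothetical example values).
k_j, x0_j = 1.0, 0.0
k_i, x0_i = 4.0, 2.0

rng = np.random.default_rng(1)
x = rng.normal(x0_j, 1.0 / np.sqrt(k_j), 2000)  # exact samples from state j

# Linear map from the geometry of j to the geometry of i, and its Jacobian.
scale = np.sqrt(k_j / k_i)
Tx = x0_i + scale * (x - x0_j)
log_J = np.log(scale)

# Effective reduced potential difference: u_i(T(x)) - ln J - u_j(x).
du_mapped = u_harmonic(Tx, k_i, x0_i) - log_J - u_harmonic(x, k_j, x0_j)
# Without mapping, state i is evaluated on configurations it rarely visits.
du_direct = u_harmonic(x, k_i, x0_i) - u_harmonic(x, k_j, x0_j)

exact = 0.5 * np.log(k_i / k_j)
print("exact   :", exact)
print("mapped  :", reduced_fep(du_mapped))
print("unmapped:", reduced_fep(du_direct))
```

For this idealized case the mapped energy difference is the same constant for every sample, so the estimate has no statistical spread. Real molecular transformations will not be this clean, but the sketch shows why a good map can sharply reduce the uncertainty.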
We test our new algorithm by estimating free energies and enthalpies for four transformations
A set of truncated harmonic oscillators: force constant and equilibrium distance are changed (2)
TIP4P → TIP3P molecule in liquid phase: charge, sigma and epsilon are changed and an additional site is introduced (4)
SPC-E → TIP3P molecule in liquid phase: charge, sigma, epsilon, OH bond length and HOH bond angle are changed (5)
TIP4P → SPC-E molecule in liquid phase: charge, sigma, epsilon, OH bond length and HOH bond angle are changed and an extra site is introduced (6)
Mapping makes configurations of different geometries
visible in simulations of all other intermediate states
We validate water transformations using
thermodynamic cycles.
Analytical free energy for truncated harmonic oscillators with spread


Water transformations







(ΔG_hyd)_TIP4P and (ΔG_hyd)_TIP3P are evaluated without mapping, ΔG_2a is evaluated using mapping and ΔG_1a can be evaluated analytically.
[Figure: two thermodynamic cycles. The first connects N TIP3P molecules in vacuum, N TIP4P molecules in vacuum, N TIP4P molecules in TIP4P water and N TIP3P molecules in TIP3P water through the legs ΔG_1a (vacuum), ΔG_2a (liquid), (ΔG_hyd)_TIP4P and (ΔG_hyd)_TIP3P. The second connects N TIP4P molecules in TIP4P water, N SPC-E molecules in SPC-E water and N TIP3P molecules in TIP3P water through the legs ΔG_2a, ΔG_2b and ΔG_2c.]
$$(\Delta G_{hyd})_{\mathrm{TIP4P}} - (\Delta G_{hyd})_{\mathrm{TIP3P}} = \Delta G_{2a} - \Delta G_{1a}$$
$$\Delta G_{2a} + \Delta G_{2b} + \Delta G_{2c} = 0$$
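Analytical free energy of a harmonic oscillator with spread σ_i:
$$f_i^{\mathrm{analytical}} = -\ln\left(\sqrt{2\pi}\,\sigma_i\right)$$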
Multistate reweighting with mapping drastically reduces the uncertainty of the calculation
Direct calculations
Linear removal of charge, soft-core removal of LJ interactions.
One molecule solvated using 21 intermediate states, each state simulated for 10 ns.

Mapped calculations
Linear variation of charge, LJ and geometry over 21 states, each state simulated for 10 ns.

Transformation | Direct calculation ΔG_hyd (kJ/mol) | Mapped ΔG_hyd (kJ/mol)
TIP3P → TIP4P | -0.0205 ± 0.0633 | 0.0410 ± 0.0002
SPC-E → TIP3P | 3.9479 ± 0.0632 | 4.0503 ± 0.0001
TIP4P → SPC-E | -3.9274 ± 0.0656 | -4.0912 ± 0.0002
About 300× lower uncertainty with mapping
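
The three tabulated transformations form a closed loop (TIP3P → TIP4P → SPC-E → TIP3P), so their sum should vanish within uncertainty. The sketch below checks this with the numbers from the table. It assumes that the second number in each table entry is a one-standard-deviation uncertainty and that the three uncertainties are independent, neither of which is stated on the slide.

```python
from math import sqrt

# (value, uncertainty) in kJ/mol, copied from the table.
direct = {
    ("TIP3P", "TIP4P"): (-0.0205, 0.0633),
    ("SPC-E", "TIP3P"): (3.9479, 0.0632),
    ("TIP4P", "SPC-E"): (-3.9274, 0.0656),
}
mapped = {
    ("TIP3P", "TIP4P"): (0.0410, 0.0002),
    ("SPC-E", "TIP3P"): (4.0503, 0.0001),
    ("TIP4P", "SPC-E"): (-4.0912, 0.0002),
}

def cycle_closure(legs):
    """Sum the legs of a closed cycle and propagate independent errors."""
    total = sum(value for value, _ in legs.values())
    sigma = sqrt(sum(err ** 2 for _, err in legs.values()))
    return total, sigma

direct_sum, direct_sigma = cycle_closure(direct)
mapped_sum, mapped_sigma = cycle_closure(mapped)
print(f"direct: {direct_sum:+.4f} +/- {direct_sigma:.4f} kJ/mol")
print(f"mapped: {mapped_sum:+.4f} +/- {mapped_sigma:.4f} kJ/mol")

# Ratio of per-leg uncertainties for the first transformation.
ratio = direct[("TIP3P", "TIP4P")][1] / mapped[("TIP3P", "TIP4P")][1]
print(f"uncertainty ratio: {ratio:.0f}")
```

Under these assumptions the mapped cycle closes to about 0.0001 kJ/mol, inside its propagated uncertainty of 0.0003 kJ/mol, and the per-leg uncertainty ratio is consistent with the roughly 300-fold reduction quoted on the slide.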
Mapped calculations require fewer intermediate states and fewer samples than direct calculations
MBAR without mapping requires 20,000 samples/state and 21 intermediate states to reach a statistical uncertainty of 0.06 kJ/mol in the free energy estimate.

MBAR with mapping requires just 50 samples/state and 11 intermediate states to reach the same precision (550 samples in total instead of 420,000). The speedup for this particular system is roughly a factor of 800.

Conclusion

It is possible to estimate thermodynamic properties for a large number of unsampled states using samples from just a few states.

It is possible to estimate free energies for states which do not share similar geometries by using a linear transformation which maps between the two geometries.

The reweighting technique, together with the mapping algorithm, can substantially speed up parameter scans.


Acknowledgement
I thank:
you all for your time and patience
the organizers for giving me a platform to discuss my research
NSF CHE-1152786 for funding
the entire Shirts group for all the encouragement and support
