Markov Chain Monte-Carlo Enhanced Variational Quantum Algorithms
strate both the effectiveness of our technique and the validity of our analysis through quantum
circuit simulations for MaxCut instances, solving these problems deterministically and with per-
fect accuracy. Our technique stands to broadly enrich the field of variational quantum algorithms,
improving and guaranteeing the performance of these promising, yet often heuristic, methods.
FIG. 1: Diagram of a random graph for MaxCut, VQE, and MCMC-VQA. Random graphs (black, Secs. I and
III) in this work are generated with normally distributed edge weights $w_i$. The objective is to minimize Eq. 1 by
optimally assigning each pair of vertices $v_{ia}, v_{ib} \in \{-1, 1\}$. MaxCut can be solved on a quantum computer by
mapping $v_{ia}, v_{ib} \to \sigma_{ia}, \sigma_{ib}$ and minimizing the corresponding $H$. See Sec. III for graph details. VQE (gray, Sec. I)
minimizes the loss function for each θ̂ by calculating the expectation value Λ(θ̂) and updating θ̂ with gradient
descent using ∇Λ(θ̂). MCMC-VQA (blue, Sec. II) uses gradient descent with ∇Λ(θ̂) and random noise ξΘr to
produce candidate state θ̂′ , but also calculates probability distributions P (θ̂) and P (θ̂′ ), as well as proposal
distributions G(θ̂′ |θ̂) and G(θ̂|θ̂′ ). Using these distributions, the acceptance distribution A(θ̂′ |θ̂) is calculated and
compared to a uniform random sample u ∼ U(0, 1). If A(θ̂′|θ̂) > u, then θ̂′ → θ̂. Otherwise, the MCMC-VQA
algorithm restarts with the original θ̂. (Red) After the maximum number of MCMC-VQA epochs $T_{MC}$ has
elapsed, the sampled parameters with the lowest loss, $\hat{\theta}_{min}$, are selected and the optimization completes with a
closing sequence of VQE epochs.
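
The loop described in the caption above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the scalar double-well `loss` is a hypothetical stand-in for Λ(θ̂), the large-M limit is taken so that the proposal noise comes only from ξ, and the hyperparameter values (`beta`, `eta`, `xi`, `t_mc`, `t_vqe`) are illustrative choices.

```python
import math
import random


def loss(theta):
    # Hypothetical stand-in for the VQA loss: a tilted double well with a
    # local minimum near theta = +0.96 and the global minimum near -1.04.
    return (theta ** 2 - 1.0) ** 2 + 0.3 * theta


def grad(theta, eps=1e-3):
    # Central finite difference, matching the form of the gradient in the text.
    return (loss(theta + eps) - loss(theta - eps)) / (2.0 * eps)


def log_normal_pdf(x, mean, std):
    # Log-density of a univariate normal distribution.
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def mcmc_vqa(theta0, beta=1.0, eta=0.05, xi=0.5, t_mc=400, t_vqe=100, seed=0):
    rng = random.Random(seed)
    theta = theta0
    best_theta, best_loss = theta, loss(theta)
    for _ in range(t_mc):
        # Candidate state: gradient step plus Gaussian noise of scale xi.
        mean_fwd = theta - eta * grad(theta)
        cand = mean_fwd + xi * rng.gauss(0.0, 1.0)
        mean_rev = cand - eta * grad(cand)
        # log[P(cand) G(theta|cand) / (P(theta) G(cand|theta))] with
        # P proportional to exp(-beta * loss); the normalization cancels.
        log_ratio = (-beta * (loss(cand) - loss(theta))
                     + log_normal_pdf(theta, mean_rev, xi)
                     - log_normal_pdf(cand, mean_fwd, xi))
        accept_prob = 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
        if accept_prob > rng.random():
            theta = cand
        if loss(theta) < best_loss:
            best_theta, best_loss = theta, loss(theta)
    # Closing sequence: plain gradient descent from the best sampled point.
    theta = best_theta
    for _ in range(t_vqe):
        theta -= eta * grad(theta)
    return best_theta, theta


best_theta, final_theta = mcmc_vqa(theta0=1.0)
print(best_theta, final_theta, loss(final_theta))
```

Starting the chain at θ = 1.0 places it in the basin of the local minimum, so the sketch illustrates how the noisy, Metropolis-corrected steps can move the best sample to a lower-loss region before the deterministic closing sequence takes over.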
dient of any $\theta_t^k \in \hat{\theta}_t$ can be calculated as
\[
\nabla_k \Lambda(\hat{\theta}_t) = \frac{\Lambda(\hat{\theta}_t + \epsilon\hat{k}) - \Lambda(\hat{\theta}_t - \epsilon\hat{k})}{2\epsilon}
\]
by finite difference. As $\nabla\Lambda(\hat{\theta}_t) \to 0$ in the vicinity of both global and local minima, VQE training is prone to stagnation at suboptimal solutions.

II. RESULTS

In this section, we present our novel method for enhancing the performance of VQAs with classical MCMCs, a technique that we dub MCMC-VQA. We start by briefly reviewing traditional MCMC, focusing on the Metropolis-Hastings algorithm. Then, we introduce MCMC-VQA, derive its behavior, and verify our findings with numerical simulations.

A. MCMC-VQA Method

MCMC algorithms, such as Metropolis-Hastings, combine the randomized sampling of Monte-Carlo methods with the Markovian dynamics of a Markov chain in order to randomly sample from a distribution that is difficult to characterize deterministically [32]. MCMC is particularly useful for approximations in high-dimensional spaces, where the so-called “curse of dimensionality” can make techniques such as random sampling prohibitively slow [43]. The core merit of MCMC techniques is their ergodicity, which guarantees that all states of the distribution are eventually sampled in a statistically representative way, regardless of which initial point is chosen. This representative sample is known as the unique stationary distribution $\pi$. In particular, any Markov chain that is both irreducible (each state has a non-zero probability of transitioning to any other state) and aperiodic (not partitioned into sets that undergo periodic transitions) will provably converge to its unique stationary distribution $\pi$, from which it samples ergodically [44]. The mathematical properties of ergodic Markov chains are well-studied, including analytic bounds for solution quality and mixing time (number of epochs) [45, 46].

In order to obtain $\pi$ for a distribution of interest, Metropolis-Hastings specifies the transition kernel $P(x'|x)$, which is the probability that state $x$ transitions to state $x'$. Typically, the Markov process is defined such that transitions satisfy the detailed balance condition:
\[
P(x)P(x'|x) = P(x')P(x|x'). \tag{4}
\]
When Eq. 4 holds, the chain is said to be reversible and is guaranteed to converge to a stationary distribution. $P(x'|x)$ can be factored into two quantities
\[
P(x'|x) = G(x'|x)A(x'|x), \tag{5}
\]
where $G(x'|x)$ is the proposal distribution, or the conditional probability of proposing state $x'$ given state $x$, and $A(x'|x)$ is the acceptance distribution, or the probability of accepting the new state $x'$ given state $x$. To satisfy Eq. 4, the acceptance distribution is defined as
\[
A(x'|x) = \min\left(1, \frac{P(x')G(x|x')}{P(x)G(x'|x)}\right). \tag{6}
\]
Note that as only the ratio $P(x')/P(x)$ is considered, the probability distribution need not be normalized. To determine whether the candidate state $x'$ or the current state $x_t$ should be used as the future state $x_{t+1}$, a sample $u$ is drawn from the uniform distribution $U(0, 1)$. If $A(x'|x_t) \geq u$, then $x_{t+1} = x'$ and we say that the candidate state $x'$ is accepted. Otherwise, $x_{t+1} = x_t$ and we say that $x'$ is rejected.

We now present the MCMC-VQA method. Fig. 1 contains a diagram of the algorithm (blue). In particular, we focus on an ergodic Metropolis-Hastings algorithm, which is guaranteed to sample states near global minima. We outline the algorithm both idealistically and experimentally, prove its ergodicity and convergence, and verify these findings with numerical simulations.

As we seek the lowest energy eigenstate when solving MaxCut via VQE, we define $P(\hat{\theta})$ as the Boltzmann distribution
\[
P(\hat{\theta}_a) = \exp(-\beta\Lambda_a)/Z, \qquad Z = \sum_i \exp(-\beta\Lambda_i), \tag{7}
\]
such that a state's probability increases exponentially with decreasing loss function.

To calculate the proposal distribution $G(\hat{\theta}'|\hat{\theta}_t)$, we must consider the sampling statistics of VQAs. Due to quantum uncertainty, a measurement $m_i^r(\hat{\theta}_t)$ of operators $\omega_i\sigma_{ia}\sigma_{ib}$ from Eq. 2 is a sample from a distribution with mean $\mu_t^i$ and variance
\[
(\Delta_t^i)^2 = \omega_i^2\left[\langle(\sigma_{ia}\sigma_{ib})^2\rangle_t - \langle\sigma_{ia}\sigma_{ib}\rangle_t^2\right] = \omega_i^2\left[1 - (\mu_t^i)^2\right]. \tag{8}
\]
The Central Limit Theorem asserts that, assuming at least $M \gtrsim 30$ independent and identically distributed measurements $m_i^r(\hat{\theta}_t)$, an estimate of the loss function $\Lambda_t$ is the statistic $l_t \sim \mathcal{N}\left(\Lambda_t, (\Delta_t^\Lambda)^2\right)$, where $(\Delta_t^\Lambda)^2 = \sum_i (\Delta_t^i)^2/M$ [47, 48]. Similarly, $\forall \theta_t^k \in \hat{\theta}_t$ and assuming small parameter shifts $\epsilon$, the gradient $\nabla_k\Lambda_t = [\Lambda(\hat{\theta}_t + \epsilon\hat{k}) - \Lambda(\hat{\theta}_t - \epsilon\hat{k})]/2\epsilon$ is the statistic $d_k l_t \sim \mathcal{N}\left(\nabla_k\Lambda_t, [\Delta_\Lambda^2(\hat{\theta}_t + \epsilon\hat{k}) + \Delta_\Lambda^2(\hat{\theta}_t - \epsilon\hat{k})]/4\epsilon^2\right)$. The variance of this distribution can be simplified by noting that to first order in $\epsilon$, the parameter shifted Pauli operators are $\sigma_{ia}^{\pm k} = \sigma_{ia}(\hat{\theta} \pm \epsilon\hat{k}) = \sigma_{ia} \pm \iota_{iak}$, where $\sigma_{ia} = \sigma_{ia}(\hat{\theta})$ and $\iota_{iak} = (\partial\sigma_{ia}/\partial\theta_k)\epsilon$. We can then simplify the sum $\Delta_i(\hat{\theta}_t + \epsilon\hat{k})^2 + \Delta_i(\hat{\theta}_t - \epsilon\hat{k})^2 = 2\Delta_i(\hat{\theta}_t)^2$ by
FIG. 2: Example trajectories with inverse thermodynamic temperature β = 0.8 (left) and β = 0.2 (right).
Four-hundred MCMC-VQA epochs (Markovian epochs) are followed by a closing sequence of VQE epochs
(beginning at red dashed line), which is initialized with the best parameters θ̂min found during the Markov process.
At lower temperature (β = 0.8), trajectories become trapped in local minima and reaching ergodicity is a lengthy
process. Conversely, the high-temperature (β = 0.2) trajectories rapidly reach burn-in, generating θ̂min that lead to
near perfect convergence during the VQE closing sequence. See Sec. III for simulation details.
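\[
\Delta_i(\hat{\theta}_t \pm \epsilon\hat{k})^2 = \langle(\omega_i\sigma_{ia}^{\pm k}\sigma_{ib}^{\pm k})^2\rangle - \langle\omega_i\sigma_{ia}^{\pm k}\sigma_{ib}^{\pm k}\rangle^2, \tag{9a}
\]
\[
\langle(\sigma_{ia}^{+k}\sigma_{ib}^{+k})^2\rangle + \langle(\sigma_{ia}^{-k}\sigma_{ib}^{-k})^2\rangle = 2 + O(\iota^2), \tag{9b}
\]
\[
\langle\sigma_{ia}^{+k}\sigma_{ib}^{+k}\rangle^2 + \langle\sigma_{ia}^{-k}\sigma_{ib}^{-k}\rangle^2 = 2\langle\sigma_{ia}\sigma_{ib}\rangle^2 + O(\iota^2). \tag{9c}
\]
Now, up to first order in $\iota$, we can derive the gradient's distribution
\[
d_k l_t \sim \mathcal{N}\left(\nabla_k\Lambda_t,\ \Delta_\Lambda^2(\hat{\theta}_t)/2\epsilon^2\right). \tag{10}
\]
Standard gradient descent would propose the candidate state $\hat{\theta}' = \hat{\theta} - \eta\nabla\Lambda_t$; however, MCMC-VQA adds a normally distributed random noise term $\Theta_r \sim \mathcal{N}(0, 1)$ with scale parameter $\xi$ in order to expand the support of the proposal distribution $G(\hat{\theta}'|\hat{\theta}_t)$. This specifies
\[
G(\hat{\theta}'|\hat{\theta}_t) = \prod_k G(\hat{\theta}'|\hat{\theta}_t)_k, \qquad
G(\hat{\theta}'|\hat{\theta}_t)_k = \mathrm{pdf}\left[\mathcal{N}\left(\eta\nabla_k\Lambda(\hat{\theta}_t),\ \xi^2 + \eta^2\frac{(\Delta_t^\Lambda)^2}{2\epsilon^2}\right)\right]\left(\hat{\theta}_t - \hat{\theta}'\right), \tag{11}
\]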
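
As a numerical companion to Eqs. 8, 10, and 11, the following Python sketch evaluates the shot-noise variance of the loss estimate, the resulting gradient variance, and the log-density of the Gaussian proposal. The function names and the example values of the weights, expectation values, and hyperparameters are hypothetical and chosen only for illustration.

```python
import math


def loss_variance(weights, mus, shots):
    # Sum of the single-observable variances of Eq. 8, divided by the number
    # of measurements M: sum_i w_i^2 * (1 - mu_i^2) / M.
    return sum(w * w * (1.0 - mu * mu) for w, mu in zip(weights, mus)) / shots


def gradient_variance(var_loss, eps):
    # Eq. 10: variance of the finite-difference gradient statistic.
    return var_loss / (2.0 * eps ** 2)


def proposal_variance(var_loss, eta, xi, eps):
    # Eq. 11: added noise xi^2 plus the scaled gradient variance.
    return xi ** 2 + eta ** 2 * gradient_variance(var_loss, eps)


def proposal_log_pdf(theta_old, theta_new, grads, var_loss, eta, xi, eps):
    # Log of the product over k of Gaussian densities evaluated at
    # (theta_old_k - theta_new_k) with mean eta * grad_k.
    var = proposal_variance(var_loss, eta, xi, eps)
    total = 0.0
    for t_old, t_new, g in zip(theta_old, theta_new, grads):
        diff = (t_old - t_new) - eta * g
        total += -0.5 * diff * diff / var - 0.5 * math.log(2.0 * math.pi * var)
    return total


# Hypothetical example: two edges, M = 100 shots per observable.
var_l = loss_variance(weights=[1.0, 2.0], mus=[0.0, 0.6], shots=100)
var_g = gradient_variance(var_l, eps=0.1)
var_p = proposal_variance(var_l, eta=0.1, xi=0.5, eps=0.1)
print(var_l, var_g, var_p)
```

Working in log-densities is a design choice of this sketch: the acceptance ratio of Eq. 6 only needs the difference of two such values, which avoids underflow when many parameters are multiplied together.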
B. Implementation of MCMC-VQA on Quantum Hardware

As discussed above, the loss function $\Lambda_t$ is not precisely determined on actual quantum hardware, but rather estimated as a statistic $l_t = \sum_i q_t^i$, where $q_t^i = \frac{1}{M}\sum_{r=1}^{M} m_i^r(\hat{\theta}_t)$. As a result, the variance of a single observable measurement $(\Delta_t^i)^2$ is estimated by $(\delta_t^i)^2 = \omega_i^2[1 - (q_t^i)^2]$, while that of the total loss function $(\Delta_t^\Lambda)^2$ is estimated by $(\delta_t^\Lambda)^2 = \sum_i (\delta_t^i)^2/M = \sum_i \omega_i^2[1 - (q_t^i)^2]/M$, for $M$ measurements per observable. Alternatively, the variances could be directly estimated from the standard deviations of expectation value statistics. We then define $a(\hat{\theta}'|\hat{\theta}_t)$, the acceptance distribution on quantum hard-

C.1. Irreducibility

\[
p(\hat{\theta}_1|\hat{\theta}_a)\,p(\hat{\theta}_b|\hat{\theta}_T)\prod_{i=1}^{T-1} p(\hat{\theta}_{i+1}|\hat{\theta}_i) > 0. \tag{14}
\]
That is, the Markov chain is irreducible if, for any two points in parameter space $\hat{\theta}_a, \hat{\theta}_b$, there exists a series of transitions of any length $T$ such that $\hat{\theta}_a \to \hat{\theta}_b$ with non-zero probability [49]. While this definition of irreducibility is sufficient, we will instead focus on the yet more powerful condition of strong irreducibility. A Markov chain is strongly irreducible iff
\[
g(\hat{\theta}_a|\hat{\theta}_b) > 0, \quad \forall\hat{\theta}_a, \hat{\theta}_b, \tag{15}
\]
meaning that all points in parameter space have a non-zero probability of transitioning to all other points [50]. This condition is then equivalent to
FIG. 3: (Left, blue) Average MCMC-VQA accuracy (1 − α, for average error α) vs inverse thermodynamic
temperature β. Nearly perfect average accuracy is obtained for properly tuned hyperparameter β (here, β ≈ 0.2).
At low temperature (large β), the algorithm mixes slowly, only partially approximating ergodicity in TMC = 400
Markovian epochs. This partial convergence results in lower accuracy, which approaches that of traditional VQE
(blue dashed line) in the limit of large β. Conversely, for high temperature (small β), the algorithm is insufficiently
biased towards low-energy solutions, which renders its gradient descent inefficient and reduces its accuracy. (Left,
gray) The standard deviation of MCMC-VQA accuracy vs β. Higher standard deviation directly corresponds with
lower accuracy. As discussed above, at high β, this is due to runs trapped in local minima (see Fig. 2), while at low
β, this stems from the lack of energy-preferred convergence. (Right) Optimal value of ξ vs β, where ξ is the gradient
descent noise parameter (θ̂′ = θ̂ − η∇Λt + ξΘr ) and each trajectory undergoes TMC = 400 Markovian epochs. As
higher temperatures generate more permissive acceptance distributions A(θ̂′ |θ̂), higher ξ values lead to more efficient
mixing in the low-β limit. See Sec. III for simulation details.

\[
g(\hat{\theta}_b|\hat{\theta}_a)_k = \frac{(2\pi)^{-1/2}}{\sqrt{\xi^2 + \eta^2(\delta_a^\Lambda)^2/2\epsilon^2}} \exp\left[\frac{-\left(\theta_a^k - \theta_b^k - \eta d_k l_a\right)^2}{2\left(\xi^2 + \eta^2(\delta_a^\Lambda)^2/2\epsilon^2\right)}\right] > 0, \quad \forall k, \tag{16}
\]
where we note that $\delta_\Lambda^2(\hat{\theta}_t) \propto 1/M$.

Eq. 15 is satisfied, at least technically to some tolerance, $\forall\hat{\theta}_a, \hat{\theta}_b$. Although $g(\hat{\theta}_b|\hat{\theta}_a)_k$ may become very small, it will generally retain a non-zero probability for virtually all transitions, and the chain will be strongly irreducible, albeit perhaps slow to converge. More precise arguments can be made in the limit of large $\xi$, where to first order in small $1/\xi$, $g(\hat{\theta}_b|\hat{\theta}_a)_k \to 1/(\sqrt{2\pi}\,\xi)$ and all transitions become equally likely. While this extreme $\xi$ limit is too random to result in efficient gradient descent, it illustrates a concrete transition to irreducibility with increasing $\xi$. Moreover, due to the uncertainty introduced by finite statistics $d_k l_a$ and $(\delta_a^\Lambda)^2$, sampling of the proposal kernel $g(\hat{\theta}_b|\hat{\theta}_a)_k$ can allow for otherwise unlikely transitions.

C.2. Aperiodicity

In the case of strong irreducibility argued above (Eq. 15), aperiodicity is automatically satisfied. Assuming only the weaker irreducibility of Eq. 14, it is sufficient to show that [49]
\[
a(\hat{\theta}_a|\hat{\theta}_a)\,g(\hat{\theta}_a|\hat{\theta}_a) = g(\hat{\theta}_a|\hat{\theta}_a) = \frac{(2\pi)^{-1/2}}{\sqrt{\xi^2 + \eta^2(\delta_a^\Lambda)^2/2\epsilon^2}} \exp\left[\frac{-\left(\eta d_k l_a\right)^2}{2\left(\xi^2 + \eta^2(\delta_a^\Lambda)^2/2\epsilon^2\right)}\right] > 0. \tag{17}
\]
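
The behavior of the per-parameter kernel in Eq. 16 can be checked numerically. The sketch below, with a hypothetical function name and illustrative argument values, evaluates the kernel for one parameter index and compares it against the large-ξ limit 1/(√(2π) ξ) quoted in the text.

```python
import math


def kernel_k(theta_a, theta_b, d_k, var_loss_est, eta, xi, eps):
    # Eq. 16 for a single parameter index k: a Gaussian density in
    # (theta_a - theta_b) with mean eta * d_k and variance
    # xi^2 + eta^2 * var_loss_est / (2 * eps^2).
    var = xi ** 2 + eta ** 2 * var_loss_est / (2.0 * eps ** 2)
    diff = theta_a - theta_b - eta * d_k
    return math.exp(-diff * diff / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# Moderate noise: a transition across 3 units is unlikely but strictly positive.
print(kernel_k(0.0, 3.0, 0.5, 0.04, 0.1, 0.5, 0.1))

# Large noise: the kernel flattens toward 1 / (sqrt(2 * pi) * xi).
big_xi = 1e3
print(kernel_k(0.0, 3.0, 0.5, 0.04, 0.1, big_xi, 0.1) * math.sqrt(2.0 * math.pi) * big_xi)
```

In double-precision arithmetic the density underflows to exactly zero for sufficiently distant points at small ξ, which is one concrete way to read the "to some tolerance" caveat when the kernel is evaluated on a classical computer.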
FIG. 4: Average accuracy vs Markovian epochs for three different β values. Gray dots are the average MCMC-VQA
accuracy 1 − α, and blue curves are a least squares fit of this data to the analytical accuracy of an ergodic Markov
chain 1 − αMC (τ ), with theoretical mixing time τ (see Eq. 18). The analytical time-dependence of αMC matches the
observed scaling of α, affirming that MCMC-VQA is an ergodic Markov chain, and thus guaranteeing convergence to
the global minimum. Furthermore, the ratio of observed scale parameters between MCMC-VQA simulations with
different β values is consistent with the analytic dependence $\tau \propto \ln(1/\sqrt{\pi^*})$ (Eq. 18) on the least likely state
$\pi^* \propto \exp(-\beta\Lambda_{max})$ (Eq. 7). This functional dependence on temperature further supports our claims of ergodically
sampling from P (θ̂) and thus deterministically converging to the global minimum.
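
The temperature dependence described in this caption can be illustrated with a small calculation. The sketch below assumes a hypothetical discrete set of loss values (the actual parameter space is continuous) and uses only the proportionality τ ∝ ln(1/√π*) stated above, with π* taken from the Boltzmann distribution of Eq. 7; the prefactor of τ is not modeled.

```python
import math


def least_likely_probability(losses, beta):
    # Boltzmann probability (Eq. 7) of the highest-loss state.
    z = sum(math.exp(-beta * value) for value in losses)
    return math.exp(-beta * max(losses)) / z


def mixing_time_scale(losses, beta):
    # The quantity ln(1 / sqrt(pi*)) to which the mixing time is proportional.
    return math.log(1.0 / math.sqrt(least_likely_probability(losses, beta)))


# Hypothetical loss landscape with three states.
losses = [-1.0, 0.0, 1.0]
for beta in (0.2, 0.5, 0.8):
    print(beta, mixing_time_scale(losses, beta))
```

For this toy landscape the scale grows with β, in line with the observation that low-temperature (large β) chains mix more slowly.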
pecially nonconvex and thus prone to convergence in local minima [19]; however, MCMC-VQA can be used with arbitrary parameterization. The circuit gates are alternated between a layer of single-qubit parameterized rotations (angles θ̂) about the y-axis and a layer of two-qubit controlled-Z gates. For each method (VQE or MCMC-VQA) and set of hyperparameters, a variety of learning rates are scanned so that numerical comparisons can be drawn against the optimal performance of each algorithm. All VQE sequences consisted of 100 epochs. Fig. 2 shows an ensemble of trajectories, whereas Figs. 3 and 4 show the average over the optimal learning rate for all ten graphs and 20 random initializations. For simplicity, we take the large M limit, assuming many measurements and precise expectation values.

IV. DISCUSSION

In this work, we have introduced MCMC-VQA: a novel variational quantum algorithm that harnesses classical Markov chains to obtain analytic convergence guarantees for parameterized quantum circuits. As ergodic Markov chains representatively sample a target probability distribution, they identify regions near the global minimum with high probability. We present MCMC-VQA from both a theoretical and a practical perspective, prove its ergodicity, and derive its time-complexity (mixing time) as a function of both accuracy and inverse thermodynamic temperature. Focusing on MaxCut optimization within the VQE framework due to its plentiful local minima and employing a reversible Metropolis-Hastings Markov process, we demonstrate the ergodicity of our method, the validity of our analytical findings, and the capacity of MCMC-VQA to not only outperform traditional VQAs, but to do so with up to perfect and deterministic convergence.

In future research, MCMC-VQA should be studied for a variety of different applications, quantum algorithms, and Markov processes. In addition to quantum optimization, VQAs have been employed to address a myriad of topics in both quantum chemistry [13–15] and condensed matter physics [16–18]. Moreover, even simple quantum Hamiltonians, such as the transverse field Ising model, are known to acutely struggle with premature convergence to local, rather than global, minima. Similarly, our technique could be extended to QAOA [4] or any of the numerous VQAs that have been proposed in recent years. Finally, tens of MCMCs have been devised over the past 70 years, each with their own advantages, with variations featuring Gibbs sampling [56], parallel tempering [57], and independence sampling [58]. These methods could be substituted for Metropolis-Hastings in order to produce algorithms with lower computational overhead and faster mixing times. In short, varieties of MCMC-VQA can be developed for a broad spectrum of variational quantum algorithms to both improve and guarantee performance.

ACKNOWLEDGEMENTS

O.S. would like to thank Katie Pizzolato for accommodating HPC resource requests on IBM Cloud. This work was done during T.L.P.’s internship at IBM Quantum, for which T.L.P. thanks Katie Pizzolato and the entire IBM Quantum team. S.F.Y. would like to acknowledge funding by NSF and AFOSR.

[1] While finalizing this manuscript, we became aware of another work applying Markov Chain Monte-Carlo technique in quantum algorithms [59]. However, we differentiate our work by targeting near-term quantum algorithms and providing the proof of ergodicity.
[2] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, New Journal of Physics 18, 023023 (2016).
[3] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, Nature Communications 5, 4213 (2014).
[4] E. Farhi, J. Goldstone, and S. Gutmann, arXiv preprint arXiv:1411.4028 (2014).
[5] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and P. J. Coles, Nature Reviews Physics 3 (2021).
[6] W. Lavrijsen, A. Tudor, J. Müller, C. Iancu, and W. de Jong, in 2020 IEEE International Conference on Quantum Computing and Engineering (QCE) (IEEE, 2020) pp. 267–277.
[7] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, et al., Nature Reviews Physics, 1 (2021).
[8] M. R. Garey and D. S. Johnson, Computers and Intractability, Vol. 29 (W. H. Freeman, New York, 2002).
[9] G. Nannicini, Phys. Rev. E 99, 013304 (2019).
[10] L. Braine, D. J. Egger, J. Glick, and S. Woerner, IEEE Transactions on Quantum Engineering 2, 1 (2021).
[11] T. L. Patti, J. Kossaifi, A. Anandkumar, and S. F. Yelin, “Variational quantum optimization with multi-basis encodings,” (2021), arXiv:2106.13304 [quant-ph].
[12] B. Fuller, C. Hadfield, J. R. Glick, T. Imamichi, T. Itoko, R. J. Thompson, Y. Jiao, M. M. Kagele, A. W. Blom-Schieber, R. Raymond, and A. Mezzacapo, “Approximate solutions of combinatorial problems via quantum relaxations,” (2021), arXiv:2111.03167 [quant-ph].
[13] S. McArdle, S. Endo, A. Aspuru-Guzik, S. C. Benjamin, and X. Yuan, Rev. Mod. Phys. 92, 015003 (2020).
[14] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, Nature 549, 242 (2017).
[15] H. R. Grimsley, S. E. Economou, E. Barnes, and N. J. Mayhall, Nature Communications 10 (2019).
[16] M. B. Ritter, in Journal of Physics: Conference Series, Vol. 1290 (IOP Publishing, 2019) p. 012003.
[17] N. Vogt, S. Zanker, J.-M. Reiner, T. Eckl, A. Marusczyk, and M. Marthaler, “Preparing symmetry broken ground states with variational quantum algorithms,” (2020), arXiv:2007.01582 [quant-ph].
[18] F. Zhang, N. Gomes, Y. Yao, P. P. Orth, and T. Iadecola, Phys. Rev. B 104, 075159 (2021).
[19] J. Lee, A. B. Magann, H. A. Rabitz, and C. Arenz, Phys. Rev. A 104, 032401 (2021).
[20] D. Beaulieu and A. Pham, arXiv preprint arXiv:2108.13464 (2021).
[21] D. J. Egger, J. Mareček, and S. Woerner, Quantum 5, 479 (2021).
[22] W. van Dam, K. Eldefrawy, N. Genise, and N. Parham, arXiv preprint arXiv:2108.08805 (2021).
[23] J. Rivera-Dean, P. Huembeli, A. Acín, and J. Bowles, “Avoiding local minima in variational quantum algorithms with neural networks,” (2021), arXiv:2104.02955 [quant-ph].
[24] S. M. Harwood, D. Trenev, S. T. Stober, P. Barkoutsos, T. P. Gujarati, S. Mostame, and D. Greenberg, arXiv preprint arXiv:2102.02875 (2021).
[25] O. Shehab, I. H. Kim, N. H. Nguyen, K. Landsman, C. H. Alderete, D. Zhu, C. Monroe, and N. M. Linke, arXiv preprint arXiv:1906.00476 (2019).
[26] S. Bravyi, D. Gosset, and R. König, Science 362, 308 (2018).
[27] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Nature Communications 9, 1 (2018).
[28] T. L. Patti, K. Najafi, X. Gao, and S. F. Yelin, Phys. Rev. Research 3, 033090 (2021).
[29] C. Ortiz Marrero, M. Kieferová, and N. Wiebe, PRX Quantum 2, 040316 (2021).
[30] Z. Holmes, K. Sharma, M. Cerezo, and P. J. Coles, “Connecting ansatz expressibility to gradient magnitudes and barren plateaus,” (2021), arXiv:2101.02138 [quant-ph].
[31] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Nature Communications 12 (2021).
[32] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, Journal of the Royal Statistical Society. Series D (The Statistician) 47, 69 (1998).
[33] H. Robbins and S. Monro, The Annals of Mathematical Statistics, 400 (1951).
[34] C. Nemeth and P. Fearnhead, Journal of the American Statistical Association 116, 433 (2021), https://doi.org/10.1080/01621459.2020.1847120.
[35] M. Szegedy, in Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’04 (IEEE Computer Society, USA, 2004) p. 32–41.
[36] K. Temme, T. J. Osborne, K. G. Vollbrecht, D. Poulin, and F. Verstraete, Nature 471, 87 (2011).
[37] J. Lemieux, B. Heim, D. Poulin, K. Svore, and M. Troyer, Quantum 4, 287 (2020).
[38] A. Montanaro, Proc. R. Soc. A. 471 (2016), https://doi.org/10.1098/rspa.2015.0301.
[39] A. Cornelissen and S. Jerbi, “Quantum algorithms for multivariate monte carlo estimation,” (2021), arXiv:2107.03410 [quant-ph].
[40] Y. Wang, S. Wu, and J. Zou, Statistical Science 31, 362 (2016).
[41] M. Medvidovic and G. Carleo, npj Quantum Information 7 (2021), 10.1038/s41534-021-00440-z.
[42] C. W. Commander, “Maximum cut problem, max-cut,” in Encyclopedia of Optimization, edited by C. A. Floudas and P. M. Pardalos (Springer US, Boston, MA, 2009) pp. 1991–1999.
[43] C. J. Geyer, Statistical Science 7, 473 (1992).
[44] S. P. Brooks, Journal of the Royal Statistical Society. Series D (The Statistician) 47, 69 (1998).
[45] R. Montenegro and P. Tetali, Found. Trends Theor. Comput. Sci. 1, 237–354 (2006).
[46] N. M. March (2011).
[47] T. K. Kim, Korean Journal of Anesthesiology 68, 540 (2015).
[48] S. G. Kwak and J. H. Kim, Korean J Anesthesiol. 70 (2017), 10.4097/kjae.2017.70.2.144.
[49] C. Daskalakis, “6.896: Probability and computation,” (2011).
[50] N. Whiteley, “The metropolis-hastings algorithm,” (2008).
[51] T. L. Patti, J. Kossaifi, S. F. Yelin, and A. Anandkumar, “Tensorly-quantum: Quantum machine learning with tensor methods,” (2021), arXiv:2112.10239 [quant-ph].
[52] “Tensorly-quantum: Tensor-based quantum machine learning,” (2021).
[53] E. N. Gilbert, The Annals of Mathematical Statistics 30, 1141 (1959).
[54] D. Coppersmith, D. Gamarnik, M. Hajiaghayi, and G. B. Sorkin, Random Structures & Algorithms 24, 502 (2004).
[55] T. Luczak, in Proceedings of Random graphs, Vol. 87 (1990) pp. 151–159.
[56] A. E. Gelfand, Journal of the American Statistical Association 95, 1300 (2000).
[57] D. J. Earl and M. W. Deem, Physical Chemistry Chemical Physics 7, 3910 (2005).
[58] W. K. Hastings, Biometrika 57, 97 (1970).
[59] G. Mazzola, “Digital quantum advantage in monte carlo simulations of frustrated spin models,” (in prep.).