Taxonomy For Physics Beyond Quantum Mechanics
JRH, 0000-0001-8587-7618
Received: 24 October 2023
Accepted: 10 June 2024
Subject Category: Physics
Subject Areas: quantum physics
Keywords: quantum, taxonomy, interpretations, models, theories
Author for correspondence: Jonte R. Hance
e-mail: [email protected]

We propose terminology to classify interpretations of quantum mechanics and models that modify
or complete quantum mechanics. Our focus is on models which have previously been referred to
as superdeterministic (strong or weak), retrocausal (with or without signalling, dynamical or
non-dynamical), future-input-dependent, atemporal and all-at-once, not always with the same
meaning or context. Sometimes, these models are assumed to be deterministic, sometimes not; the
word ‘deterministic’ has been given different meanings, and different notions of causality have
been used when classifying them. This has created much confusion in the literature, and we hope
that the terms proposed here will help to clarify the nomenclature. The general model framework
that we will propose may also be useful to classify other interpretations and modifications of
quantum mechanics. This document grew out of the discussions at the 2022 Bonn Workshop on
Superdeterminism and Retrocausality.
1. Introduction
Quantum mechanics, despite its experimental success,
has remained unsatisfactory for a variety of reasons,
notably due to its tension with locality, and due to
the measurement problem [1,2]. Different authors have formulated their unease with quantum
mechanics in different ways. Analysing the origin of this unease is not the purpose of this
present article. The purpose is instead to sort out the confusion in the terminology used to
describe this unease.

© 2024 The Author(s). Published by the Royal Society under the terms of the
Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/,
which permits unrestricted use, provided the original author and source are credited.
analytic. For more on the relation between calculational models and computer models, see §2f.
We will, in the following, use a notation in which curly brackets $\{\cdot\}$ denote sets of
mathematical assumptions. The set $\overline{\{\cdot\}}$ is the set of all assumptions that can be
derived from $\{\cdot\}$ (including the elements of $\{\cdot\}$ themselves).
The different components of the modelling framework have the following properties:
Inputs (of a calculational model): The inputs $I$ of a calculational model are a (at most countably
infinite) set of mathematical assumptions, each of which is an input $I_i$, that describes the
scenario. To be part of the inputs, an assumption must differ between at least two scenarios.
We will denote this set as $I := \{ I_i \mid i \in K \subseteq \mathbb{N}^+ \}$, where $K$ is the (at most
countably infinite) index set.
To aid readability, and because the index set will not matter in the following, we will from here
Loosely speaking, you can think of the inputs as the mathematical representation of a
scenario. A typical example might be the temperature in the laboratory, or the frequency of
a laser. However, the inputs of a model do not necessarily have to correspond to definite
observable properties of the scenario. They could also be expressing ignorance about certain
properties of the scenario and thus be random variables with probability distributions, or they
might indeed be entirely unobservable. We will come back to this point in §3a.
The inputs of a c-model are often assignments of values to variables. For example, a c-model
may work with an unspecified variable x that is an element of some space. The input may then
assign the value x = 3 to this variable for a specific scenario. For such an input, we will refer to
the value it assigns as the input value. We want to stress that the input value is not the same as
the input. The input is ‘x = 3’, the input value is ‘3’.
A typical example of a c-model that we are all familiar with would be the harmonic
oscillator. In this case, the model would be the differential equation $m\ddot{x} = -kx$ and a complete
set of inputs would be value assignments for $k$ and $m$, plus two initial values for, say, $x(t = 0)$
and $\dot{x}(t = 0)$.
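The split between model and inputs in the harmonic-oscillator example can be sketched in code. The following is a minimal illustration, not part of the framework itself: the class name `Inputs`, the function `position` and the numerical values are assumptions chosen for the example. The differential equation is fixed in the function (the model), while the value assignments change from scenario to scenario (the inputs).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Inputs:
    """Scenario-dependent value assignments (hypothetical container)."""
    k: float   # spring constant
    m: float   # mass
    x0: float  # x(t = 0)
    v0: float  # dx/dt at t = 0


def position(inputs: Inputs, t: float) -> float:
    """Output x(t), deduced from the model m*x'' = -k*x together with the inputs."""
    omega = math.sqrt(inputs.k / inputs.m)
    return inputs.x0 * math.cos(omega * t) + (inputs.v0 / omega) * math.sin(omega * t)


# Two scenarios: the model is the same, only the inputs differ.
scenario_a = Inputs(k=4.0, m=1.0, x0=1.0, v0=0.0)
scenario_b = Inputs(k=1.0, m=1.0, x0=0.0, v0=2.0)
```

Note that in this sketch the input is the whole assignment (for example `k=4.0`), while `4.0` alone is the input value.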
However, inputs of a calculational model are not necessarily value assignments. They might
also be constraint-equations, or boundary conditions, or something else entirely. For example,
when studying stellar equilibrium, it is quite common to enter an equation of state as the input
to the Tolman–Oppenheimer–Volkoff (TOV) equations. In this case, the TOV equations are the
same for all stellar objects that one may want to consider; they are hence part of the model. The
equation of state, on the other hand, changes from one type of star to another; it is hence part of
the input.
While our main interest is in models that describe the real world, it is also possible to
study a model’s properties with scenarios that do not exist in reality. We will refer to those as
hypothetical scenarios. They include, but are not limited to, counterfactual realities, as well as
universes with different constants of nature. (Note Frigg & Nguyen [4], among others, have also
discussed representation in scientific modelling.)
Calculational model: A set $C$ of mathematical assumptions $A_k$ that are independent of the
scenario. We will denote this set as $C := \{A_k\}$.
Set-up: The union of a calculational model and its inputs, $F := I \cup C$.
Outputs (of a calculational model): All mathematical statements that can be deduced from the set-up,
but from neither the model nor its inputs in isolation, $O = \overline{F} \setminus (\overline{I} \cup \overline{C})$,
where the overline denotes the set of everything derivable from a set of assumptions, as introduced above.
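The definitions of set-up and outputs can be made concrete with a toy deduction system. This is only a sketch under strong simplifying assumptions: assumptions are either plain facts (strings) or rules of the form "premises imply conclusion", and "derivable" means reachable by repeatedly applying the rules. The names `closure` and `outputs` and the example statements are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

Fact = str
Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)
Assumption = Union[Fact, Rule]


def closure(assumptions: Set[Assumption]) -> Set[Assumption]:
    """Everything derivable from `assumptions`, including the assumptions themselves."""
    derived = set(assumptions)
    rules = [a for a in assumptions if isinstance(a, tuple)]
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def outputs(inputs: Set[Assumption], model: Set[Assumption]) -> Set[Assumption]:
    """Statements derivable from the set-up, but from neither inputs nor model alone."""
    setup = inputs | model
    return closure(setup) - (closure(inputs) | closure(model))


# Toy model: two scenario-independent rules. Toy inputs: two value assignments.
model = {
    (frozenset({"k=4", "m=1"}), "omega=2"),
    (frozenset({"omega=2"}), "period=pi"),
}
inputs = {"k=4", "m=1"}
```

In this toy case `outputs(inputs, model)` contains exactly the two derived statements, and it is empty if either the inputs or the model are left out.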
Predictions for observables are obtained from the outputs of the model. However, not all
outputs of a model need to be observable. The prime example is quantum mechanics, where
the outputs contain the time-evolution of the wavefunction, but the wavefunction itself is not
observable. But there are many other examples, such as the production of gravitational waves
by a black hole merger. Given suitable inputs, the model (General Relativity) will output a
mathematical description for the creation and propagation of gravitational waves, but we only
measure their arrival on Earth, and only measure that through the waves’ influence on matter,
$\{A_k\} = \{A_{k_1}\} \cup \{A_{k_2}\}$, so that both $C_1$ and $C_2$ are each set-ups of calculational models and the
combination of their outputs is the same as the outputs of C. A simpler way to put this is that,
if we split the set-up of an irreducible model into two, some of the output can no longer be
calculated. This approach is reminiscent of identifying particles in quantum field theory from
the irreducible representations of the Poincaré group.
We need this requirement because otherwise, we could just join different set-ups to form a
new one, which would make it impossible to classify any. Note that this does not mean the
composition of two set-ups is not a set-up. On the contrary, a composition of two set-ups is a
set-up; it is just that the combined set-up is no longer irreducible. The issue is that, if we allowed
reducible combinations of set-ups, we could not meaningfully assign properties to them. It
would be like asking which fruit is fruit salad. Once we have, however, succeeded in identifying
properties of irreducible set-ups, we can join those with the same properties together, and
meaningfully assign the same property to the reducible set-up. Using the above fruit example,
if we have identified several fruits as apples, we can join them and be confident we have apple
salad.
A particularly relevant case of a reducible set-up is one in which some inputs or assumptions
can be removed without changing anything about the outputs. This may be because the
assumptions are not independent (in the sense that some can be derived from the others), or
because an assumption is simply not used to calculate the outputs.
One might be tempted to add to the requirements of a model that its assumptions are
consistent (given problems such as the Principle of Explosion with inconsistent models [5]).
However, as is well known, for recursively enumerable sets, Gödel’s theorem [6] tells us that
we cannot in general prove the consistency of the assumptions. We might then try to settle
on the somewhat weaker requirement that at least the assumptions should not be known to
be inconsistent. However, it sometimes happens in physics that a model works well in certain
parameter ranges despite being inconsistent in general. An example may be the Standard
Model without the Higgs field and its boson [7]. For this reason, we will here not add any
requirement about consistency. One may justify this by taking a purely instrumental approach.
We only care whether a set of assumptions is any good at describing observations.
1
An example of this is the case of the Standard Model with its fundamental constants not fixed but taken as variable
inputs. This cannot be a correct model for scenarios in our universe because in our universe the constants are constant,
hence they cannot be inputs. One can, however, consider hypothetical scenarios with different values of the constants, often
interpreted as a type of multiverse. But, of course, one could alternatively interpret these hypothetical scenarios as different
versions of the Standard Model that, alas, happen to not agree with our observations. The point is that whether the
Standard Model with fundamental constants that do not agree with our observations describes a hypothetical alternative
universe, or is just a wrong model for our universe, is a matter of definition.
calculational model, then $M := [C]_m$ is the m-model that encompasses all mathematically
equivalent formulations
$[C]_m := \{ \{B_j\} : \{B_j\} \Leftrightarrow \{A_i\} \}$. (2.1)
it with the more old-fashioned ∇ , ∇ ⋅ and ∇ × operators. These two definitions would be two
different calculational models, but the same mathematical model. A similar equivalence covers
switching from the Schrödinger picture to the Heisenberg picture in quantum mechanics. We
believe that most physicists would not call these different models.
The inputs of a mathematical model are likewise an equivalence class: it is the class of all sets
of assumptions which change with the scenario that, together with the mathematical model,
produce equivalent outputs.
The inputs of a mathematical model are usually not just the set of assumptions that are
mathematically equivalent to the inputs of one of the calculational models. This is because the
assumptions of the model itself may render certain inputs equivalent that are not equivalent
without the model. An example that will be relevant for what follows is that a model with a
time-reversible evolution operator will produce identical outputs for input states at times $t_1$ and
$t_2 > t_1$, if the input state at $t_2$ is the forward evolution of the state from $t_1$. In this case, the input is
m-equivalent, though the inputs at different times are not equivalent to each other without the
model.
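The time-reversible example can be checked numerically. The sketch below assumes a harmonic oscillator with exact analytic evolution as the time-reversible model; the frequency `OMEGA`, the helper names and the chosen times are illustrative assumptions. An input state at $t_1$ and its forward evolution to $t_2$ are different as bare value assignments, yet together with the model they produce the same output.

```python
import math
from typing import Tuple

State = Tuple[float, float]  # (position x, velocity v)

OMEGA = 1.5  # hypothetical angular frequency, fixed as part of the model


def evolve(state: State, dt: float) -> State:
    """Exact evolution of x'' = -OMEGA**2 * x over an interval dt (dt may be negative)."""
    x, v = state
    c, s = math.cos(OMEGA * dt), math.sin(OMEGA * dt)
    return (x * c + (v / OMEGA) * s, -x * OMEGA * s + v * c)


def output_at(t_out: float, t_in: float, state_in: State) -> State:
    """Output at time t_out, computed from an input state given at time t_in."""
    return evolve(state_in, t_out - t_in)


t1, t2, t_out = 0.0, 0.7, 2.0
input_t1: State = (1.0, 0.0)
input_t2: State = evolve(input_t1, t2 - t1)  # forward evolution of the state from t1

out_from_t1 = output_at(t_out, t1, input_t1)
out_from_t2 = output_at(t_out, t2, input_t2)
```

Up to floating-point rounding, `out_from_t1` and `out_from_t2` agree, although `input_t1` and `input_t2` do not.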
We will refer to different calculational models as representations of their m-class.
2
Glymour uses the term ‘theory equivalence’, not ‘model equivalence’. This is not a relevant distinction for the purposes of
this present paper: please see §2e for discussion.
Representation (of an m-model): A calculational model within the m-class of that m-model.
In Argaman [14], a similar concept, that of two c-models within the same m-class, was referred
mathematical models into yet another, even bigger class and say they constitute the same
physical model, or p-model for short. We will denote this equivalence class with $[\,\cdot\,]_p$ and refer to
it as the p-class. Note that we could take either the p-equivalence class of a computational model
or that of a mathematical model. Models in the same p-class will be called p-equivalent.
By calling these models physical, we do not mean to imply that we only consider observable
properties as physically real. The reader should not take our nomenclature to imply any
statement about realism or empiricism. We call them physical just because they describe what
physicists in practice can distinguish with physical measurements.
This now gives us a way to define what we mean by an interpretation:
Interpretation (of a p-model): Any one of the mathematical models in the p-class.
Note that according to this nomenclature,3 each interpretation has its own representations. That
is, a c-model is a representation of an m-class, but more specifically it is a representation of a
particular interpretation.
For example, if we take the Copenhagen Interpretation (CI) in one of its common axiomatic
formulations (e.g. as given in Zurek [16]), that set of axioms will constitute a particular
representation, hence, a c-model. The CI per se is the class of all mathematically equivalent models.
We can then further construct the physical equivalence class of the CI, which we will in the
following refer to as Standard Quantum Mechanics (SQM $:= [\mathrm{CI}]_p$). Any other mathematical
model that is physically but not mathematically equivalent to the CI is also an interpretation of SQM.
For the purposes of this article, we will not need to specify exactly what we mean by CI.
However, we will take it to be only a first-quantized theory. That is, it is not a quantum field
theory. If one were to specify further details, one would in particular have to decide whether
one considers the relativistic version, or only the non-relativistic case, as some alternative
interpretations work only non-relativistically [17].
That we have interpretations and not just representations is a possibility which appears
in quantum mechanics because it has outputs that are unobservable in principle. A different
computational model which affects only unobservable outputs might not be mathematically
equivalent and yet give rise to the same physics.
3
Note others have given different definitions of interpretations for quantum mechanics, e.g. [15].
Modification or completion or extension (of a p-model): Another p-model that is not physically
equivalent but so far empirically equivalent.
(e) Theories
We will in the following not distinguish between models and theories. Loosely speaking, a
theory is a class of models that can be used for a large number of scenarios, congruent with the
semantic approach of Suppes [18] and van Fraassen [19]. However, physicists do not tend to use
the terms theory and model in a well-defined way.
For example, we use the terminology of Quantum Field Theory in general, and the Standard
Model in particular. We refer to General Relativity as a theory, and to ΛCDM as a model. This
agrees with what we said above. But we also refer to Fermi’s theory of beta decay, and the BCS
theory, though those would better be called models. To make matters worse, we also sometimes
call supersymmetry a theory, when it is really a property of a class of m-models, and so on. For
the purposes of this article, we will not need to distinguish theories from models, so we will not
bother with a precise definition of the term ‘theory’.
[Figure 2: diagram; surviving labels: ‘Interpretations of’, ‘Modifications of’.]
Figure 2. A mathematical model $M_{ijk}$ is an equivalence class of computational models $C_{ijkl}$. A physical model $P_{ij}$ is an
equivalence class of mathematical models $M_{ijk}$. An empirical model $E_i$ is an equivalence class of physical models $P_{ij}$.
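The nesting of classes described in the caption of Figure 2 can be sketched as a small lookup on the indices. This is an illustration only: it assumes each c-model is labelled by a hypothetical index tuple (i, j, k, l) as in $C_{ijkl}$, so that (i, j, k) identifies its m-class, (i, j) its p-class and i its empirical class. The function name `relation` and the returned phrases are assumptions of the sketch.

```python
from typing import Tuple

Index = Tuple[int, int, int, int]  # (i, j, k, l), as in C_ijkl


def relation(a: Index, b: Index) -> str:
    """Describe how two c-models relate, given their class indices."""
    if a == b:
        return "same c-model"
    if a[:3] == b[:3]:
        return "representations of the same m-model"
    if a[:2] == b[:2]:
        return "representations of different interpretations of the same p-model"
    if a[:1] == b[:1]:
        return "representations of p-models that are modifications of one another"
    return "empirically distinct"
```

For instance, `relation((1, 1, 1, 1), (1, 1, 2, 1))` reports two interpretations of one p-model, while `relation((1, 1, 1, 1), (1, 2, 1, 1))` reports a modification: not physically equivalent, but so far empirically equivalent.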
are certain types of analogue computers, which however can better be understood as simulations
again.
That is to say, when one wants to classify a computer model according to our scheme, one
should take its algorithmic definition as a set of assumptions, and use that to define a calculational
model and its inputs. This calculational model, defined by its executable algorithm, will
be different from the calculational model that uses an analytic expression.
(One can then further ask to what accuracy the algorithm, when executed on a physical
computer, will approximate the output of the analytical model. This is a relevant point which
was previously brought up in [22]. While an analytically defined calculational model might
be time-reversible, an algorithmic approximation of it run on a computer will in general no
longer be time-reversible. This is because errors will build up differently depending on which
direction of time one runs the algorithm. This is particularly obvious for time-evolutions which
are chaotic. In such cases, the forward-evolution and the backward-evolution of the algorithm
as executed on the computer will actually be two different calculational models, and both are
different from the analytical calculational model that they approximate.)
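A simple way to see that an algorithmic model need not inherit time-reversibility is the following sketch. It uses an explicit Euler discretization of the harmonic oscillator, which is an assumption made for the example (the text does not single out any algorithm), and it illustrates discretization error in a non-chaotic system rather than rounding error in a chaotic one. The step size and step count are arbitrary illustrative values.

```python
from typing import Tuple


def euler_step(x: float, v: float, dt: float) -> Tuple[float, float]:
    """One explicit-Euler step for x'' = -x (unit mass and unit spring constant)."""
    return x + dt * v, v - dt * x


def run(x: float, v: float, dt: float, n_steps: int) -> Tuple[float, float]:
    """Apply the Euler step n_steps times; a negative dt runs the algorithm backward."""
    for _ in range(n_steps):
        x, v = euler_step(x, v, dt)
    return x, v


DT, N = 0.01, 1000
x_fwd, v_fwd = run(1.0, 0.0, DT, N)          # forward-evolution algorithm
x_back, v_back = run(x_fwd, v_fwd, -DT, N)   # same algorithm, run backward in time

# The analytic model would return exactly to (1.0, 0.0). For this scheme, each
# forward/backward pair of steps rescales the state by (1 + DT**2), so the
# round trip ends near x = 1.105 instead of 1.0.
```

The forward and backward algorithms are thus not inverses of each other, even though the differential equation they approximate is time-reversible.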
This property was termed ‘holistic determinism’ in Adlam [24] and differs from a more
common definition of determinism, that connects one moment in time with a later one. We
will elaborate on this in the next subsection. For now, we further follow [23] and distinguish
determinism from predictability:
Predictable: An m-model is predictable if it is deterministic, and the inputs are all derived from
observable properties of the scenario.
Both determinism and predictability are m-model properties, because we expect a redefinition
of the mathematics that one uses to process inputs to not change predictions for observable
output. If it did, we would assume something is wrong with our idea of what is
observable.
The distinction between deterministic and predictable is that a model may have inputs
that are unknowable in principle. Typically, these are value assignments for variables that are
usually referred to as ‘hidden variables’. A model with such hidden variables, according to the
above terminology, may be deterministic and yet unpredictable.
Hidden variables: Hidden variables, that we will denote κ, are input values to a c-model that
cannot be inferred from any observation on the scenario.
We want to stress that these hidden variables are in general not localized and sometimes not
even localizable in any meaningful way. We will say more about localizable variables in the
next subsection, but a simple example is that we could use Fourier-transforms of space–time
variables, or just extensive quantities that are properties of volumes. Note that hidden variables
are defined for c-models, not for m-models. This is because hidden variables can be redefined
into (not necessarily local) random variables with an equivalent mathematical outcome. That is
to say, mathematically, it makes no difference whether a variable was unknowable or indeed
random.
While it may sound confusing at first to distinguish determinism from predictability, it will
be useful in what follows. Indeed, the reader may have recognized that Bohmian Mechanics
is deterministic yet not predictable. In Bohmian Mechanics, observables can be calculated with
certainty if the inputs are specified, yet the inputs are also assumed to be partly unobservable in
principle, so predictions can still not be made with certainty. Since determinism is a property of
an m-model that cannot be removed with a redefinition, it follows that Bohmian Mechanics is
not a representation of SQM. We elaborate on this further in the companion paper [3].
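The difference between deterministic and predictable can be illustrated with a toy rule. The sketch below is not Bohmian Mechanics or any model from the literature: the function `outcome`, the measurement `setting` and the uniform treatment of the hidden variable `kappa` are all assumptions chosen for illustration. Given every input, the output is fixed; if `kappa` cannot be inferred from observation, only relative frequencies can be stated.

```python
import math


def outcome(setting: float, kappa: float) -> int:
    """Deterministic toy rule: the output is fixed once all inputs are given."""
    return 1 if math.cos(setting - kappa) >= 0.0 else -1


def frequency_of_plus(setting: float, n_samples: int = 1000) -> float:
    """Fraction of +1 outcomes when kappa is unknown and spread evenly over [0, 2*pi)."""
    kappas = [2.0 * math.pi * j / n_samples for j in range(n_samples)]
    return sum(outcome(setting, k) == 1 for k in kappas) / n_samples
```

With `kappa` specified, `outcome` always returns the same value, so the toy rule is deterministic. Without it, `frequency_of_plus` is close to one half for any setting, so the rule is not predictable in the sense used above.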
4
We use the common mathematical abbreviation ‘iff’ for ‘if and only if’.
distance. But it could alternatively be some other structure that performs a similar function,
such as a lattice, or a discrete network with a suitably defined metric.
To make sense of space and time, we will in the following consider only a restricted m-class
For models which have a proper notion of space–time, and a distance measure on it, we then
want to identify how local they are. For this, we first have to identify the input values that
can be assigned to space–time, which Bell called ‘local beables’ [28]. According to Bell, local
beables ‘can be assigned to some bounded space–time region’. We will take ‘assigned to’ to
mean that the variable is the value of a function whose domain is space–time, or whatever the
stand-in for space–time is in the model at hand. Note ‘region’ might be a point set.
For example, if space–time is parameterized by coordinates $(\vec{x}, t)$, and we have a function
$f: (\vec{x}, t) \to \mathbb{C}$, then $f(\vec{x}, t)$ is a local beable. The domain of the SQM wavefunction is generically
configuration space and it is therefore not a local beable, though under certain circumstances
local beables can be obtained from it (such as the single-particle wavefunction in position
space). Local beables are not necessarily observable.
Bell’s definition of a local beable makes the assignment to a space–time region optional
(can be assigned). However, if the assignment was optional, then it could be omitted, and a
model with an assumption that can be omitted is reducible. Since we only deal with irreducible
models, we therefore already know that if a variable is assigned to a space–time region, then
this assignment is actually necessary. For this reason, we define
Local beable: An input value of a c-model that is assigned to a compact region of space–time.
we can just define the state $f(x, t) \otimes f(x', t') \in M \otimes M$ and call that a local beable at $(x, t)$, just
on perspective, one might consider such a model as either ultra-local or ultra-non-local. It is
ultra-local in that we do not need to know the state at any other location than $(x, t)$ to calculate
the time-evolution. It is ultra-non-local because any point in space–time contains a copy of the
entire universe.
A similar problem would occur with parameters, such as ℏ, that could be transformed into
Typical examples of such inputs are event relations, consistency requirements for histories,
evolution laws, temporal boundary conditions or superselection rules. Something as mundane
as an integral over time that depends on the scenario would also be an all-at-once input.
All-at-once (AAO) model: A c-model that uses all-at-once input.
The use of all-at-once input is a priori a property of c-models. That is to say, it might be possible
to reformulate a model with AAO input into a mathematically equivalent model that does not
have this property.
The principle of least action in classical mechanics, for example, uses all-at-once input (the
action), but given that the Lagrangian fulfils suitable conditions, the principle can equivalently
be expressed by the Euler–Lagrange equations which do not require AAO input. Another
example may be the cosmological model introduced by Carroll & Remmen [31], which uses
a constraint on the space–time average of the Lagrangian density. The authors refer to this as
non-local, and in some sense it is, but it is more importantly also an all-at-once input.
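The equivalence between the all-at-once formulation and the local one can be written out for a simple case. The following is a standard textbook derivation, shown here for the harmonic-oscillator Lagrangian as an illustrative assumption; the endpoints $t_1$ and $t_2$ and the fixed-endpoint variation are part of that assumption.

```latex
% Assumption for illustration: harmonic-oscillator Lagrangian, variations vanish at the endpoints.
\[
  S[x] = \int_{t_1}^{t_2} L(x, \dot{x})\, dt ,
  \qquad
  L = \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} k x^2 .
\]
% All-at-once statement: the action over the whole time interval is stationary.
\[
  \delta S = \int_{t_1}^{t_2}
  \left( \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} \right)
  \delta x \, dt = 0
  \quad \text{for all } \delta x \text{ with } \delta x(t_1) = \delta x(t_2) = 0 .
\]
% Equivalent local statement: the Euler--Lagrange equation, here the oscillator equation.
\[
  \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = \frac{\partial L}{\partial x}
  \quad \Longrightarrow \quad
  m \ddot{x} = -k x .
\]
```

The first form refers to the whole interval at once, whereas the last line needs only quantities at a single time, which is the sense in which the all-at-once input can be removed by a mathematically equivalent reformulation.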
Now that we have localized variables, we can define a notion of locality. We will first leave
aside causality and introduce a notion of Continuity of Action (CoA), as done in Wharton &
Argaman [22]. The idea is that if one wants to calculate the outputs $O_A$ for a region $A$, then
it is sufficient to have all the information on a closed hypersurface, $S_1$, surrounding the region,
and adding local beables from another region outside $S_1$ provides no further information.
CoA (locality): An m-model has continuous action or fulfils CoA or is local iff it fulfils the
condition $P(O_A \mid I_{S_1}, I_B) = P(O_A \mid I_{S_1})$ for any region $S_1$ that encompasses $A$ but not $B$
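The conditional-probability requirement can be checked mechanically for a finite toy distribution. The sketch below ignores the space–time geometry entirely and assumes that the inputs on the surface, the inputs in the outside region and the outputs in region A each take a few discrete values; the function name `fulfils_coa` and both example distributions are hypothetical. Input combinations with zero probability are skipped, so the check is applied only to combinations that can occur.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Keys are (input on S1, input in B, output in A); values are joint probabilities.
Joint = Dict[Tuple[str, str, str], float]


def fulfils_coa(joint: Joint, tol: float = 1e-12) -> bool:
    """Check P(O_A | I_S1, I_B) == P(O_A | I_S1) for all possible input combinations."""
    p_sb = defaultdict(float)  # P(I_S1, I_B)
    p_s = defaultdict(float)   # P(I_S1)
    p_so = defaultdict(float)  # P(I_S1, O_A)
    for (s, b, o), p in joint.items():
        p_sb[(s, b)] += p
        p_s[s] += p
        p_so[(s, o)] += p
    outcomes = {o for (_, _, o) in joint}
    for (s, b), p_inputs in p_sb.items():
        if p_inputs == 0.0:
            continue  # input combination that cannot occur: not tested
        for o in outcomes:
            lhs = joint.get((s, b, o), 0.0) / p_inputs  # P(O_A | I_S1, I_B)
            rhs = p_so[(s, o)] / p_s[s]                 # P(O_A | I_S1)
            if abs(lhs - rhs) > tol:
                return False
    return True


# Toy distribution 1: the output in A depends only on the input on S1.
local_joint: Joint = {
    ("s0", "b0", "+"): 0.25, ("s0", "b1", "+"): 0.25,
    ("s1", "b0", "-"): 0.25, ("s1", "b1", "-"): 0.25,
}
# Toy distribution 2: the output in A depends on the input in B.
nonlocal_joint: Joint = {
    ("s0", "b0", "+"): 0.25, ("s0", "b1", "-"): 0.25,
    ("s1", "b0", "+"): 0.25, ("s1", "b1", "-"): 0.25,
}
```

In the first distribution, learning the input in B adds nothing once the input on the surface is known; in the second it changes the probability of the output in A from one half to one or zero.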
a mathematical constraint (as happens with the Pauli exclusion principle). In this case, one
might have a situation where, e.g. $P(I_{S_1}, I_B) = 1$ and $P(I_{S_1}, I_{B'}) = 0$ just because the latter
it is rather meaningless to talk about violations of locality for configurations which do not
physically exist. Hence, if a model has input constraints, one should only apply the locality
requirement to inputs that lead to physically possible scenarios.
We define CoA as an m-model property because, if it could be removed with a
mathematical redefinition, we believe most physicists would not accept the definition as
meaningful.
The term ‘non-locality’ has been used to refer to many other definitions in the foundations
of physics in general, and quantum mechanics in particular. For example, in field theories,
non-locality often refers to dynamical terms of higher order, a definition that is also used in
General Relativity [34]. In quantum field theories, non-locality usually refers to the failure of
operators to commute outside the light cone. Even in quantum foundations, non-locality may
refer to different properties. For example, entanglement itself is sometimes considered non-local
despite being locally created [35]. The latter in particular has created a lot of confusion:
while it has been experimentally shown that entanglement exists and is a non-local correlation,
this does not imply that nature is non-local in the sense in which the term is used in
Bell’s theorem; non-locality in that sense has not been shown [36].
Surveying all those different notions of non-locality is beyond the scope of this present work.
However, we want to stress that as definitions, none of these notions of non-locality are wrong.
They are just that: definitions. We chose our definition to resemble closely the ‘spooky action at
a distance’ that Einstein worried about.
According to our definition, a calculational model fulfils CoA even if it cannot be directly
evaluated whether it fulfils the requirement on the conditional probabilities, so long as a
mathematically equivalent reformulation of the model fulfils it. The most relevant example
for our purposes is that $[\mathrm{CI}]_m$ does not fulfil CoA. This is because making a measurement in B
provides information about the measurement outcome in A that was not available in S1. This is
Einstein’s ‘spooky action at a distance’.
That is, with the terminology we have introduced so far, the CI and all mathematically
equivalent formulations are equally non-local. The question is then merely whether this is
something that we should worry about. If one can understand the wavefunction as an epistemic
state (a state of knowledge), then its non-local update is not a priori worrisome.
Figure 3. (a) CoA. (b) Strong CoA. [Space–time diagrams with axes Space and Time; labelled regions: B, S1, S2.]
CoA loosely speaking means that localizable variables can only influence their nearest
neighbours. However, it does not require that this influence lies within the light cones. To
arrive at a stronger condition, we will therefore now, as is customary, assign the light cones and
their insides to a space–time region, A. We will denote with L(A) the union of all space–time
And correspondingly,
Weak CoA (locally superluminal): An m-model has Weak CoA if it fulfils CoA but not Strong CoA.
Weak CoA, roughly speaking, means that influences happen locally, but sometimes superluminally.
What we call Strong CoA was called Einstein locality in Maudlin [37]. Local and non-local models
can further be distinguished into those which are super- and subluminal. It is rather uncommon
to consider subluminal non-locality, but it will be helpful in what follows to clearly distinguish
non-locality from superluminality. We can define subluminal non-locality by requiring that the only
local beables necessary to find out what happens at A are those within or on the light-cones of A, and
And, consequently,
Non-locally superluminal: An m-model is non-locally superluminal iff it is non-local, but not
non-locally subluminal.
We should also mention here that Strong CoA is a weaker criterion than confining CoA
inside the light cones, because of the requirement that one only considers regions S2 which
do not overlap with the light cones of region B. The reason for this requirement, as noted by
Bell already, is that otherwise the outputs in B might well provide additional information that
correlates with the inputs from S2 (and hence, the outputs in A) without influences ever leaving
the light cones (see appendix for explanation). But, of course, if one extends the light cones of
regions A and B far enough, they will always overlap. This means that one can always try to
explain violations of Strong CoA by locating an origin of correlations between A and B in the
overlap of the light cones.5 We will come back to this later.
[Figure 4: space–time diagram with axes Space and Time; arrows labelled, e.g., ‘Locally superluminal’ and ‘Non-locally superluminal’.]
Figure 4. Difference between local, non-local and superluminal. Note that local and non-local are here distinguished by
having solid/dotted lines respectively, and that each arrow is only one representative for the entire quadrant (e.g. one can
imagine a reflected version of the left-to-right ‘locally subluminal’ arrow, going instead right-to-left, which is also valid, so
long as it is in both the past and future light cones).
5
This is often called a common ‘cause’. However, all that is required here is a correlation, not a causation.
Like before, it is possible to have a model that just does not have a temporal order, or that
does not distinguish past and future. Indeed, this is the case for many of the simplest models
that we deal with, such as an undamped harmonic oscillator, or the two-body problem in
Temporally deterministic: A c-model is temporally deterministic if it is both deterministic, and has
an arrow of time, according to which calculating localizable output values at time t′ does not
require inputs that are local beables at t > t′. An m-model is temporally deterministic if at least
one of its c-models is.
This is a complicated way of saying that, in a temporally deterministic model, a future state
can be calculated with certainty from a past state, but not necessarily the other way round. This
notion of temporally deterministic is what is often meant by the term ‘deterministic’. Note that
a temporally deterministic model might have other inputs besides value-assignments for local
beables. In particular, a model with all-at-once inputs may still be temporally deterministic.
We have defined temporal determinism of an m-model from the requirement that at least
one of its c-models has that property, because temporal determinism is easy to remove by
redefining all variables so that they mix different times, or using (partially) time-like boundary
conditions.
In the CI, so long as no measurement occurs, the state of the wavefunction at time $t_a$ is
temporally determined by the state of the wavefunction at time $t_b \neq t_a$ and the Hamiltonian
operator. The state of the wavefunction after measurement, on the other hand, is generically not
determined by the state of the wavefunction before measurement. Hence, the CI (c-model) is
not temporally deterministic.
It does not follow from this that SQM, which we defined as $[\mathrm{CI}]_p$, is also not temporally
deterministic. It is nevertheless the case, because, as we saw earlier, SQM is not deterministic to
begin with, so it cannot be temporally deterministic either.
For temporal models, we can further define:
As mentioned earlier, this definition implicitly makes a statement about what properties we
expect of time, and hence cannot stand on its own. That is not its purpose. Its purpose is rather
to capture what properties notions like time and time-reversibility should have.
Time-reversibility should not be confused with invariance under time-reversal, which is
a stronger requirement, but one that we will not consider here. Just because a model is
time-reversible, does not mean that its time-reversed version is the same as the original.
Next, we recall a term previously introduced in Wharton and Argaman [22]:
Future input dependence (FID): A c-model has an FID iff, to produce output for time t, it uses local
beables from at least one time t′ > t for at least one scenario.
FID is a property of the set-up of the c-model, and may simply be a matter of choice. For
example, in any time-reversible c-model, one can replace a future input with a past input and
get the same outputs. We define it here because it was previously used in Wharton & Argaman
Future input requirement (FIR): A c-model has an FIR iff there is at least one scenario for which
producing outputs for time t requires inputs from t′ > t. An m-model has an FIR if all c-models
Figure 5. (a) Local causality. (b) A subtlety of local causality. [Space–time diagrams with axes labelled Time and Space, showing the region S3.]
time as a preferred slicing in space–time. This will give us a total of eight distinctions. It is then
straightforward to define:
Local retrocausality: An m-model is locally retrocausal iff it fulfils Strong CoA and has a future
input requirement.
Non-local causality: An m-model is non-locally causal iff it is non-local, subluminal and has no
future input requirement.
One can similarly define the terms non-local pseudo-retrocausality, locally superluminal
pseudo-retrotemporality and non-locally superluminal pseudo-retrotemporality, by taking the
definition without the ‘pseudo’ and replacing the future input requirement with a future input
dependence.
The reason we use the prefix ‘pseudo’ is that, according to our earlier definition, a future
input dependence is a matter of choice. It can be replaced with a past input, at least in principle.
This does not mean that a future input dependence is unimportant, however. This is because
Dynamical retrocausality may or may not be due to a space–time structure with closed
time-like curves. It is important to emphasize that dynamical retrocausality is a property of
the model, and not a property of the way the model uses inputs. Depending on how the
arrow of time is defined, it may not be particularly meaningful. What makes an arrow of time
meaningful, however, is not a question we want to unravel here.
Just for completeness, we also define:
Counter-causal: An m-model is counter-causal iff its time-reversed version is locally causal.
A model with such a property would make one strongly suspect that the arrow of time was
just defined the wrong way round to begin with. However, possibly there were other reasons to
define an arrow of time that way.
A general comment is in order here, which is that the term ‘retrocausal’ is somewhat
linguistically confusing. It does not so much refer to causes generally going backwards, but
rather to them sometimes going against an arrow of time that was defined from something
else. That is, it is really a mix of different directions of time that marks a retrocausal model, the
already mentioned property that has previously been referred to as the possibility of zigzags in
space–time [42]. Note that the zigzag property itself is defined against the presumed-to-exist GR
arrow of time.
Figure 6. Orientable and non-orientable arrows of time. The arrows in the figure indicate the hypothetical flow of internal
time of an observer, which might differ from the coordinate time. That is, the arrows are not in all places time-like, according
to the coordinate time. This is supposed to illustrate the often-used example in which a spaceship that can travel faster than
light in one frame actually seems to go back in time in another frame. If that was possible, one could use it to construct loops
that seem to be ‘timelike’ from the point of view inside the spaceship (i.e. permissible motion for massive objects), but not
according to the coordinate time.
controllable, an experimenter need not have free will, whatever that might mean, and they also
do not have to control the local beable themselves; it could be done by some kind of apparatus.
If controllable input is correlated with observable output, we will speak of signalling.
Signalling is particularly interesting if it is outside the forward light cones.
Superluminal signalling: A c-model allows superluminal signalling iff it is superluminal and has
controllable inputs which are local beables in a region A that are correlated with observable
outputs that are local beables in a region outside L(A). An m-model allows superluminal
signalling if at least one c-model in its class does.
Retrocausal signalling: A c-model allows retrocausal signalling if it is retrocausal and there is
at least one scenario for which observables at time t are correlated with required controllable
inputs from t′ > t. An m-model allows retrocausal signalling if at least one c-model in its
equivalence class does.
This retrocausal signalling could either be local or non-local. As in the non-signalling
case discussed in §3d, causal and retrocausal non-local signalling can only be distinguished
in the presence of an arrow of time, and the same is the case for temporal and retrotemporal
superluminal signalling, since Lorentz-transformations can mix both cases.
The problem with retrocausal signalling is that the observables at time t are, well, observable.
If they can be affected by input at a later time t′ > t, then the result may disagree with what one
had already observed. This is what opens the door to causal paradoxes.
We will not introduce a notion of pseudo-retrocausal signalling: such a definition would be
technically possible, but rather oxymoronic. If a future input was removable and therefore
not necessary for an earlier observable, then no signal was sent (though in such a case an agent
might still have an illusion of signalling).
4. Specific model properties
In this section, we will now introduce properties that are specific to models typically used in the
From this, we can tell that all local interpretations or modifications of quantum mechanics can
be classified by the ways in which they violate SI (see footnote 6). We will hence refer to them
all as SI-violating models.
SI is also sometimes referred to as ‘measurement independence,’ or the ‘free choice
assumption’ or the ‘free will assumption’ in Bell’s theorem. On rare occasions, we have seen
it being referred to as ‘no conspiracy’. In recent years, theories which violate SI have also been
dubbed ‘contextual’ [48], though the class of contextual models is larger than just those which
violate SI (there is more ‘context’ to an experiment than its measurement setting).
Not all models that are being used in quantum foundations reproduce all predictions of
quantum mechanics. Many of them only produce output for certain experimental situations,
typically Bell-type tests, interferometers or Stern–Gerlach devices. We can then ask, in a
hopefully obvious generalization of the above classification, whether these models are either
representations or interpretations of [CI]m as it applies to the same experiments.
Figure 7. Sketch of Cosmic Bell test with source S and two detectors. The settings of the detectors D1 and D2 are chosen by
using photons from distant astrophysical sources, usually quasars. Any past cause that could have given rise to the observed
correlations must then have been very far back in time. Light cones indicated by grey shading.
Footnote 6: It is sometimes argued that the Many Worlds Interpretation is a counterexample to this claim [47]. However, as laid out in
the companion paper [3], the Many Worlds Interpretation is either not empirically equivalent to SQM, or violates CoA.
Traditionally, a distinction has been made between SI-violating models that are either
retrocausal or superdeterministic. However, this distinction has remained ambiguous for three
reasons.
First, because, as we have seen already, retrocausality itself has been used for a bewildering
variety of cases. If one then defines superdeterminism as those SI-violating models which
are not retrocausal, one gets an equally bewildering variety. Second, since Bell (who coined the
term ‘superdeterminism’) did not distinguish retrocausality from superdeterminism, one could
reasonably argue that superdeterminism should be equated with SI-violation in general, and
then consider retrocausality to be a variant of superdeterminism. Third, not everyone agrees
that superdeterministic models have to be deterministic to begin with [49,50].
There is no way to define the term so that it agrees with all the ways it has previously been
used. We will therefore just propose a definition that we believe agrees with the way it has most
widely been used, based on the following reasoning:
(i) We are not aware of any superdeterministic model which is not also deterministic.
Leaving aside that it is terrible nomenclature to speak of ‘non-deterministic
superdeterminism’, there is not even an example for it. For this reason, we will assume that
superdeterministic models are deterministic.
(ii) We want a model that fulfils Strong CoA, because this is the major reason why violations
of SI are interesting, and it is also the context in which Bell coined the expression.
(iii) Most of the literature seems to consider superdeterminism and retrocausality as two
disjoint cases, so we will do the same, even though Bell seems to not have used this
distinction when he coined the term. This point, together with the previous one, implies
that the model has to be locally causal.
(iv) We want a distinction that refers to the m-model, not to its particular realization as a
c-model, to avoid new ambiguities.
(v) The model should reproduce the predictions of quantum mechanics, at least to a
reasonable extent. This assumption is relevant because without it Newtonian mechanics
would also be superdeterministic, which makes no sense.
(vi) It is a one-world model that violates SI.
Some readers might argue that requirement (vi) is strictly speaking unnecessary because it follows
from Bell’s theorem given the previous five requirements. However, since we did not explicitly
list the assumptions to Bell’s theorem, and it is somewhat controversial which of those have to
be fulfilled in any case, we add it as an extra requirement.
Taking this together, we arrive at the following definition:
Superdeterminism: An m-model with additional variables which is deterministic, locally causal,
violates SI, and is empirically equivalent to SQM (as in [CI]e).
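The definition is a conjunction of five properties, which can be sketched as a simple predicate. The class and field names below are hypothetical labels chosen for illustration; deciding whether a given m-model actually has each property is the hard part and is not addressed by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MModelProperties:
    """Hypothetical record of the properties used in the definition."""
    has_additional_variables: bool
    deterministic: bool
    locally_causal: bool
    violates_si: bool
    empirically_equivalent_to_sqm: bool


def is_superdeterministic(m: MModelProperties) -> bool:
    """True iff all five conditions of the definition hold."""
    return (
        m.has_additional_variables
        and m.deterministic
        and m.locally_causal
        and m.violates_si
        and m.empirically_equivalent_to_sqm
    )


# Illustrative inputs only: a model meeting every condition, and one that
# fails empirical equivalence, the case requirement (v) is meant to exclude.
candidate = MModelProperties(True, True, True, True, True)
not_quantum = MModelProperties(True, True, True, True, False)

assert is_superdeterministic(candidate)
assert not is_superdeterministic(not_quantum)
```

Because the predicate refers only to properties of the m-model, it does not depend on which c-model is used to realize it, in line with requirement (iv).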
5. Classification
We want our classification scheme to be practically useful, so we will here provide a short guide
to how it works.
The question we want to address is this: suppose you have a c-model for quantum mechanics
(that is, the thing you are doing your calculations with), what should you call it? We will
assume that your model is presently empirically equivalent to SQM in the non-relativistic limit.
(If it is not, you have bigger problems than finding a name for it.) You then proceed as follows:
(i) To classify the model, you first have to make sure that its set-up is irreducible. If it is
reducible, the set-up cannot be classified, so please remove all assumptions that are not
necessary to calculate outputs. Assumptions that state the ‘physical existence’ (whatever
that may be) of one thing or another are typically unnecessary for any calculation.
(ii) Figure out whether your model is physically equivalent to SQM. If it is not, it is a
modification. If it is, it is an interpretation.
(iii) Figure out whether your model is mathematically equivalent to any already known
interpretation. If it is, your model is a representation (of the interpretation it is
mathematically equivalent to). If it is not, you have a representation of a new interpretation.
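The three steps above can be sketched as a small decision procedure. This is only an illustration: the record type and its field names are hypothetical, the equivalence checks themselves are assumed to have been carried out by hand, and applying step (iii) only to interpretations is an assumption about how the steps combine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CModelReport:
    """Hypothetical summary of the checks done on a c-model."""
    setup_is_irreducible: bool
    physically_equivalent_to_sqm: bool
    # Name of a known interpretation the model is mathematically
    # equivalent to, or None if there is no such interpretation.
    known_equivalent_interpretation: Optional[str] = None


def classify(report: CModelReport) -> List[str]:
    """Return the labels suggested by steps (i) to (iii)."""
    # Step (i): a reducible set-up cannot be classified.
    if not report.setup_is_irreducible:
        return ["not classifiable: remove unnecessary assumptions first"]

    # Step (ii): modification versus interpretation.
    if not report.physically_equivalent_to_sqm:
        return ["modification"]
    labels = ["interpretation"]

    # Step (iii): representation of a known or of a new interpretation.
    known = report.known_equivalent_interpretation
    if known is not None:
        labels.append(f"representation of {known}")
    else:
        labels.append("representation of a new interpretation")
    return labels


example = classify(CModelReport(True, True, "CI"))
assert example == ["interpretation", "representation of CI"]
```

Keeping the three checks as explicit inputs makes it clear that the naming follows mechanically once the equivalence questions have been answered.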
6. Summary
We have proposed a classification scheme for models in the foundations of quantum mechanics.
Its most central element is the distinction between different types of models: calculational,
mathematical, physical and empirical. After distinguishing these different classes of models, we
have defined some of their properties that are discussed most commonly in the foundations of
quantum mechanics, with special attention to those concerning locality and causality. We hope
that the terminology proposed here can help to clarify which problems of quantum mechanics
can and cannot be solved by interpretation.
Appendix
If the region S3 was allowed to intersect with the inside of the past-lightcone of B, then in a
theory which is not temporally deterministic, correlations could be created later which were
not contained on S3. In this case then, local beables at B could provide extra information for
what happens at A though the information got there locally and inside the light-cones. For
illustration, see figure 5b.
References
1. Hance JR, Hossenfelder S. 2022 What does it take to solve the measurement problem? J.
Phys. Commun. 6, 102001. (doi:10.1088/2399-6528/ac96cf)
2. Adlam E. 2023 Do we have any viable solution to the measurement problem? Found. Phys.
53, 44. (doi:10.1007/s10701-023-00686-x)
3. Hossenfelder S. 2023 Quantum confusions, cleared up (or so I hope). arXiv. (doi:10.48550/
arXiv.2309.12299)
4. Frigg R, Nguyen J. 2020 Modelling nature: an opinionated introduction to scientific
representation. Cham: Springer. (doi:10.1007/978-3-030-45153-0)
5. MacFarlane J. 2020 Philosophical logic: a contemporary introduction. New York, NY:
Routledge. (doi:10.4324/9781315185248)
6. Gödel K. 1931 Über formal unentscheidbare Sätze der Principia Mathematica und