Snapshot Derivation
-===================
+-------------------
An MVCC snapshot memorializes a point in the commit sequence. That is, the
effects of any transaction which committed before the snapshot was taken will
which had already ended at the time the snapshot was taken ("xmax"); and
(2) the XIDs of all lower-numbered transactions which had not yet ended at the
time the snapshot was taken ("running XIDs"). From the perspective of a given
-MVCC snapshot, any XID less than the lowest running XID (aka "xmin") or
-greater than or equal to xmax is invisible. Intermediate XIDs are visible if
-they are committed and not among the running XIDs.
+MVCC snapshot, any XID less than the lowest running XID (aka "xmin") is
+visible, while any XID greater than or equal to xmax is invisible.
+Intermediate XIDs are visible if they are committed and not among the running
+XIDs (but see the next paragraph, concerning the handling of subtransaction
+XIDs).
Since a given toplevel transaction can have many subtransactions, and thus
many XIDs, we need some way to bound the size of the running XID list even
and then check the parent XIDs against the snapshot. However, it would be
inefficient to do this in all cases, so we keep track of the highest subxid
that's been removed from the list of running XIDs. A pg_subtrans lookup is
-required only for XIDs which follow xmin but are less than or equal to the
-highest removed subxid.
+required only for XIDs which follow xmin but precede or equal the highest
+removed subxid.
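
The visibility rule and the pg_subtrans lookup condition described above can
be sketched in C as follows. This is only an illustration under stated
assumptions, not the snaparray implementation: the SketchSnapshot struct, its
field names, and both functions are hypothetical; XIDs are compared as plain
integers (wraparound is ignored); and the caller is assumed to supply the
commit status and to have already mapped a subtransaction XID to its parent
where that is required.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical snapshot layout, used for illustration only. */
typedef struct SketchSnapshot
{
    TransactionId xmin;                   /* lowest running XID */
    TransactionId xmax;                   /* XIDs >= xmax are invisible */
    TransactionId highest_removed_subxid; /* 0 if no subxid was removed */
    const TransactionId *running;         /* running XIDs in [xmin, xmax) */
    size_t      nrunning;
} SketchSnapshot;

/*
 * A pg_subtrans lookup is needed only for XIDs that follow xmin and
 * precede or equal the highest removed subxid.
 */
static bool
needs_subtrans_lookup(const SketchSnapshot *snap, TransactionId xid)
{
    return xid > snap->xmin && xid <= snap->highest_removed_subxid;
}

/*
 * Apply the three-way rule from the text.  "committed" is the commit
 * status of xid as determined by the caller; this sketch consults it
 * only for intermediate XIDs.
 */
static bool
xid_visible(const SketchSnapshot *snap, TransactionId xid, bool committed)
{
    size_t      i;

    if (xid < snap->xmin)
        return true;            /* ended before the snapshot was taken */
    if (xid >= snap->xmax)
        return false;           /* began after the snapshot was taken */

    for (i = 0; i < snap->nrunning; i++)
    {
        if (snap->running[i] == xid)
            return false;       /* still running at snapshot time */
    }
    return committed;
}
```

A caller would first test needs_subtrans_lookup(); if it returns true, the XID
would be replaced by its parent before calling xid_visible().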
As an additional special case, transactions which are performing a lazy
VACUUM operation can be excluded from the set of running XIDs, since they
automatically when it is necessary to advance xmax.
Shared Memory Organization
-==========================
+--------------------------
The main shared memory structure used by the snaparray code is a ring buffer,
which acts as a circular message buffer. We maintain three pointers into this
of messages which can be stored in this buffer: snapshot summaries, and newly
completed XIDs.
-Newly completed XIDs are recoreded by simply writing them into the buffer.
+Newly completed XIDs are recorded by simply writing them into the buffer.
A snapshot summary is distinguished by first writing InvalidTransactionId
into the buffer, followed by the remaining data items, all as 4-byte
quantities. The format in full is as follows:
Write access to the ring buffer is serialized by SnapArrayLock; only one
transaction (that has an XID) can end at a time. When ending, a transaction
-first may either write an XID completion message or may choose (if the distance
-between the start pointer and the new stop pointer seems like it's getting too
-large) to instead write a new snapshot summary.
+first must write either an XID completion message or a new snapshot summary.
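
As a rough illustration of the write path, the sketch below appends an XID
completion message to a small circular buffer of 4-byte entries. The
SketchRing struct, its size, and the function names are assumptions made for
this example; the caller is assumed to hold SnapArrayLock; and the layout of a
snapshot summary beyond its leading InvalidTransactionId marker is
deliberately not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)
#define SKETCH_RING_SIZE 8u     /* hypothetical; chosen small for clarity */

/* Hypothetical ring buffer holding 4-byte message entries. */
typedef struct SketchRing
{
    TransactionId slots[SKETCH_RING_SIZE];
    uint64_t    stop;           /* count of entries ever written */
} SketchRing;

/* Store one entry at the stop position, wrapping around the buffer. */
static void
sketch_ring_put(SketchRing *ring, TransactionId value)
{
    ring->slots[ring->stop % SKETCH_RING_SIZE] = value;
    ring->stop++;
}

/* An XID completion message is just the XID itself. */
static void
sketch_write_completion(SketchRing *ring, TransactionId xid)
{
    sketch_ring_put(ring, xid);
}

/* A snapshot summary is recognized by its leading InvalidTransactionId. */
static bool
sketch_is_summary_start(TransactionId entry)
{
    return entry == InvalidTransactionId;
}
```

Because a valid XID is never InvalidTransactionId, a reader can tell the two
message types apart by inspecting the first entry of each message.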
Reads from the ring buffer can proceed in parallel with other reads, and
generally with writes. However, if the ring buffer wraps around before a