walsnd: Don't set waiting_for_ping_response spuriously
author Alvaro Herrera <[email protected]>
Sat, 8 Aug 2020 16:31:55 +0000 (12:31 -0400)
committer Alvaro Herrera <[email protected]>
Sat, 8 Aug 2020 16:31:55 +0000 (12:31 -0400)
commit 55d42c9178306f4f92ac4ed0a924b2e3b914a5b1
tree a17a87a8c811e5483b544207e47354a51f11b82a
parent d06c96185c427e34be8cfe7513e2195842423ea2

Ashutosh Bapat noticed a bug in the logical walsender.  When it needs to
wait for WAL and decides to send a keepalive message to walreceiver to
update the sent-LSN, that keepalive *does not* request a reply from
walreceiver, yet the walsender wrongly sets the flag saying that it is
going to wait for that reply.  As a result, any future would-be sender
of feedback messages ends up not sending one, because they all believe
that a reply is already expected.

With built-in logical replication there's not much harm in this, because
WalReceiverMain will send a ping-back every wal_receiver_timeout/2
anyway; but with other logical replication systems (e.g. pglogical) it
can cause significant pain.

This problem was introduced in commit 41d5f8ad734, which changed the
request-reply argument passed to WalSndKeepalive from true to false
without at the same time removing the line that sets
waiting_for_ping_response.

Just removing that line would be a sufficient fix, but it seems better
to shift the responsibility of setting the flag to WalSndKeepalive
itself instead of requiring each caller to do it; this is clearly less
error-prone.
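
The shape of the fix can be sketched as a small standalone model.  This
is not the code in walsender.c: send_keepalive_message, ping_if_idle,
keepalives_sent and keepalive_demo are hypothetical stand-ins, and the
real function builds and queues a protocol message rather than bumping
a counter.  Only the names WalSndKeepalive and
waiting_for_ping_response come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for the walsender's file-level flag. */
static bool waiting_for_ping_response = false;

/* Hypothetical counter, used only to observe the model's behavior. */
static int keepalives_sent = 0;

/* Hypothetical stub: the real code builds and queues a keepalive message. */
static void
send_keepalive_message(bool request_reply)
{
    (void) request_reply;
    keepalives_sent++;
}

/*
 * Model of the fixed function: the flag is set here, and only when a
 * reply was actually requested, so callers cannot get it wrong.
 */
static void
WalSndKeepalive(bool requestReply)
{
    send_keepalive_message(requestReply);

    if (requestReply)
        waiting_for_ping_response = true;
}

/*
 * Hypothetical caller that wants a reply but skips the request when one
 * is believed to be pending.  Returns true if a request was sent.
 */
static bool
ping_if_idle(void)
{
    if (waiting_for_ping_response)
        return false;

    WalSndKeepalive(true);
    return true;
}

/* Small demonstration of the intended behavior. */
static void
keepalive_demo(void)
{
    waiting_for_ping_response = false;

    /* A keepalive that requests no reply must leave the flag clear ... */
    WalSndKeepalive(false);
    assert(!waiting_for_ping_response);

    /* ... so a later reply request is not suppressed. */
    assert(ping_if_idle());
    printf("keepalives sent: %d\n", keepalives_sent);
}
```

With the pre-fix pattern, where the caller set the flag after calling
WalSndKeepalive(false), the ping_if_idle call in this model would have
returned false and no reply would ever have been requested.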

Author: Álvaro Herrera <[email protected]>
Reported-by: Ashutosh Bapat <[email protected]>
Backpatch: 9.5 and up
Discussion: https://postgr.es/m/20200806225558[email protected]
src/backend/replication/walsender.c