Fix LOCK_TIMEOUT handling in slotsync worker.
author Amit Kapila <[email protected]>
Tue, 9 Dec 2025 07:02:08 +0000 (07:02 +0000)
committer Amit Kapila <[email protected]>
Tue, 9 Dec 2025 07:02:08 +0000 (07:02 +0000)
Previously, the slotsync worker relied on SIGINT for graceful shutdown
during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler
to cancel queries. Since the slotsync worker can lock catalog tables while
parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT
signals and potentially wait indefinitely on locks.

This patch replaces the slotsync worker's SIGINT handler with
StatementCancelHandler to correctly process query-cancel interrupts.
Additionally, the startup process now uses SIGUSR1 to signal the slotsync
worker to stop during promotion. The worker exits after detecting that the
shared memory flag stopSignaled is set.
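
The signal split described above can be sketched in isolation. The following is a minimal single-process illustration, not PostgreSQL code: stop_signaled stands in for the shared memory flag stopSignaled, and install_handlers, request_stop and process_interrupts are hypothetical names invented for this sketch. It assumes a POSIX system. SIGINT only records a cancel request, while SIGUSR1 serves purely as a wake-up, and the stop decision is taken from the flag.

```c
/* Illustrative sketch only; all names here are hypothetical. */
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the shared memory flag stopSignaled. */
static volatile sig_atomic_t stop_signaled = 0;
/* Set by the SIGINT handler: models a pending query-cancel request. */
static volatile sig_atomic_t cancel_pending = 0;
/* Set by the SIGUSR1 handler: models a generic wake-up. */
static volatile sig_atomic_t wakeup_pending = 0;

static void on_sigint(int signo)  { (void) signo; cancel_pending = 1; }
static void on_sigusr1(int signo) { (void) signo; wakeup_pending = 1; }

/* Worker side: SIGINT means "cancel", SIGUSR1 means "look at the flags". */
static void install_handlers(void)
{
    signal(SIGINT, on_sigint);
    signal(SIGUSR1, on_sigusr1);
}

/* Requester side: publish the stop flag first, then wake the worker. */
static void request_stop(pid_t worker_pid)
{
    assert(worker_pid > 0);
    stop_signaled = 1;
    kill(worker_pid, SIGUSR1);
}

/*
 * Worker side: one interrupt check per loop iteration.
 * A cancel request is consumed without stopping the worker;
 * only the stop flag makes this return true.
 */
static bool process_interrupts(void)
{
    if (cancel_pending)
        cancel_pending = 0;     /* abandon the current operation, keep running */
    wakeup_pending = 0;
    return stop_signaled != 0;
}
```

In this sketch a worker loop would call install_handlers() once and then exit as soon as process_interrupts() returns true. Because SIGINT no longer carries the stop request, a cancel raised by a timeout and a promotion-time stop cannot be confused with each other.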

Author: Hou Zhijie <[email protected]>
Reviewed-by: shveta malik <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com

src/backend/replication/logical/slotsync.c

index 051b1c866b584e831234a7b5cc569fb9a07a315d..27e262ecbf22087b2bbe907b3244a1feca39825c 100644
@@ -1156,10 +1156,10 @@ ProcessSlotSyncInterrupts(WalReceiverConn *wrconn)
 {
    CHECK_FOR_INTERRUPTS();
 
-   if (ShutdownRequestPending)
+   if (SlotSyncCtx->stopSignaled)
    {
        ereport(LOG,
-               errmsg("replication slot synchronization worker is shutting down on receiving SIGINT"));
+               errmsg("replication slot synchronization worker is shutting down because promotion is triggered"));
 
        proc_exit(0);
    }
@@ -1390,7 +1390,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 
    /* Setup signal handling */
    pqsignal(SIGHUP, SignalHandlerForConfigReload);
-   pqsignal(SIGINT, SignalHandlerForShutdownRequest);
+   pqsignal(SIGINT, StatementCancelHandler);
    pqsignal(SIGTERM, die);
    pqsignal(SIGFPE, FloatExceptionHandler);
    pqsignal(SIGUSR1, procsignal_sigusr1_handler);
@@ -1495,7 +1495,8 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 
    /*
     * The slot sync worker can't get here because it will only stop when it
-    * receives a SIGINT from the startup process, or when there is an error.
+    * receives a stop request from the startup process, or when there is an
+    * error.
     */
    Assert(false);
 }
@@ -1582,8 +1583,12 @@ ShutDownSlotSync(void)
 
    SpinLockRelease(&SlotSyncCtx->mutex);
 
+   /*
+    * Signal slotsync worker if it was still running. The worker will stop
+    * upon detecting that the stopSignaled flag is set to true.
+    */
    if (worker_pid != InvalidPid)
-       kill(worker_pid, SIGINT);
+       kill(worker_pid, SIGUSR1);
 
    /* Wait for slot sync to end */
    for (;;)