Active Data Guard 21c Master Technical
Ludovico Caldara
Senior Principal Product Manager
Oracle Database High Availability (HA), Scalability and Maximum Availability Architecture (MAA) Team
@ludodba
http://www.linkedin.com/in/ludovicocaldara
www.ludovicocaldara.net
Oracle (Active) Data Guard & MAA
$350K: average cost of downtime per hour
$10M: average cost of unplanned data center outage or disaster
87 hours: average amount of downtime per year
91%: percentage of companies that have experienced an unplanned data center outage in the last 24 months
Continuous availability
Data protection
Active replication
Production site Replicated site
Scale out
All tiers are available both on-premises and in the cloud. However, platinum currently must be
configured manually, while bronze to gold are mostly covered by cloud tool automation,
depending on the desired RTO (for example, FSFO and multiple standby databases
must still be configured manually).
Primary Site, Secondary Site
• Basic DR (included with DB EE)
- License primary and secondary sites
• Active-passive
- Standby is used only for failovers
• Automatic failover to Standby site
Sync or Async Replication via in-memory Redo
https://www.oracle.com/database/technologies/high-availability/dataguard-activedataguard-demos.html
9 Copyright © 2022, Oracle and/or its affiliates
Data Guard
Capabilities Included with Oracle Database Enterprise Edition (EE)
[Diagram: DATAFILES, LOGFILES and CONTROLFILES at both sites; SYNC or ASYNC block replication to the Standby Site.]
[Diagram, repeated for each redo transport variant: on the primary, LGWR writes the SGA redo buffer to the online redo logs while an NSS (SYNC) or TT (ASYNC) process ships the redo over Oracle Net; on the standby site, RFS writes it to the standby redo logs and MRP applies it to the standby database. The SYNC variants also show COMMIT and COMMIT ACK.]
VALUE DATUM_TIME
------------ -------------------
+00 00:00:00 07/11/2022 08:28:46
VALUE DATUM_TIME
------------ -------------------
+01 13:50:54 07/12/2022 21:48:10
MAX(LAST_TIME)
-------------------
07/11/2022 08:28:46
How To Calculate The Required Network Bandwidth Transfer Of Redo In Data Guard (Doc ID 736755.1)
https://support.oracle.com/rs?type=doc&id=736755.1
Assessing and Tuning Network Performance for Data Guard and RMAN (Doc ID 2064368.1)
https://support.oracle.com/rs?type=doc&id=2064368.1
Possible split-brain
Left:
1. The standby database becomes the new primary
2. The former primary shuts down after FastStartFailoverThreshold
3. The former primary is reinstated automatically
Right:
1. The procedure shuts down the primary first
2. The standby database becomes the new primary
3. The former primary is not reinstated
Primary Site, Secondary Site
• The standby is converted to Snapshot Standby
• Standby open read write
HR = (DESCRIPTION =
(CONNECT_TIMEOUT=120)(RETRY_COUNT=50)(RETRY_DELAY=3)
(TRANSPORT_CONNECT_TIMEOUT=3)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST=cluster1-scan)(PORT=1521)))
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST=cluster2-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME = HR.oracle.com)))
Client draining/failover is a crucial part of high availability for applications connecting to the database.
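As a rough sketch of what the retry settings in the HR connect descriptor above imply, the snippet below extracts RETRY_COUNT and RETRY_DELAY and multiplies them to estimate how long a client keeps retrying before it gives up. The helper names `int_param` and `retry_window_seconds` are hypothetical, and the estimate deliberately ignores the time spent inside each connection attempt.

```python
import re

# Connect descriptor copied from the HR alias above (whitespace removed).
DESCRIPTOR = (
    "(DESCRIPTION=(CONNECT_TIMEOUT=120)(RETRY_COUNT=50)(RETRY_DELAY=3)"
    "(TRANSPORT_CONNECT_TIMEOUT=3)"
    "(ADDRESS_LIST=(LOAD_BALANCE=on)"
    "(ADDRESS=(PROTOCOL=TCP)(HOST=cluster1-scan)(PORT=1521)))"
    "(ADDRESS_LIST=(LOAD_BALANCE=on)"
    "(ADDRESS=(PROTOCOL=TCP)(HOST=cluster2-scan)(PORT=1521)))"
    "(CONNECT_DATA=(SERVICE_NAME=HR.oracle.com)))"
)


def int_param(descriptor: str, name: str) -> int:
    """Return the integer value of a (NAME=value) pair in the descriptor."""
    pattern = r"\(\s*" + re.escape(name) + r"\s*=\s*(\d+)\s*\)"
    match = re.search(pattern, descriptor, re.IGNORECASE)
    if match is None:
        raise KeyError(name)
    return int(match.group(1))


def retry_window_seconds(descriptor: str) -> int:
    """Hypothetical helper: approximate retry time as RETRY_COUNT * RETRY_DELAY."""
    return int_param(descriptor, "RETRY_COUNT") * int_param(descriptor, "RETRY_DELAY")


print(retry_window_seconds(DESCRIPTOR))  # 50 retries * 3 seconds between retries
```

With these values the client keeps retrying for roughly 150 seconds, a window that generally needs to be longer than the expected role transition time.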
[Diagram: (1) Application, (2) failure; components: ONS, Replay Driver, SERVICE, Transaction Guard.]
Checklist for Continuous Service for MAA Solutions
https://www.oracle.com/technetwork/database/clustering/checklist-ac-6676160.pdf
Fast Application Notification
Session Draining for planned maintenance
[Diagram sequence, service CRM_SVC with ONS on every node: register, connect; then, with the Replay Driver: in-flight transaction, check the transaction status, replay if necessary.]
• Introduced in 18c for JDBC thin, 19c for OCI (Oracle Call Interface)
• Records session and transaction state server-side
• No application change
• Works without connection pools (although they are still recommended)
• Replayable transactions are replayed
• Non-replayable transactions raise exception
• Good driver coverage but check the doc!
• Side effects are never replayed
Feature | Best for | Since Version | Application Changes | Requires Connection Pool | Replay Side Effects | JDBC/OCI
FAN | Planned Maintenance | 10g | Catch FAN events (or use UCP) | No, but recommended (FCF) | N/A | Both
AC | Unplanned Maintenance | 12c | Use explicit boundaries (or use UCP) | Yes | Yes (Choose) | Both
TAC | Unplanned Maintenance (Recommended) | 19c | No | No, but recommended | Never | Both
OCI
JDBC
Primary Secondary
Site Site
?
-- THE LAST TIME THE STANDBY HEARD FROM THE PRIMARY (1 second tolerance)
SQL> select datum_time, (sysdate-to_date(datum_time,'MM/DD/YYYY HH24:MI:SS'))*86400 secs_ago
2> from v$dataguard_stats where name='transport lag';
DATUM_TIME SECS_AGO
------------------------------ ----------
07/11/2022 08:28:46 90361
The columns might be null in some cases.
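The arithmetic behind SECS_AGO can be reproduced outside the database. The sketch below parses a DATUM_TIME string in the same MM/DD/YYYY HH24:MI:SS format and returns the elapsed seconds, returning None when the column is null. The function name `seconds_since` is hypothetical, and the "current time" in the example is an assumption back-computed from the output above, not a value shown in the slides.

```python
from datetime import datetime
from typing import Optional

# strptime equivalent of the Oracle mask 'MM/DD/YYYY HH24:MI:SS'
DATUM_FORMAT = "%m/%d/%Y %H:%M:%S"


def seconds_since(datum_time: Optional[str], now: datetime) -> Optional[int]:
    """Seconds elapsed between datum_time and now; None if datum_time is null."""
    if datum_time is None:
        return None
    last_heard = datetime.strptime(datum_time, DATUM_FORMAT)
    return int((now - last_heard).total_seconds())


# Assumed sysdate on the standby when the query ran (not shown in the slides).
assumed_now = datetime(2022, 7, 12, 9, 34, 47)
print(seconds_since("07/11/2022 08:28:46", assumed_now))
```

Handling the null case explicitly matters because a monitoring script that treats a missing DATUM_TIME as zero lag would hide exactly the situation it is meant to detect.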
Primary Secondary
Site Site
? ?
Standby
Redo Logs
-- THE TIMESTAMP OF THE LAST REDO ENTRY RECEIVED FROM THE PRIMARY
SQL> alter session set nls_date_format='MM/DD/YYYY HH24:MI:SS';
SQL> select coalesce(max(s.last_time), max(a.next_time)) as last_redo_from_prim,
2> (sysdate-coalesce(max(s.last_time), max(a.next_time)))*86400 as secs_ago
3> from v$standby_log s , v$archived_log a ;
LAST_REDO_FROM_PRIM SECS_AGO
------------------- ----------
07/11/2022 08:28:46 91061
? Is it a reliable way to calculate data loss?
Primary Secondary
Site Site
?
DID IT CRASH? DID IT STALL? DID IT KEEP COMMITTING?
WHAT IS THE PRIMARY DOING?
• Primary still committing
• DATA LOSS and possible split brain
• Primary crashed
• Maybe no data loss
Oracle Data Guard Observer acts as a quorum
Automatic failover when the Primary Database is unavailable
21c
Execute custom actions before and after the automatic failover occurs
$ cat $DG_ADMIN/config_ConfigName/callout/fsfocallout.ora
3 fsfo_precallout.sh
Observer
Primary DB Standby DB
✘
Oracle Data Guard Observer Placement
[Diagrams: Observer, Primary DB and Standby DB, one failure scenario per slide (✘ marks the failed components). Outcomes shown for each placement:]
Observer at the primary site: OK
• Primary keeps writing unprotected
• Manual failover required
• Configuration keeps working unobserved
Observer at the standby site: BAD!
• Automatic failover!
• Primary keeps writing unprotected
• Automatic failover! Primary shuts down to prevent split-brain!
• No database is available for writes! Primary shuts down to prevent split-brain!
• Configuration keeps working unobserved
Observer at an external site: BEST
• Primary keeps writing unprotected
• Configuration keeps working unobserved
Observer Site 1
Primary Inactive Observer Secondary
Site Site
Observer Site 3
Inactive Observer
* Four starting with 21c
Oracle Data Guard Observer High Availability
Tolerate observer site failure but avoid observer on the standby site
Primary DB Standby DB
Optimal configuration with two sites
Primary DB Standby DB
Primary DB Standby DB
Observer
DGMGRL> edit database NASHUA set property
FastStartFailoverTarget='BOSTON,NEWYORK';
Observer
Primary FSFO
DB Candidate
FSFO Remote
Target bystander
Primary Isolated
• The observer can still contact the FSFO target.
What happens:
1. The primary is STALLED.
2. The observer initiates the failover to FSFO Target.
3. The FSFO Target becomes primary.
4. The new primary and the observer agree to a new FSFO Target.
5. The former Primary DB will require a reinstate.
[Diagram labels: Observer; Primary DB (Reinstate needed); FSFO Target (New Primary); FSFO Candidate (New Target); Remote bystander.]
FSFO Target Isolated
• The observer can still contact the primary.
What happens:
1. The primary goes temporarily UNSYNCHRONIZED (no FSFO, unless Max Protection).
2. The primary and the observer agree to a new FSFO Target.
3. As soon as the new FSFO target is ready, FSFO is possible again.
[Diagram labels: Observer; Primary DB (Unsynchronized); FSFO Target; FSFO Candidate (New Target); Remote bystander.]
Primary and FSFO Target Isolated
• The observer can contact neither the primary nor the FSFO target.
What happens:
1. The observer cannot tell if the network is unreachable, or the whole site is down. The primary might still write to the standby (valid LAD destination).
2. The primary and FSFO targets keep working, the configuration is UNOBSERVED.
3. The observer cannot initiate a failover, as it would lead to split-brain and data loss.
[Diagram labels: Observer; Primary DB (Primary Unobserved); FSFO Target; FSFO Candidate; Remote bystander.]
Primary and Observer Isolated
• No FSFO target or candidates can be contacted.
What happens:
1. The primary keeps working without protection (unless Max Protection).
[Diagram labels: Observer; Primary DB (Unsynchronized); FSFO Target; FSFO Candidate; Remote bystander.]
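The four isolation scenarios above can be condensed into a small decision function. This is only a sketch of the slides' outcomes, not the broker's actual algorithm: the function name, the argument names and the returned labels are hypothetical, the "NORMAL" case is added for completeness, and the primary is assumed to be unreachable from the FSFO target whenever the observer cannot reach it.

```python
def fsfo_outcome(observer_sees_primary: bool,
                 observer_sees_target: bool,
                 candidate_available: bool = True) -> str:
    """Map observer reachability to the outcome described in the scenarios."""
    if observer_sees_primary and observer_sees_target:
        # Not one of the four scenarios; nothing is isolated.
        return "NORMAL"
    if not observer_sees_primary and observer_sees_target:
        # Primary isolated: the observer fails over to the FSFO target,
        # and the former primary needs a reinstate afterwards.
        return "FAILOVER"
    if not observer_sees_primary and not observer_sees_target:
        # The observer cannot tell a network problem from a site failure,
        # so it does not initiate a failover.
        return "UNOBSERVED"
    # The observer still sees the primary but not the target.
    if candidate_available:
        # FSFO target isolated: a candidate becomes the new target.
        return "NEW_TARGET"
    # Primary and observer isolated together: no target or candidate left.
    return "UNPROTECTED"


for case in [(False, True, True), (True, False, True),
             (False, False, True), (True, False, False)]:
    print(case, "->", fsfo_outcome(*case))
```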
MAXIMUM PERFORMANCE: Never wait. Data loss.
MAXIMUM AVAILABILITY: Wait a bit. No/minimal data loss.
MAXIMUM PROTECTION: Waits until the standby is available again (SYNC only). No data loss: no transactions are lost, ever.
FastStartFailoverLagLimit
At t0 + ping time, the primary cannot contact the standby. The Primary keeps committing, the Standby lag increases.
Status: TARGET UNDER LAG LIMIT
OVER LAG: the primary temporarily stalls (~3 seconds) until it gets permission from the observer to continue.
Status: STALLED
After the observer pings the primary and gives permission to continue, the primary resumes the commit activity.
Status: TARGET OVER LAG LIMIT
The primary reaches FastStartFailoverLagLimit. It cannot obtain permission from the observer to continue. It stalls or shuts down depending on FastStartFailoverPmyShutdown.
Status: STALLED
The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover.
Status: REINSTATE REQUIRED
[Chart: SCN over TIME for Primary and Standby, starting at t0 (PRIMARY disconnected); labels: No Failover, DATA LOSS, Potential LAG.]
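The primary's behavior around FastStartFailoverLagLimit, as described in the steps above, can be sketched as a small function. The function name, its arguments and the returned labels are hypothetical; in particular, mapping FastStartFailoverPmyShutdown=TRUE to a shutdown is an assumption of this sketch, and the timing (the ~3 second stall, the threshold timer) is not modeled.

```python
def primary_reaction(over_lag_limit: bool,
                     observer_permission: bool,
                     pmy_shutdown: bool) -> str:
    """Return what the primary does, following the steps on the slide."""
    if not over_lag_limit:
        # The standby lag is still within FastStartFailoverLagLimit.
        return "COMMITTING (TARGET UNDER LAG LIMIT)"
    if observer_permission:
        # After a short stall the observer allows the primary to continue.
        return "COMMITTING (TARGET OVER LAG LIMIT)"
    # No permission from the observer: stall or shut down.
    # Assumption: pmy_shutdown mirrors FastStartFailoverPmyShutdown=TRUE.
    return "SHUTDOWN" if pmy_shutdown else "STALLED"


print(primary_reaction(over_lag_limit=True, observer_permission=False, pmy_shutdown=False))
```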
FastStartFailoverLagLimit
FastStartFailoverThreshold < FastStartFailoverLagLimit
1. ASYNC Transport. The Standby has a residual lag.
Status: TARGET UNDER LAG LIMIT
2. At t0 + ping time, the observer cannot contact the primary and starts a timer for FastStartFailoverThreshold seconds. The Primary keeps committing, the Standby lag increases.
Status: TARGET UNDER LAG LIMIT
3. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover.
Primary status: TARGET UNDER LAG LIMIT
Standby status: REINSTATE REQUIRED
4. The Primary keeps committing until it reaches
[Chart: SCN over TIME starting at t0 (STANDBY disconnected); labels: FastStartFailoverThreshold, SPLIT BRAIN, DATA LOSS, Potential LAG.]
FastStartFailoverThreshold
At t0 + ping time, the observer cannot contact the primary and starts a timer for FastStartFailoverThreshold seconds. The Primary stalls and keeps retrying for NetTimeout seconds.
Status: STALLED
NetTimeout FastStartFailoverLagLimit
≤ FastStartFailoverThreshold
FastStartFailoverLagLimit
1. SYNC Transport. The Standby is synched.
2. At t0 + ping time, the primary cannot contact the standby. The primary stalls and keeps retrying for NetTimeout seconds. The observer starts the timer.
3. At NetTimeout seconds, the primary switches to ASYNC redo transport without requiring permission from the observer, because FastStartFailoverLagLimit is set. The observer does not acknowledge the unsynched status.
[Chart: SCN over TIME for Primary and Standby, starting at t0 (STANDBY disconnected); labels: FastStartFailoverThreshold, STALL, DATA LOSS, SPLIT BRAIN.]
[Diagram: Primary Instances 1 to 3 at the Primary Site each ship their redo thread (Thread 1 to 3) through NSS/TT to an RFS process that writes to standby redo logs (SRL); a single MRP on Standby Instance 1 applies the redo.]
[Chart: Standby Apply Rate (MB/sec) under an OLTP Workload; axis ticks from 0 to 800.]
[Diagram: the same transport path with multi-instance apply: a Coordinator MRP on Standby Instance 1 and an additional MRP on Instance 3.]
[Chart: axis ticks from 5000 to 7000.]
• IO bottlenecks and database wait events that affect SIRA will affect MIRA as well!
• We recommend Oracle Database 19.13 or higher (it includes critical fixes for MIRA)
• A CPU-bound apply coordinator or worker is the best indicator that MIRA is needed
From SQL*Plus:
SQL> alter database recover managed standby database disconnect from session instances <#|ALL>;
Without PMEM, 19.13 and higher: set this dynamic parameter on all instances:
"_cache_fusion_pipelined_updates_enable"=FALSE (*)
(*) MIRA can recover only redo generated with "_cache_fusion_pipelined_updates_enable" set to FALSE
• If "parallel recovery change buffer free" is among the top wait events:
- Increase _change_vector_buffers to 2 or 4.
Primary Site, Secondary Site
• Advanced Disaster Recovery
• Active-active*
- Queries, reports, backups
- Occasional updates (19c)
- Assurance of knowing system is operational
• Automatic block repair
[Diagram labels: DML Redirection; Zero data loss at any distance; Rolling Database Upgrades.]
https://www.oracle.com/database/technologies/high-availability/dataguard-activedataguard-demos.html
ADG
Active Data Guard
Option of Oracle Database for Advanced Capabilities and Protection
Activation
Primary Primary
Only and Standby
ADG
Real-Time Query
Not just Selects for your Application Workloads!
Primary Standby
Primary Standby
BCT BCT
• Disaster Recovery
• Query & Reporting Offload
• DML Redirection
• Snapshot Standby for tests
• Rolling Maintenance
• Rolling Upgrade
• Migration to new Hardware
• Source for thin clones
• Backup Offload
• GoldenGate Extract Offload
• Recovery Appliance
• Zero data loss at any distance
[Diagram labels: Primary, BACKUP, Snapshot Standby, DML Redirect, FAR SYNC, Thin Clones.]
ADG
Real-Time Cascade Standby
Offload multiple redo transports to a first-level standby
BOSTON
RedoRoutes=
(LOCAL : ( NASHUA SYNC PRIORITY=1, NEWYORK ASYNC PRIORITY=8, NEWARK ASYNC PRIORITY=8 ))
(NASHUA : NEWYORK ASYNC, NEWARK ASYNC )
NASHUA
RedoRoutes=
(LOCAL : ( BOSTON SYNC PRIORITY=1, NEWYORK ASYNC PRIORITY=8, NEWARK ASYNC PRIORITY=8 ))
(BOSTON : NEWYORK ASYNC, NEWARK ASYNC )
[Diagram labels: ALTERNATE, NEWYORK, NEWARK.]
• Explicit “ASYNC” in the cascading member means “Real-Time Cascade”. Such a configuration requires Active Data Guard.
• If not specified, the redo is shipped at log switch.
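To illustrate how the PRIORITY values in the LOCAL rule above pick a destination, here is a minimal sketch. It assumes that a lower PRIORITY number is preferred and that only one member of the group receives redo at a time; breaking ties by list order is an assumption of this sketch, and `active_destination` is a hypothetical helper, not a broker API.

```python
from typing import List, Optional, Set, Tuple

# (destination, transport mode, priority) from the LOCAL rule on BOSTON.
LOCAL_GROUP: List[Tuple[str, str, int]] = [
    ("NASHUA", "SYNC", 1),
    ("NEWYORK", "ASYNC", 8),
    ("NEWARK", "ASYNC", 8),
]


def active_destination(group: List[Tuple[str, str, int]],
                       reachable: Set[str]) -> Optional[Tuple[str, str]]:
    """Pick the reachable member with the lowest PRIORITY number."""
    candidates = [member for member in group if member[0] in reachable]
    if not candidates:
        return None
    # min() keeps the first of several equal priorities (list order).
    name, mode, _priority = min(candidates, key=lambda member: member[2])
    return name, mode


print(active_destination(LOCAL_GROUP, {"NASHUA", "NEWYORK", "NEWARK"}))
print(active_destination(LOCAL_GROUP, {"NEWYORK", "NEWARK"}))
```

In the first call NASHUA receives the redo synchronously; in the second, with NASHUA unreachable, one of the PRIORITY=8 alternates takes over with ASYNC transport.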
• RedoRoutes is an efficient way to route the redo properly in any situation.
• The Broker takes care of all the complex LOG_ARCHIVE_DEST_n modifications.
ASYNC
PRIMARY STANDBY
FAR ASYNC
SYNC Optional RedoCompression (*)
SYNC STANDBY
PRIMARY
SYNC
• Supports FSFO in MaxAvailability
• Supports FSFO in MaxPerformance (new in 21c)
Primary Site Standby Nearby Site
* The Far Sync nodes do not require any license, provided that all the other nodes running databases in the configuration are
licensed with Enterprise Edition and Active Data Guard
BOSTON
RedoRoutes=
(LOCAL : ( FS1 SYNC PRIORITY=1, FS2 SYNC PRIORITY=1, LONDON ASYNC PRIORITY=2 ))
FS1
FAR RedoRoutes=
ALTERNATE
P2 In the example: Prepare two additional Far Sync instances for LONDON!
The Data Guard broker supports Far Sync instance creation starting with 21c.
Benefits
• Increased performance for existing Sync configurations
• Increased protection for existing Async configurations
• Zero Data Loss (Max Availability) across distant regions
Downsides
• Additional server(s) or VM(s) and components
• Local Far Sync instance might not prevent data loss in case of full site failure
FSFO and FAR SYNC Maximum Performance Maximum Availability Maximum Protection
ASYNC ✓ (21c+) ✘ ✘
FAST SYNC ✘ ✓ ✘
SYNC ✘ ✓ ✘
FSFO without FAR SYNC Maximum Performance Maximum Availability Maximum Protection
ASYNC ✓ ✘ ✘
FAST SYNC ✘ ✓ ✘
SYNC ✘ ✓ ✓
FAR SYNC without FSFO Maximum Performance Maximum Availability Maximum Protection
ASYNC ✓ ✘ ✘
FAST SYNC ✘ ✓ ✘
SYNC ✘ ✓ ✘
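The two FSFO tables above can be encoded as a lookup, which is handy when validating a planned configuration. This is only a transcription of the checkmarks shown on the slide; the function name, the argument names and the use of an integer release number are assumptions of the sketch.

```python
# (transport, protection mode) pairs marked with a checkmark in each table.
FSFO_WITH_FAR_SYNC = {
    ("ASYNC", "Maximum Performance"),      # 21c and later only
    ("FAST SYNC", "Maximum Availability"),
    ("SYNC", "Maximum Availability"),
}
FSFO_WITHOUT_FAR_SYNC = {
    ("ASYNC", "Maximum Performance"),
    ("FAST SYNC", "Maximum Availability"),
    ("SYNC", "Maximum Availability"),
    ("SYNC", "Maximum Protection"),
}


def fsfo_supported(transport: str, mode: str, far_sync: bool, release: int = 21) -> bool:
    """Return True if the tables mark this FSFO combination as supported."""
    key = (transport, mode)
    if not far_sync:
        return key in FSFO_WITHOUT_FAR_SYNC
    if key == ("ASYNC", "Maximum Performance"):
        # The table flags this combination as available from 21c.
        return release >= 21
    return key in FSFO_WITH_FAR_SYNC


print(fsfo_supported("ASYNC", "Maximum Performance", far_sync=True, release=19))
print(fsfo_supported("SYNC", "Maximum Protection", far_sync=False))
```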
https://docs.oracle.com/en/database/oracle/oracle-database/21/dgbkr/using-data-guard-broker-to-manage-
switchovers-failovers.html#GUID-7423C774-27DF-49F9-BB43-7D547BCE7762
Oracle Active Data Guard
Rolling Maintenance and Upgrades
THEN SWITCHOVER
PRIMARY TRANSIENT
LOGICAL STANDBY
• Use a transient logical standby database to upgrade with very little downtime.
• The downtime can be as short as the time it takes to perform a switchover.
-- check DBA_ROLLING_UNSUPPORTED for incompatible data types
-- initialize the plan and set the future primary
DBMS_ROLLING.INIT_PLAN(future_primary=>'DB3');
DBMS_ROLLING.SET_PARAMETER('DB4','MEMBER','LEADING');
[Diagram: Trailing Group with DB1 and DB2, including the Trailing Group Master (TGM); Leading Group with DB3 and DB4, including the Leading Group Standby (LGS); REDO flows between the databases.]
ACTIVE_SESSIONS_TIMEOUT MEMBER
ACTIVE_SESSIONS_WAIT READY_LGM_LAG_TIME
BACKUP_CONTROLFILE READY_LGM_LAG_TIMEOUT
DGBROKER READY_LGM_LAG_WAIT
DICTIONARY_LOAD_TIMEOUT SWITCH_LGM_LAG_TIME
DICTIONARY_LOAD_WAIT SWITCH_LGM_LAG_TIMEOUT
DICTIONARY_PLS_WAIT_INIT SWITCH_LGM_LAG_WAIT
DICTIONARY_PLS_WAIT_TIMEOUT SWITCH_LGS_LAG_TIME
EVENT_RECORDS SWITCH_LGS_LAG_TIMEOUT
FAILOVER SWITCH_LGS_LAG_WAIT
GRP_PREFIX UPDATED_LGS_TIMEOUT
IGNORE_BUILD_WARNINGS UPDATED_LGS_WAIT
IGNORE_LAST_ERROR UPDATED_TGS_TIMEOUT
LAD_ENABLED_TIMEOUT UPDATED_TGS_WAIT
LOG_LEVEL
Example:
-- Wait for the SQL Apply Lag to go below 1 minute before initiating the switchover
exec DBMS_ROLLING.SET_PARAMETER('SWITCH_LGM_LAG_WAIT', '1');
exec DBMS_ROLLING.SET_PARAMETER('SWITCH_LGM_LAG_TIME', '60');
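The intent of the two parameters in the example above can be sketched as a simple gate. This mirrors only the comment in the example (wait until the SQL Apply lag is below the configured time); the function name is hypothetical, and treating the boundary as a strict "below" is an assumption, not documented DBMS_ROLLING behavior.

```python
from typing import Dict

# Parameter values taken from the example above.
PLAN_PARAMETERS: Dict[str, str] = {
    "SWITCH_LGM_LAG_WAIT": "1",   # wait for the apply lag before switching over
    "SWITCH_LGM_LAG_TIME": "60",  # lag limit in seconds (1 minute)
}


def switchover_may_start(apply_lag_seconds: float, params: Dict[str, str]) -> bool:
    """Return True when the apply lag no longer blocks the switchover."""
    if params["SWITCH_LGM_LAG_WAIT"] != "1":
        # Waiting is disabled: the lag does not gate the switchover.
        return True
    # Assumption: "below 1 minute" is a strict comparison.
    return apply_lag_seconds < int(params["SWITCH_LGM_LAG_TIME"])


print(switchover_may_start(45, PLAN_PARAMETERS))
print(switchover_may_start(120, PLAN_PARAMETERS))
```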
[Diagram sequence: DB1, DB2, DB3 and DB4 at each stage of the rolling plan; labels: REDO, GRP, SQL, Logstdby, +1.]
• Creates the Guaranteed Restore Point (GRP)
• Starts SQL Apply
• With a configuration composed of 4 databases, the LGM and TGM are still protected by a physical standby
DB3
Fast-Start Failover: DISABLED
seconds ago)
...
Database Warning(s):
ORA-16866: database converted to transient logical standby database for rolling database maintenance
DB3
ORA-16129: unsupported DML encountered
22-NOV-21 06.41.13 truncate table wri$_adv_addm_pdbs
ORA-16247: DDL skipped on internal schema
• Use it for any major maintenance that requires longer downtimes (change of physical layout, structure changes, offline operations)
• Depending on the source version and HA configuration, the old connections get FAN notifications and drain automatically
• New connections go to the new primary.
Application downtime is minimal.
• Start the Trailing Group members with the new binaries (manual)
• Flashes back the Trailing Group Master and Standby to the GRP
• Converts the Trailing Group Master to a physical standby
• Starts redo apply and catches up with the primary
• Drops the guaranteed restore points and logical standby metadata
-- destroy the plan to clean up everything
DBMS_ROLLING.DESTROY_PLAN()
DBA_ROLLING_DATABASES
Build
DBA_ROLLING_PLAN Verify the plan before and during the execution
Monitor DBA_ROLLING_STATISTICS
DBA_ROLLING_STATUS
Do not create the logical standby on the same server as the primary database
For optimal performance all tables should have primary keys or unique keys
12.1 • First version of DBMS_ROLLING for upgrades from 12.1 to higher versions
SOURCE VERSION
Automated Database Upgrades using Oracle Active Data Guard and DBMS_ROLLING
https://www.oracle.com/technetwork/database/availability/database-upgrade-dbms-rolling-4126957.pdf
MOS Notes:
• Transient Rolling Upgrade Using DBMS_ROLLING - Beginners Guide
• Rolling upgrade using DBMS_ROLLING - Complete Reference (Doc ID 2086512.1)
• MAA Whitepaper: SQL Apply Best Practices (Doc ID 1672310.1)
• Step by Step How to Do Swithcover/Failover on Logical Standby Environment (Doc ID 2535950.1)
• How To Skip A Complete Schema From Application on Logical Standby Database (Doc ID 741325.1)
• How to monitor the progress of the logical standby (Doc ID 1296954.1)
• How To Reduce The Performance Impact Of LogMiner Usage On A Production Database (Doc ID 1629300.1)
• Handling ORA-1403 ora-12801 on logical standby apply (Doc ID 1178284.1)
• Troubleshooting Example - Rolling Upgrade using DBMS_ROLLING (Doc ID 2535940.1)
• DBMS Rolling Upgrade Switchover Fails with ORA-45427: Logical Standby Redo Apply Process Was Not Running (Doc ID
2696017.1)
• SRDC - Collect Logical Standby Database Information (Doc ID 1910065.1)
• MRP fails with ORA-19906 after Flashback of Transient Logical Standby used for Rolling Upgrade (Doc ID 2069325.1)
• What Causes High Redo When Supplemental Logging is Enabled (Doc ID 1349037.1)
• Oracle Data Guard Multi-Instance Redo Apply Works with the In-Memory Column Store
READ ONLY
recurring queries and reduces resource usage (CPU, I/O)
RESULT
CACHE
Primary Standby
Database Database
READ ONLY
recurring queries and reduces resource usage (CPU, I/O)
Standby Primary
Database Database
(old primary) (old standby) PRESERVED
RESULT CACHE
$DG_ADMIN/
admin/    Shared across multiple configurations (observer.ora)
log/      observer_hostname.log
dat/      fsfo_hostname.dat
callout/  fsfocallout.ora & callout files
21c
Execute custom actions before and after the automatic failover occurs
$ cat $DG_ADMIN/config_ConfigName/callout/fsfocallout.ora
3 fsfo_postcallout.sh
Other issues:
FastStartFailoverThreshold may be too low for RAC databases.
FAR
SYNC
PRIMARY
[Charts: SCN over TIME for Primary and Standby from t0, comparing "Default behavior when primary goes ASYNC" with "Optional behavior since 21c"; labels: NetTimeout, FastStartFailoverLagLimit, FastStartFailoverThreshold, STALL, DATA LOSS, Potential LAG, No Failover.]
PARAMETERS
DB_FILES = 1024
LOG_BUFFER = 256M
DB_BLOCK_CHECKSUM = TYPICAL
DB_LOST_WRITE_PROTECT = TYPICAL
DB_FLASHBACK_RETENTION_TARGET = 120
PARALLEL_THREADS_PER_CPU = 1
STANDBY_FILE_MANAGEMENT = AUTO
DG_BROKER_START = TRUE
Data Guard Broker
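A quick way to use the parameter list above is to diff it against the settings of an instance. The sketch below does that with plain dictionaries; the `current` values are made up for illustration, the function name is hypothetical, and in practice the current values would come from a query against the instance parameters.

```python
from typing import Dict, Optional, Tuple

# Values copied from the PARAMETERS list above.
LISTED: Dict[str, str] = {
    "DB_FILES": "1024",
    "LOG_BUFFER": "256M",
    "DB_BLOCK_CHECKSUM": "TYPICAL",
    "DB_LOST_WRITE_PROTECT": "TYPICAL",
    "DB_FLASHBACK_RETENTION_TARGET": "120",
    "PARALLEL_THREADS_PER_CPU": "1",
    "STANDBY_FILE_MANAGEMENT": "AUTO",
    "DG_BROKER_START": "TRUE",
}


def differences(current: Dict[str, str]) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {parameter: (current value, listed value)} for every mismatch."""
    return {
        name: (current.get(name), wanted)
        for name, wanted in LISTED.items()
        if str(current.get(name, "")).upper() != wanted
    }


# Hypothetical instance settings, for illustration only.
current = dict(LISTED, STANDBY_FILE_MANAGEMENT="MANUAL", DG_BROKER_START="false")
print(differences(current))
```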
[Diagram sequence: "plug pdb" on the primary. The redo transport path is the usual one (LGWR, SGA redo buffer, online redo logs, NSS/TT, RFS, standby redo logs, MRP); on the standby: PDB RECOVERY REQUIRES TEMPORARY STOP OF CDB RECOVERY.]
Add the standby databases
POST /database/dataguard/databases/
{
"connection_identifier": "site2-scan:1521/mydb",
"database_name": "mydb_site2"
}
Enable the configuration
PUT /database/dataguard/configuration/
{
"operation": "ENABLE"
}
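The two REST calls above can be prepared programmatically. The sketch below only builds the method, path and JSON body for each request, without sending anything; the function names are hypothetical, and the base URL, authentication and HTTP client are deliberately left out because the slides do not show them.

```python
import json
from typing import Tuple

Request = Tuple[str, str, str]  # (HTTP method, path, JSON body)


def add_standby_request(connection_identifier: str, database_name: str) -> Request:
    """Build the request that adds a standby database to the configuration."""
    body = json.dumps({
        "connection_identifier": connection_identifier,
        "database_name": database_name,
    })
    return "POST", "/database/dataguard/databases/", body


def enable_configuration_request() -> Request:
    """Build the request that enables the Data Guard configuration."""
    return "PUT", "/database/dataguard/configuration/", json.dumps({"operation": "ENABLE"})


for method, path, body in (
    add_standby_request("site2-scan:1521/mydb", "mydb_site2"),
    enable_configuration_request(),
):
    print(method, path, body)
```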
DG ADD DATABASE "<database name>" AS CONNECT IDENTIFIER IS <connect identifier> [ INCLUDE CURRENT DESTINATIONS ];
DG CREATE CONFIGURATION "<config_name>" AS PRIMARY DATABASE IS <database name> CONNECT IDENTIFIER IS <connect_identifier>
[ INCLUDE CURRENT DESTINATIONS ];
DG DISABLE CONFIGURATION;
DG DISABLE { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <member name>;
DG EDIT CONFIGURATION SET PROPERTY <property name> = '<property value>';
DG EDIT { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <member name> SET PROPERTY <property name> = '<property value>';
DG ENABLE CONFIGURATION;
DG ENABLE { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <member name>;
DG FAILOVER TO <database name> [IMMEDIATE];
DG REINSTATE DATABASE <database name>;
DG REMOVE CONFIGURATION [PRESERVE DESTINATIONS];
DG REMOVE { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <name> [PRESERVE DESTINATIONS];
DG SHOW CONFIGURATION [<property name>];
DG SHOW DATABASE <database name> [<property name>];
DG SWITCHOVER TO <database name> [WAIT [<timeout in seconds>]];
• Far Sync can now be used with Fast-Start Failover in Max Performance mode (ADG)
Primary can send redo asynchronously to Far Sync.