Oracle (Active) Data Guard

Master Technical Slide Deck - Updated 2022.11.24

Ludovico Caldara
Senior Principal Product Manager
Oracle Database High Availability (HA), Scalability and Maximum Availability Architecture (MAA) Team
@ludodba
http://www.linkedin.com/in/ludovicocaldara
www.ludovicocaldara.net
Oracle (Active) Data Guard & MAA

3 Copyright © 2022, Oracle and/or its affiliates


Impact of database downtime

• $350K: average cost of downtime per hour
• $10M: average cost of unplanned data center outage or disaster
• 87 hours: average amount of downtime per year
• 91%: percentage of companies that have experienced an unplanned data center outage in the last 24 months


Oracle Maximum Availability Architecture (MAA)

Customer insights and expert recommendations
• Reference architectures
• HA features, configurations and operational practices
• Deployment choices: Generic Systems, Engineered Systems, DBCS, ExaCS/ExaCC, Autonomous DB

Service levels: Bronze, Silver, Gold, Platinum

Production site with replication to a replicated site
• Continuous availability: Application Continuity, Global Data Services
• Data protection: Flashback, RMAN + ZDLRA
• Active replication: Active Data Guard, GoldenGate
• Scale out: RAC, ASM, Sharding


MAA reference architectures
Availability service levels

Bronze: Dev, test, prod
• Single instance DB
• Restartable
• Backup/restore

Silver: Prod/departmental
• Bronze +
• Database HA with RAC
• Application continuity

Gold: Business critical
• Silver +
• DB replication with Active Data Guard

Platinum: Mission critical
• Gold +
• GoldenGate
• Edition based redefinition

All tiers are available both on-premises and in the cloud. However, Platinum currently
must be configured manually, while Bronze to Gold are mostly covered by cloud tooling
automation, depending on the desired RTO (for example, FSFO and multiple standby
databases must still be configured manually).


Challenges of deploying highly available systems

Cost and complexity Lack of skills Risk of failure



Oracle Data Guard Overview



Oracle Data Guard (DG)

Diagram: primary site replicating to a secondary site (sync or async replication via in-memory redo), managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)

• Basic DR (included with DB EE)
- License primary and secondary sites
• Active-passive
- Standby is used only for failovers
• Automatic failover to standby site
• Zero / near-zero data loss
• Continuous data validation
• Simple migrations and upgrades

https://www.oracle.com/database/technologies/high-availability/dataguard-activedataguard-demos.html
Data Guard
Capabilities Included with Oracle Database Enterprise Edition (EE)

Data Protection
• Zero or sub-second data loss protection
• Strong isolation using continuous Oracle validation
• Lost-write detection
• Universal support – all data types and applications

High Availability
• Automatic database failover
• Automatic client failover
• Standby-first patch apply
• Database rolling maintenance
• Select platform migrations

Performance and ROI
• Extreme throughput - supports all workloads
• Dual-purpose standby for development and test
• Integrated management
• Comprehensive monitoring with Enterprise Manager


Oracle Active Data Guard
Actively protecting data for the future both on-premises and in the cloud
Features introduced across releases 11.2, 12c, 18c, 19c and 21c:
• Active Data Guard Real-Time Cascade
• Fast Sync
• Broker for Cascaded Standby Databases
• Resumable Switchover Operations
• Rolling Upgrade Using Active Data Guard
• Single Command Role Transitions
• Data Guard Broker PDB Migration or Failover
• Multi-Instance Redo Apply
• Zero Data Loss at any distance – Far Sync
• Protection During Database Rolling Upgrade
• Password Files Synchronization
• Oracle Database In-Memory on Oracle Active Data Guard
• Preserving Application Connections During Role Changes
• Application Continuity (ADG or RAC)
• Updates on ADG (DML Redirect)
• Finer granularity Supplemental Logging
• Flashback Standby when Primary database is flashed back
• In-Memory Column Store on Multi-Instance Redo apply
• Observe only mode for FSFO
• Propagate Restore Points from Primary to Standby site
• Simplified Database Parameter Management
• Dynamically Change FSFO target
• Data Guard per Pluggable Database
• Standby Result Cache preservation
• Fast Start Failover Configuration Validation & Call Outs
• Data Guard Broker Client Side Standardized Directory Structure
• Data Guard Broker Far Sync Instance Creation
• Fast Start Failover Lag Allowance in Max Availability Mode
• FarSync for Max Performance Mode
• PDB recovery isolation
• Transparent Application Continuity
• Automatic Correction of Non-logged Blocks at a Data Guard Standby Database
• RMAN recover standby simplification
• Shadow Lost Write Protection
• AWR reports for the standby workload
• Configurable Real-Time Query Apply Lag Limit
• Integrated Support for Application Failover
• SPA Support for Active Data Guard Environment
• Support Up to 30 Standby Databases


Oracle Data Guard vs Storage Replication



Storage Remote Mirroring Architecture
Mirrors every write to every file including those that are corrupted or encrypted by ransomware

Diagram: the primary volumes (DATAFILES, LOGFILES, CONTROLFILES) of the primary database at the primary site are mirrored to volumes at the standby site through SYNC or ASYNC block replication

• No Oracle validation
• Unusable destination
• 7x network volume
• 27x network I/O
• Standby needs warm-up


Data Guard Does What Storage Mirroring Can’t
Isolate Corruption, Protect Data, Maintain Availability

Storage Remote Mirroring: blocks are just bits on a disk
Data Guard: uses physical and logical data consistency checks for end-to-end data integrity

See My Oracle Support Note 1302539.1 for details
Data Guard is optimized for the database
It efficiently maintains a physical copy of production and guarantees its integrity

• Validation end to end
• Ransomware Protection
• Only the essential information is replicated
• Efficient and performant

Diagram: at the primary site, LGWR writes the redo buffer (SGA) to the online redo logs of the primary database; NSS/TT ships the redo (SYNC or ASYNC redo transport) to RFS at the standby site, which writes the standby redo logs; MRP applies them to the standby database


Data Guard Provides Strongest Fault Isolation and Best Performance

Data Guard transmits redo blocks directly from the SGA: like a memcpy over the network
Redo received / applied by a running Oracle instance: continuous Oracle-integrated data validation

Diagram: Oracle Database architecture, with redo sent from system memory (SGA) over TCP/IP to the standby databases

• Best isolation from lower layer faults
• Best performance since no disk I/O
• Best network utilization: only redo sent
• Transactional consistency: always
• Corrupted blocks auto-repaired *
• Database-integrated application failover
* Requires Active Data Guard License

Oracle Active Data Guard Compared to Storage Remote Mirroring
https://www.oracle.com/a/tech/docs/adg-vs-storage-mirroring.pdf
Oracle Replication done right
https://blogs.oracle.com/maa/replication-done-right
Oracle Data Guard Redo Transport



Data Guard Transport for Best Performance
Data Guard ASYNC Process Architecture
Commit Acknowledge is local-only

Data Guard Processes
1. TT – transmits redo from primary log buffer
2. RFS – receives redo, writes to log file
3. MRP – recovery process on standby database

Diagram: at the primary site, COMMIT and COMMIT ACK involve only LGWR (redo buffer in the SGA, online redo logs); TT sends the redo over Oracle Net to RFS at the standby site, which writes the standby redo logs; MRP applies them to the standby database


Data Guard Transport for Zero Data Loss
Data Guard FASTSYNC Process Architecture
Data Guard Processes
1. NSS – transmits redo from primary log buffer
2. RFS – receives redo, sends ACK back, writes to log file
3. MRP – recovery process on standby database

Diagram: at the primary site, COMMIT waits for the COMMIT ACK while LGWR writes the online redo logs and NSS sends the redo over Oracle Net; RFS at the standby site acknowledges and writes the standby redo logs; MRP applies them to the standby database


Data Guard Transport for Zero Data Loss
Data Guard SYNC Process Architecture
Data Guard Processes
1. NSS – transmits redo from primary log buffer
2. RFS – receives redo, writes to log file, sends ACK back
3. MRP – recovery process on standby database

Diagram: at the primary site, COMMIT waits for the COMMIT ACK while LGWR writes the online redo logs and NSS sends the redo over Oracle Net; RFS at the standby site writes the standby redo logs and then acknowledges; MRP applies them to the standby database


Stalling Synchronous destinations
Data Guard FASTSYNC/SYNC Process Architecture
Data Guard Processes
1. NSS – tries to send the redo to the remote destination
2. The commits stall for NetTimeout seconds
3. The destination is abandoned, the commits resume

Diagram: same process architecture as above (LGWR, NSS, RFS, MRP), with the Oracle Net link between the primary and the standby interrupted
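The transport modes described in these slides differ in what the primary waits for before it acknowledges a commit. The following Python sketch is only a toy model of that difference. Its assumptions are not from the slides: the local redo write and the remote path overlap, the network cost is a single round trip, and all timings are hypothetical inputs. The function and parameter names are illustrative.

```python
def commit_latency_ms(mode, local_write_ms, rtt_ms, standby_write_ms):
    """Estimate the commit latency for a transport mode (toy model)."""
    mode = mode.upper()
    if mode == "ASYNC":
        # Commit acknowledge is local-only: the standby is never waited for.
        return local_write_ms
    if mode == "FASTSYNC":
        # RFS sends the ACK back before writing to the standby redo log.
        remote_ms = rtt_ms
    elif mode == "SYNC":
        # RFS writes to the standby redo log first, then sends the ACK back.
        remote_ms = rtt_ms + standby_write_ms
    else:
        raise ValueError(f"unknown transport mode: {mode}")
    # Assumption: local write and remote shipping overlap, the slower one wins.
    return max(local_write_ms, remote_ms)


def stalled_commit_latency_ms(local_write_ms, net_timeout_s):
    """Worst-case commit latency when a synchronous destination stops answering."""
    # The commits stall for NetTimeout seconds, then the destination is abandoned.
    return max(local_write_ms, net_timeout_s * 1000)


if __name__ == "__main__":
    for mode in ("ASYNC", "FASTSYNC", "SYNC"):
        print(mode, commit_latency_ms(mode, 1.0, 2.0, 1.5), "ms")
```

With these hypothetical timings the model only ranks the modes (ASYNC fastest, SYNC slowest); real latencies should be measured on the actual network.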



The difference between receiving the redo late and not receiving it
DATUM_TIME vs TRANSPORT LAG vs LAST_TIME
Standby not receiving the redo from the primary:
SQL> select value, datum_time from v$dataguard_stats where name='transport lag';

VALUE DATUM_TIME
------------ -------------------
+00 00:00:00 07/11/2022 08:28:46

Standby receiving old redo from the primary:


SQL> select value, datum_time from v$dataguard_stats where name='transport lag';

VALUE DATUM_TIME
------------ -------------------
+01 13:50:54 07/12/2022 21:48:10

The last redo written in the standby logs:


SQL> select max(last_time) from v$standby_log where status='ACTIVE';

MAX(LAST_TIME)
-------------------
07/11/2022 08:28:46
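The two situations can be told apart by comparing the transport lag with the age of DATUM_TIME. A minimal Python sketch of that comparison follows; it assumes the values are read as strings in the formats shown above. The 60-second tolerance and the returned labels are hypothetical choices for illustration, not Oracle thresholds.

```python
import re
from datetime import datetime, timedelta


def parse_lag(value):
    """Convert a '+DD HH:MI:SS' interval string into a timedelta."""
    match = re.fullmatch(r"\+(\d+) (\d{2}):(\d{2}):(\d{2})", value.strip())
    if match is None:
        raise ValueError(f"unexpected interval format: {value!r}")
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def parse_datum_time(value):
    """Parse DATUM_TIME in the MM/DD/YYYY HH24:MI:SS format used above."""
    return datetime.strptime(value, "%m/%d/%Y %H:%M:%S")


def classify(lag_value, datum_time, now, tolerance=timedelta(seconds=60)):
    """Distinguish 'no redo arriving' from 'old redo arriving' (illustrative rule)."""
    staleness = now - parse_datum_time(datum_time)
    if staleness > tolerance:
        # The statistic itself is old: the standby has not heard from the primary.
        return "not receiving redo"
    if parse_lag(lag_value) > tolerance:
        # The statistic is fresh but the redo being received is old.
        return "receiving old redo"
    return "in sync"


if __name__ == "__main__":
    now = datetime(2022, 7, 12, 21, 48, 15)
    print(classify("+00 00:00:00", "07/11/2022 08:28:46", now))
    print(classify("+01 13:50:54", "07/12/2022 21:48:10", now))
```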



High Performance – Synchronous Redo Transport
Mixed OLTP workload with Metro-Area Network Latency

Workload profile
• Swingbench OLTP plus large inserts
• 112 MB/s redo

Results
• 3% impact at < 1ms RTT
• 5% impact at 2ms RTT
• 6% impact at 5ms RTT

Use oratcptest to assess your network bandwidth and latency

Chart: transaction rate (Txn Rate, 0 to 35 000) by network latency

Network latency      No Sync   0ms      2ms      5ms      10ms     20ms
Txns/s               29 496    28 751   27 995   27 581   26 860   26 206
Redo Rate (MB/sec)   116       112      109      107      104      102
% Workload           100%      97%      95%      94%      91%      89%

Note: 0ms latency on graph represents values <1ms
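The "% Workload" row is the transaction rate at each latency relative to the run without synchronous transport. As a quick check, this short Python sketch recomputes it from the Txns/s values of the slide (29 496, 28 751, 27 995, 27 581, 26 860, 26 206). The variable and function names are illustrative.

```python
latencies = ["No Sync", "0ms", "2ms", "5ms", "10ms", "20ms"]
txn_per_sec = [29496, 28751, 27995, 27581, 26860, 26206]


def workload_percent(rates):
    """Transaction rate of each run as a rounded percentage of the first run."""
    baseline = rates[0]
    return [round(100 * rate / baseline) for rate in rates]


def impact_percent(rates):
    """Throughput lost compared with the first (baseline) run."""
    return [100 - pct for pct in workload_percent(rates)]


if __name__ == "__main__":
    for label, pct in zip(latencies, workload_percent(txn_per_sec)):
        print(f"{label:>8}: {pct}% of the baseline transaction rate")
```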



Oracle Data Guard Best Practices – Transport and Apply Tuning

Redo Apply Best Practices
https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/tune-and-troubleshoot-oracle-data-guard.html#GUID-E8C27979-9D37-4899-9306-A5AE2B5CF6C0

Best Practices for Redo Transport Tuning
https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/tune-and-troubleshoot-oracle-data-guard.html#GUID-A6963335-8C5A-4DD0-AD3F-22F4CBCE3DD0

Assessing Synchronous Redo Transport
https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/tune-and-troubleshoot-oracle-data-guard.html#GUID-4C3E0CC9-3E54-48C4-8DD6-AB4EC0C51696

How To Calculate The Required Network Bandwidth Transfer Of Redo In Data Guard (Doc ID 736755.1)
https://support.oracle.com/rs?type=doc&id=736755.1

Assessing and Tuning Network Performance for Data Guard and RMAN (Doc ID 2064368.1)
https://support.oracle.com/rs?type=doc&id=2064368.1


Oracle Data Guard Role Transitions



Oracle Data Guard Planned Role Transition
Switchover: Planned role transition with Zero Data Loss

Diagram: primary site and secondary site, managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)

• Switchover initiated
• The primary ends the transactions and stops the services
• All the transactions are synced to the standby
• The standby is converted to primary and the services are started
• The replication starts again
• The applications reconnect transparently to the new primary
• If properly configured, the application experiences just a freeze of 1-2 minutes or less


Oracle Data Guard Unplanned Role Transition
Failover: In case of failure the role transition can be without data loss

Diagram: primary site, secondary site and observer, managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)

• The observer detects the failure of the primary
• Depending on the protection mode and situation, the observer initiates the failover after FastStartFailoverThreshold seconds
• The standby is converted to primary and the services are started
• Depending on the protection mode and situation, there might be some data loss (the tolerated amount is configurable)
• The applications reconnect to the new primary
• The reinstatement of the primary requires a single broker command
• The failover can also be initiated manually (DGMGRL) or by the application (DBMS_DG.INITIATE_FS_FAILOVER). The amount of data loss is the customer's responsibility in this case.
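The role of FastStartFailoverThreshold can be pictured as a grace period: a short outage of the primary does not trigger anything, a longer one does. The Python sketch below illustrates only that timing rule. It is a simplification with hypothetical names; the real broker also considers the protection mode, the readiness of the standby and other conditions that are not modeled here.

```python
def failover_time(last_contact_s, threshold_s, primary_back_at_s=None):
    """Return the time at which a failover would be initiated, or None.

    last_contact_s    -- when the primary was last reachable (seconds)
    threshold_s       -- stand-in for FastStartFailoverThreshold
    primary_back_at_s -- when the primary becomes reachable again, if it does
    """
    deadline = last_contact_s + threshold_s
    if primary_back_at_s is not None and primary_back_at_s < deadline:
        # The outage was shorter than the threshold: no failover.
        return None
    return deadline


if __name__ == "__main__":
    print(failover_time(100, 30))        # primary never comes back
    print(failover_time(100, 30, 120))   # primary is back within the threshold
```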
"FAILOVER TO" vs "DBMS_DG.INITIATE_FS_FAILOVER"

FAILOVER TO stdby (possible split-brain)
1. The standby database becomes the new primary
2. The former primary shuts down after FastStartFailoverThreshold
3. The former primary is reinstated automatically

DBMS_DG.INITIATE_FS_FAILOVER
1. The procedure shuts down the primary first
2. The standby database becomes the new primary
3. The former primary is not reinstated


Data Guard Snapshot Standby
Standby database temporarily in Read Write

Diagram: primary site and secondary site, managed by Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL)

• The standby is converted to Snapshot Standby
• Standby open read write
• Users and DBAs perform tests (Upgrade, Performance, etc)
• The primary is still protected by the redo transfer
• When the tests are over, the standby is flashed back and converted to physical standby again
• Note: the snapshot standby cannot relay the redo to a cascaded standby


Oracle Data Guard Role Transitions – Read More

Role Transition Assessment and Tuning
https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/tune-and-troubleshoot-oracle-data-guard.html#GUID-CBA9FC61-9894-4D62-9569-EFBD7960267F


Client Failover and Application Continuity


Services for Location Transparency and High Availability
Services provide a “dial in number” for your application

• Use Custom services with FAN notifications and Application Continuity
• Regardless of location, application keeps the name!
• Client failover best practices across the Oracle technology stack: Real Application Clusters, Active Data Guard, GoldenGate, PDB Relocation


Connections Appear Continuous
Standard for All Drivers from 12.2
Automatic retries until the service is available

HR = (DESCRIPTION =
  (CONNECT_TIMEOUT=120)(RETRY_COUNT=50)(RETRY_DELAY=3)
  (TRANSPORT_CONNECT_TIMEOUT=3)
  (ADDRESS_LIST =
    (LOAD_BALANCE=on)
    (ADDRESS=(PROTOCOL=TCP)(HOST=cluster1-scan)(PORT=1521)))
  (ADDRESS_LIST =
    (LOAD_BALANCE=on)
    (ADDRESS=(PROTOCOL=TCP)(HOST=cluster2-scan)(PORT=1521)))
  (CONNECT_DATA=(SERVICE_NAME = HR.oracle.com)))

Always use a custom service!
Do NOT use PDB or DB Name
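The effect of RETRY_COUNT and RETRY_DELAY in the descriptor can be pictured as a loop over the address lists that is repeated after a pause. The Python sketch below simulates that loop with a caller-supplied `try_connect` function. It is not Oracle Net code: the function names are hypothetical, and timeouts and load balancing are deliberately left out.

```python
import time


def connect_with_retries(address_lists, try_connect, retry_count, retry_delay,
                         sleep=time.sleep):
    """Walk the address lists; repeat the whole walk after a delay on failure."""
    attempts = 0
    for round_number in range(retry_count + 1):  # first pass plus the retries
        for host in address_lists:
            attempts += 1
            if try_connect(host):
                return host, attempts
        if round_number < retry_count:
            sleep(retry_delay)  # wait before traversing the lists again
    raise ConnectionError(f"service unavailable after {attempts} attempts")


if __name__ == "__main__":
    calls = []

    def fake_connect(host):
        # Hypothetical stand-in: the service comes up on the second site later.
        calls.append(host)
        return host == "cluster2-scan" and len(calls) >= 4

    print(connect_with_retries(["cluster1-scan", "cluster2-scan"],
                               fake_connect, retry_count=50, retry_delay=0))
```

With RETRY_COUNT=50 and RETRY_DELAY=3, as in the descriptor above, such a loop keeps trying for at least 150 seconds of accumulated delays before giving up, which is what lets a connection attempt ride out a role transition.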



Client-side required technologies

Client draining/failover is a crucial part of high availability for applications connecting to the database.

PLANNED MAINTENANCE
• Fast Application Notification (Session Draining)
• ONS

UNPLANNED MAINTENANCE
• Application Continuity (Transaction Replay) [1]
• Replay Driver
• Transaction Guard

[1] Application Checklist for Continuous Service for MAA Solutions
https://www.oracle.com/technetwork/database/clustering/checklist-ac-6676160.pdf
Fast Application Notification
Session Draining for planned maintenance

1. The clients register with ONS and connect to the service CRM_SVC (Real Application Clusters / Data Guard)
2. CRM_SVC is stopped on one side and started on the other
3. The clients disconnect when the transaction is over and reconnect


Fast Connection Failover (FCF)
FAN integrated in connection pools

• Pre-configured FAN integration
• Uses connection pools
• The application must be pool aware (borrow/release)
• The connection pool leverages FAN events to:
- Quickly remove dead connections on a DOWN event
- (opt.) Rebalance the load on an UP event


• UCP (Universal Connection Pool, ucp.jar) and WebLogic Active GridLink handle FAN out of the box.
No code changes! Just enable FastConnectionFailoverEnabled
• Third-party connection pools can implement FCF
- If JDBC driver version >= 12.2
- simplefan.jar and ons.jar in CLASSPATH
- Connection validation options are set in pool properties
- Connection pool can plug javax.sql.ConnectionPoolDataSource
- Connection pool checks connections at borrow/release
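What a pool does with FAN events can be shown with a toy model: on a DOWN event it drops the idle connections of the affected instance right away, so the application never borrows a dead one. The Python class below is purely illustrative. It is not the UCP API, and the class, method and instance names are hypothetical.

```python
class ToyPool:
    """Tiny connection pool reacting to FAN-like events (illustrative only)."""

    def __init__(self):
        self.idle = []  # list of (connection_id, instance_name) tuples

    def add(self, connection_id, instance):
        self.idle.append((connection_id, instance))

    def on_fan_event(self, status, instance):
        """Handle a DOWN or UP event and return the connections removed."""
        if status == "DOWN":
            # Remove dead connections at once instead of waiting for timeouts.
            dead = [c for c in self.idle if c[1] == instance]
            self.idle = [c for c in self.idle if c[1] != instance]
            return dead
        if status == "UP":
            # A real pool may rebalance here; the toy model does nothing.
            return []
        raise ValueError(f"unknown status: {status}")

    def borrow(self):
        if not self.idle:
            raise LookupError("no idle connection available")
        return self.idle.pop(0)

    def release(self, connection):
        self.idle.append(connection)


if __name__ == "__main__":
    pool = ToyPool()
    pool.add(1, "inst1")
    pool.add(2, "inst2")
    print(pool.on_fan_event("DOWN", "inst1"))
    print(pool.borrow())
```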



Application Continuity (AC)
Protects the in-flight transaction from failures and disconnections

1. The replay driver keeps track of the in-flight transaction on the service CRM_SVC (Real Application Clusters / Active Data Guard)
2. After a failure or disconnection, the transaction status is checked (Transaction Guard)
3. The transaction is replayed if necessary

• AC with UCP: no code change

PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
pds.setConnectionFactoryClassName("oracle.jdbc.replay.OracleDataSourceImpl");
...
Connection conn = pds.getConnection(); // Implicit database request begin
// calls protected by Application Continuity
conn.close(); // Implicit database request end

• AC without connection pool: code change

OracleDataSourceImpl ods = new OracleDataSourceImpl();
Connection conn = ods.getConnection();
...
((ReplayableConnection)conn).beginRequest(); // Explicit database request begin
// calls protected by Application Continuity
((ReplayableConnection)conn).endRequest(); // Explicit database request end
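The check-then-replay sequence of Application Continuity can be summarized in a few lines. The Python sketch below is a conceptual model only: it assumes the commit outcome of the interrupted transaction is already known (the role of Transaction Guard) and that the calls made since the request began were recorded. The function name, the outcome strings and the `execute` callback are hypothetical, and the validation a real replay performs is not modeled.

```python
def handle_outage(recorded_calls, commit_outcome, execute):
    """Decide what to do with an interrupted request (conceptual sketch).

    recorded_calls -- calls issued since the request began, in order
    commit_outcome -- "COMMITTED" or "NOT_COMMITTED" for the last transaction
    execute        -- callable that re-issues one call on a new session
    """
    if commit_outcome == "COMMITTED":
        # The work is already durable: replaying would apply it twice.
        return "committed: nothing to replay"
    if commit_outcome != "NOT_COMMITTED":
        raise ValueError(f"unknown outcome: {commit_outcome}")
    for call in recorded_calls:
        execute(call)  # replay in the original order
    return "replayed"


if __name__ == "__main__":
    replayed = []
    print(handle_outage(["insert", "update"], "NOT_COMMITTED", replayed.append))
    print(replayed)
```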



Transparent Application Continuity (TAC)
Application Continuity for every connection and application type

• Introduced in 18c for JDBC thin, 19c for OCI (Oracle Call Interface)
• Records session and transaction state server-side
• No application change
• Works without connection pools (although they are still recommended)
• Replayable transactions are replayed
• Non-replayable transactions raise exception
• Good driver coverage but check the doc!
• Side effects are never replayed



Key Differences between FAN, AC, and TAC

     | Best for                            | Since Version | Application Changes                  | Requires Connection Pool  | Replay Side Effects | JDBC/OCI
FAN  | Planned Maintenance                 | 10g           | Catch FAN events (or use UCP)        | No, but recommended (FCF) | N/A                 | Both
AC   | Unplanned Maintenance               | 12c           | Use explicit boundaries (or use UCP) | Yes                       | Yes (Choose)        | Both
TAC  | Unplanned Maintenance (Recommended) | 19c           | No                                   | No, but recommended       | Never               | Both


Documentation!

OCI
JDBC



Fast-Start Failover:
Oracle Data Guard Observer



Network partitioning
When did the primary disconnect?

Diagram: primary site and secondary site, with the connection between them in doubt

-- THE LAST TIME THE STANDBY HEARD FROM THE PRIMARY (1 second tolerance)
SQL> select datum_time, (sysdate-to_date(datum_time,'MM/DD/YYYY HH24:MI:SS'))*86400 secs_ago
2> from v$dataguard_stats where name='transport lag';

DATUM_TIME                     SECS_AGO
------------------------------ ----------
07/11/2022 08:28:46            90361

The columns might be null in some cases
Do not rely on the transport lag value
Network partitioning
Up to which point can the standby recover?

Diagram: primary site and secondary site (standby redo logs), with the connection between them in doubt

-- THE TIMESTAMP OF THE LAST REDO ENTRY RECEIVED FROM THE PRIMARY
SQL> alter session set nls_date_format='MM/DD/YYYY HH24:MI:SS';
SQL> select coalesce(max(s.last_time), max(a.next_time)) as last_redo_from_prim,
2> (sysdate-coalesce(max(s.last_time), max(a.next_time)))*86400 as secs_ago
3> from v$standby_log s , v$archived_log a ;

LAST_REDO_FROM_PRIM SECS_AGO
------------------- ----------
07/11/2022 08:28:46 91061

Is it a reliable way to calculate data loss?


Network partitioning
Is there a way to calculate the data loss upon failover?

Diagram: primary site and secondary site, with the connection between them in doubt

WHAT IS THE PRIMARY DOING? DID IT CRASH? DID IT STALL? DID IT KEEP COMMITTING?
• Primary still committing
- DATA LOSS and possible split brain
• Primary crashed
- Maybe no data loss
Oracle Data Guard Observer acts as a quorum
Automatic failover when the Primary Database is unavailable

Diagram: primary site, observer site (observer) and secondary site

• The observer monitors both primary and standby
• Standby isolated:
- The primary keeps writing
• Observer isolated:
- The configuration keeps working unobserved
• Primary isolated:
- Failover! The primary loses the quorum and stops committing
• The observer can work in “OBSERVE ONLY” mode
- Reports a failure without failing over


Fast-Start Failover callouts (new in 21c)
Execute custom actions before and after the automatic failover occurs

$ cat $DG_ADMIN/config_ConfigName/callout/fsfocallout.ora
# The pre-callout script is run before failover
FastStartFailoverPreCallout=fsfo_precallout.sh
FastStartFailoverPreCalloutTimeout=1200
FastStartFailoverPreCalloutSucFileName=fsfo_precallout.suc
FastStartFailoverPreCalloutErrorFileName=precallout.err
FastStartFailoverActionOnPreCalloutFailure=STOP
# The post-callout script is run after failover succeeds
FastStartFailoverPostCallout=fsfo_postcallout.sh
$

Diagram: primary site, observer site (observer) and secondary site, with the sequence:
1. fsfo_precallout.sh
2. Failover
3. fsfo_postcallout.sh


Oracle Data Guard Observer Placement
Observer at the primary site: OK

Diagram: primary site (primary DB and observer) and standby site (standby DB)
Failure scenarios: Primary DB, Standby DB, Network, Primary Site, Standby Site, Observer

Outcomes shown across the scenarios:
• Primary keeps writing unprotected
• Manual failover required
• Configuration keeps working unobserved
Oracle Data Guard Observer Placement
Observer at the standby site: BAD!

Diagram: primary site (primary DB) and standby site (standby DB and observer)
Failure scenarios: Primary DB, Standby DB, Network, Primary Site, Standby Site, Observer

Outcomes shown across the scenarios:
• Automatic failover!
• Primary keeps writing unprotected
• Automatic failover! Primary shuts down to prevent split-brain!
• No database is available for writes! Primary shuts down to prevent split-brain!
• Configuration keeps working unobserved
Oracle Data Guard Observer Placement
Observer at an external site: BEST

Diagram: primary site (primary DB), standby site (standby DB) and a separate observer site (observer)
Failure scenarios: Primary DB, Standby DB, Network, Primary Site, Standby Site, Observer

Outcomes shown across the scenarios:
• Primary keeps writing unprotected (shown for two scenarios)
• Configuration keeps working unobserved


Oracle Data Guard Observer High Availability
Up to three(*) observers configured (one active at a time)

Diagram: primary site, secondary site and three observer sites (Observer Site 1: inactive observer, Observer Site 2: active observer, Observer Site 3: inactive observer)

• Optimal: 2 or 3 different Regions/Data Centers/ADs
• Ensure there are no SPOFs (network, power…)
• If one observer fails, another is promoted

* Four starting with 21c
Oracle Data Guard Observer High Availability
Tolerate observer site failure but avoid observer on the standby site

Diagram: primary site (primary DB and an observer), Observer Site 1 (active observer) and standby site (standby DB and an observer)

edit database db_site1 set property PreferredObserverHosts='obs_ext,obs_site1';
edit database db_site2 set property PreferredObserverHosts='obs_ext,obs_site2';
Oracle Data Guard Observer High Availability
Optimal configuration with two sites

Diagram: primary site (primary DB) and standby site (standby DB) with four observers in total: the active observer, two more observers, and a 4th observer in 21c only

edit database db_site1 set property PreferredObserverHosts='obs1_site1,obs2_site1';
edit database db_site2 set property PreferredObserverHosts='obs1_site2,obs2_site2';


Oracle Data Guard Observer High Availability
Observer promotion requires both primary and standby databases

Diagram: primary site (primary DB) and standby site (standby DB), with the active observer and two other observers. No observer promotion!

• The surviving observer cannot tell if the configuration isn't working on the primary site: a failover would cause split-brain


Multiple Fast-Start Failover Targets
Don’t let a database failure compromise your protection

DGMGRL> edit database BOSTON set property FastStartFailoverTarget='NASHUA,NEWYORK';
DGMGRL> edit database NASHUA set property FastStartFailoverTarget='BOSTON,NEWYORK';
DGMGRL> edit database NEWYORK set property FastStartFailoverTarget='BOSTON,NASHUA';
DGMGRL> show fast_start failover;
...
Active Target: NASHUA
Potential Targets: NEWYORK
NEWYORK valid
...

Diagram: observer, primary DB, FSFO target and FSFO candidate

Always use at least two standbys in Max Protection!
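The FastStartFailoverTarget lists above read as an ordered preference: the first database in the list that is usable becomes the active target, and the others remain candidates. The Python sketch below illustrates that reading with the database names from the slide. The notion of "ready" and the function name are hypothetical simplifications of what the broker actually checks.

```python
# FastStartFailoverTarget values as configured in the DGMGRL commands above.
FSFO_TARGETS = {
    "BOSTON": ["NASHUA", "NEWYORK"],
    "NASHUA": ["BOSTON", "NEWYORK"],
    "NEWYORK": ["BOSTON", "NASHUA"],
}


def pick_target(primary, ready):
    """Return the first preferred target of `primary` that is ready, else None."""
    for name in FSFO_TARGETS[primary]:
        if name in ready:
            return name
    return None


if __name__ == "__main__":
    print(pick_target("BOSTON", {"NASHUA", "NEWYORK"}))  # preferred target
    print(pick_target("BOSTON", {"NEWYORK"}))            # NASHUA unavailable
```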



Network partitions and consistency
Multiple Fast-Start Failover Targets

Diagram: observer, primary DB, FSFO target, FSFO candidate and remote bystander


Primary Isolated
• The observer can still contact the FSFO target.

What happens:
1. The primary is STALLED.
2. The observer initiates the failover to FSFO Target.
3. The FSFO Target becomes primary.
4. The new primary and the observer agree to a new FSFO Target.
5. The former Primary DB will require a reinstate.

Result: execution of Fast Start Failover

Diagram: observer, former primary DB (reinstate needed), FSFO target (new primary), FSFO candidate (new target) and remote bystander


FSFO Target Isolated
• The observer can still contact the primary.

What happens:
1. The primary goes temporarily UNSYNCHRONIZED (no FSFO, unless Max Protection).
2. The primary and the observer agree on a new FSFO Target.
3. As soon as the new FSFO target is ready, FSFO is possible again.

Result: Fast Start Failover is not possible during the target switch, then possible again

Diagram: observer, primary DB, FSFO target (unsynchronized), FSFO candidate (new target) and remote bystander


Network partitions and consistency
Multiple Fast-Start Failover Targets

Primary and FSFO Target Isolated
• The observer can contact neither the primary nor the FSFO target.

What happens:
1. The observer cannot tell whether only the network is unreachable or the whole site is down. The primary might still write to the standby (valid LAD destination).
2. The primary and the FSFO target keep working; the configuration is UNOBSERVED.
3. The observer cannot initiate a failover, as it would lead to split-brain and data loss.

Fast-Start Failover not possible

[Diagram: Observer; Primary DB (primary unobserved); FSFO Target; FSFO Candidate; Remote bystander]


Network partitions and consistency
Multiple Fast-Start Failover Targets

Primary and Observer Isolated
• No FSFO target or candidates can be contacted.

What happens:
1. The primary keeps working without protection (unless Max Protection).

Fast-Start Failover not possible

[Diagram: Observer; Primary DB (unsynchronized); FSFO Target; FSFO Candidate; Remote bystander]
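The four network partition scenarios can be summarized as a small lookup. The Python sketch below is only an illustration: the enum, the field names and the table are hypothetical, and the outcomes restate the slides rather than model the broker.

```python
from dataclasses import dataclass
from enum import Enum


class Partition(Enum):
    """Which members are cut off (labels taken from the slide titles)."""
    PRIMARY_ISOLATED = "primary isolated"
    FSFO_TARGET_ISOLATED = "FSFO target isolated"
    PRIMARY_AND_TARGET_ISOLATED = "primary and FSFO target isolated"
    PRIMARY_AND_OBSERVER_ISOLATED = "primary and observer isolated"


@dataclass(frozen=True)
class Outcome:
    failover_executed: bool  # does the observer start a fast-start failover?
    note: str


# Hypothetical summary table: it only restates the four slides.
OUTCOMES = {
    Partition.PRIMARY_ISOLATED: Outcome(
        True, "primary stalls, FSFO target becomes primary, old primary needs reinstate"),
    Partition.FSFO_TARGET_ISOLATED: Outcome(
        False, "no FSFO while a new target is agreed, then FSFO is possible again"),
    Partition.PRIMARY_AND_TARGET_ISOLATED: Outcome(
        False, "configuration is UNOBSERVED, a failover could cause split-brain"),
    Partition.PRIMARY_AND_OBSERVER_ISOLATED: Outcome(
        False, "primary keeps working without protection (unless Max Protection)"),
}


def describe(partition: Partition) -> str:
    """Render one scenario as a single line of text."""
    outcome = OUTCOMES[partition]
    verb = "failover" if outcome.failover_executed else "no failover"
    return f"{partition.value}: {verb} ({outcome.note})"


for p in Partition:
    print(describe(p))
```

Only the first scenario ends in a failover: the observer acts when it has lost the primary but can still reach the FSFO target.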


Fast-Start Failover:
Oracle Data Guard Protection Modes



Data Guard Fast-Start Failover Protection Modes
Balance Data Protection with Performance and Availability

What does the primary do if the standby does not acknowledge the transaction?

MAXIMUM PERFORMANCE: Never wait
Never waits for the acknowledgment (ASYNC).
Some transactions might get lost when the primary fails.

MAXIMUM AVAILABILITY: Wait a bit
Waits up to NetTimeout seconds (SYNC or FASTSYNC), then continues without the standby.
Data loss is possible with a manual failover or within the specified limit (21c).

MAXIMUM PROTECTION: Wait forever
Waits until the standby is available again (SYNC only).
No transactions are lost, ever.

[Diagram: scale from Performance to Protection: Data loss (Maximum Performance), No/minimal data loss (Maximum Availability), No data loss (Maximum Protection)]
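As a rough illustration of the three behaviors, the Python sketch below maps a protection mode and the time spent waiting for the standby to the primary's reaction. It is a simplification under stated assumptions: the function and enum names are hypothetical, the 30-second NetTimeout is just an example value, and the observer's role under Fast-Start Failover is ignored.

```python
from enum import Enum


class Mode(Enum):
    MAX_PERFORMANCE = "MaxPerformance"
    MAX_AVAILABILITY = "MaxAvailability"
    MAX_PROTECTION = "MaxProtection"


def primary_reaction(mode: Mode, seconds_without_ack: float, net_timeout: float) -> str:
    """What the primary does with a commit while the standby is not acknowledging."""
    if mode is Mode.MAX_PERFORMANCE:
        return "commit"                    # ASYNC: never waits
    if mode is Mode.MAX_AVAILABILITY:
        if seconds_without_ack < net_timeout:
            return "wait"                  # still inside NetTimeout
        return "commit without standby"    # continues without the standby
    return "wait"                          # MaxProtection: waits until the standby is back


NET_TIMEOUT = 30  # example value in seconds (assumption)
for mode in Mode:
    print(mode.value, [primary_reaction(mode, t, NET_TIMEOUT) for t in (0, 10, 60)])
```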


Automatic Failover with MaxPerformance
Choose how much data loss you can tolerate
1. ASYNC Transport. The Standby has a residual lag.
   Status: TARGET UNDER LAG LIMIT
2. At t0 + ping time, the primary cannot contact the standby. The Primary keeps committing, the Standby lag increases.
   Status: TARGET UNDER LAG LIMIT
3. The primary reaches FastStartFailoverLagLimit. It temporarily stalls (~3 seconds) until it gets permission from the observer to continue.
   Status: STALLED
4. After the observer pings the primary and gives permission to continue, the primary resumes the commit activity.
   Status: TARGET OVER LAG LIMIT
5. The observer acknowledges that the lag is above the limit and will not permit a failover in case it loses connectivity with the primary. The standby is declared out of sync.
   Status: TARGET OVER LAG LIMIT

[Chart: SCN over TIME for Primary and Standby, STANDBY disconnected at t0; the potential lag grows past FastStartFailoverLagLimit (steps 1 to 5). Lag above the limit: no failover]


Automatic Failover with MaxPerformance
Choose how much data loss you can tolerate
1. ASYNC Transport. The Standby has a residual lag.
   Status: TARGET UNDER LAG LIMIT
2. At t0 the primary increases the activity rate. The Standby lag increases.
   Status: TARGET UNDER LAG LIMIT
3. The primary reaches FastStartFailoverLagLimit. It temporarily stalls (~3 seconds) until it gets permission from the observer to continue.
   Status: STALLED
4. After the observer pings the primary and gives permission to continue, the primary resumes the commit activity.
   Status: TARGET OVER LAG LIMIT
5. The observer acknowledges that the lag is above the limit and will not permit a failover in case it loses connectivity with the primary. The standby is declared out of sync.
   Status: TARGET OVER LAG LIMIT
6. The standby catches up with the primary and the lag goes under the limit. The observer can now fail over if the primary fails.
   Status: TARGET UNDER LAG LIMIT

[Chart: SCN over TIME for Primary and Standby, STANDBY accumulates lag from t0 (steps 1 to 6). OVER LAG: no failover]

Automatic Failover with MaxPerformance
Choose how much data loss you can tolerate
FastStartFailoverLagLimit ≤ FastStartFailoverThreshold

1. ASYNC Transport. The Standby has a residual lag.
   Status: TARGET UNDER LAG LIMIT
2. At t0 + ping time, the observer cannot contact the primary and starts a timer for FastStartFailoverThreshold seconds. The Primary keeps committing, the Standby lag increases.
   Status: TARGET UNDER LAG LIMIT
3. The primary reaches FastStartFailoverLagLimit. It cannot obtain permission from the observer to continue. It stalls or shuts down depending on FastStartFailoverPmyShutdown.
   Status: STALLED
4. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover.
   Status: REINSTATE REQUIRED

[Chart: SCN over TIME for Primary and Standby, PRIMARY disconnected at t0; the primary STALLs at FastStartFailoverLagLimit (step 3) before the failover at FastStartFailoverThreshold (step 4). The potential lag is the DATA LOSS]


Automatic Failover with MaxPerformance
Set the FastStartFailoverLagLimit wisely to avoid split-brain conditions

FastStartFailoverThreshold < FastStartFailoverLagLimit

1. ASYNC Transport. The Standby has a residual lag.
   Status: TARGET UNDER LAG LIMIT
2. At t0 + ping time, the observer cannot contact the primary and starts a timer for FastStartFailoverThreshold seconds. The Primary keeps committing, the Standby lag increases.
   Status: TARGET UNDER LAG LIMIT
3. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover.
   Primary status: TARGET UNDER LAG LIMIT
   Standby status: REINSTATE REQUIRED
4. The Primary keeps committing until it reaches FastStartFailoverLagLimit, then stalls or shuts down. A Split-Brain condition may occur depending on timings and parameter values.
   Primary status: STALLED or REINSTATE REQUIRED
   Standby status: REINSTATE REQUIRED

[Chart: SCN over TIME for Primary and Standby, PRIMARY disconnected at t0; the failover at FastStartFailoverThreshold (step 3) comes before the primary reaches FastStartFailoverLagLimit (step 4): DATA LOSS and possible SPLIT BRAIN]
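The difference between the two parameter orderings comes down to which event happens first after the primary is isolated. The Python sketch below orders the two events on a timeline. It rests on simplifying assumptions that are not in the slides: the lag grows by one second per second of isolation and the ping time is ignored. The function name and the example values are hypothetical.

```python
def primary_isolated_timeline(lag_at_t0: float, lag_limit: float, threshold: float):
    """Order the two events that follow the isolation of the primary at t0.

    All values are in seconds. Returns the ordered events and the length of
    the window in which the old primary still commits after the failover.
    """
    stall_at = max(0.0, lag_limit - lag_at_t0)  # primary reaches FastStartFailoverLagLimit
    failover_at = float(threshold)              # observer timer reaches FastStartFailoverThreshold
    events = sorted(
        [(stall_at, "primary stalls or shuts down"),
         (failover_at, "observer initiates failover")],
        key=lambda event: event[0],
    )
    split_brain_window = max(0.0, stall_at - failover_at)
    return events, split_brain_window


# LagLimit <= Threshold: the primary stops before the failover.
print(primary_isolated_timeline(lag_at_t0=2, lag_limit=10, threshold=30))
# Threshold < LagLimit: the failover happens while the primary still commits.
print(primary_isolated_timeline(lag_at_t0=2, lag_limit=60, threshold=30))
```

In this model, keeping FastStartFailoverLagLimit at or below FastStartFailoverThreshold always yields a zero-length window, which is the configuration the slides recommend.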


Automatic Failover with MaxAvailability
Failover only if Zero Data Loss is guaranteed

1. SYNC Transport. The Standby is synchronized.
   Status: SYNCHRONIZED
2. At t0 + ping time, the primary and the observer cannot contact the standby. The primary stalls and keeps retrying for NetTimeout seconds.
   Status: STALLED
3. At NetTimeout seconds, the primary asks for and obtains permission from the observer to stop the redo transport to the destination. The standby is declared unsynchronized.
   Status: UNSYNCHRONIZED
4. The Observer knows that the standby is out of sync. If the connection with the primary is lost later and the standby is back, the observer will not initiate a failover because of the unsynchronized status, unless FastStartFailoverLagLimit is set.
   Status: UNSYNCHRONIZED

[Chart: SCN over TIME for Primary and Standby, STANDBY disconnected at t0; the primary STALLs for NetTimeout (steps 2 and 3), then the potential data loss grows: no failover (step 4)]


Automatic Failover with MaxAvailability
Failover only if Zero Data Loss is guaranteed
1. SYNC Transport. The Standby is synchronized.
   Status: SYNCHRONIZED
2. At t0 + ping time, the observer cannot contact the primary and starts a timer for FastStartFailoverThreshold seconds. The Primary stalls and keeps retrying for NetTimeout seconds.
   Status: STALLED
3. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover. If NetTimeout is higher than the threshold, the Primary is reachable but not committing (read-only split).
   Primary status: STALLED
   Standby status: REINSTATE REQUIRED
4. The Primary keeps stalling or shuts down after NetTimeout because it cannot get permission from the observer to abandon the destination. A lower NetTimeout reduces the potential for a read-only split.
   Status: REINSTATE REQUIRED

[Chart: SCN over TIME for Primary and Standby, PRIMARY disconnected at t0; the primary STALLs during NetTimeout; between FastStartFailoverThreshold (step 3) and NetTimeout (step 4) there is a Read-Only Split]


Automatic Failover with MaxAvailability (NEW IN 21c)
Optionally choose a limit to have an automatic failover with potential data loss
FastStartFailoverLagLimit ≤ FastStartFailoverThreshold

1. SYNC Transport. The Standby is synchronized.
2. At t0 + ping time, the primary cannot contact the standby. The primary stalls and keeps retrying for NetTimeout seconds. The observer starts the timer.
3. At NetTimeout seconds, the primary stops shipping the redo without asking permission from the observer, because FastStartFailoverLagLimit is set. The observer does not acknowledge the unsynchronized status.
4. The primary reaches FastStartFailoverLagLimit. It cannot obtain permission from the observer to continue. It stalls or shuts down depending on FastStartFailoverPmyShutdown.
5. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover.

[Chart: SCN over TIME for Primary and Standby, PRIMARY disconnected at t0; STALL during NetTimeout, then the potential lag grows until the STALL at FastStartFailoverLagLimit (step 4), followed by the failover at FastStartFailoverThreshold (step 5) with DATA LOSS]


Automatic Failover with MaxAvailability (NEW IN 21c)
Set the FastStartFailoverLagLimit wisely to avoid split-brain conditions
FastStartFailoverThreshold < FastStartFailoverLagLimit

1. SYNC Transport. The Standby is synchronized.
2. At t0 + ping time, the primary cannot contact the standby. The primary stalls and keeps retrying for NetTimeout seconds. The observer starts the timer.
3. At NetTimeout seconds, the primary switches to ASYNC redo transport without requiring permission from the observer, because FastStartFailoverLagLimit is set. The observer does not acknowledge the unsynchronized status.
4. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover.
5. The Primary keeps committing until it reaches FastStartFailoverLagLimit, then stalls or shuts down. A Split-Brain condition may occur depending on timings and parameter values.

[Chart: SCN over TIME for Primary and Standby, PRIMARY disconnected at t0; STALL during NetTimeout, then the failover at FastStartFailoverThreshold (step 4) comes before the primary reaches FastStartFailoverLagLimit (step 5): DATA LOSS and possible SPLIT BRAIN]


Automatic Failover with MaxProtection
Zero Data Loss in any condition

1. SYNC Transport. The Standby is synchronized.
   Status: SYNCHRONIZED
2. At t0 + ping time, the primary cannot contact the standby and keeps retrying for NetTimeout seconds.
   Status: STALLED
3. After NetTimeout, the primary still has connectivity with the observer. Knowing that a failover did not occur, it keeps trying forever.
   Status: STALLED

[Chart: SCN over TIME for Primary and Standby, STANDBY disconnected at t0; the primary STALLs from step 2 onward, during and after NetTimeout]


Automatic Failover with MaxProtection
Zero Data Loss in any condition

1. SYNC Transport. The Standby is synchronized.
   Status: SYNCHRONIZED
2. At t0 + ping time, the observer cannot contact the primary and starts a timer for FastStartFailoverThreshold seconds. The Primary stalls and keeps retrying for NetTimeout seconds.
   Status: STALLED
3. The observer timer reaches FastStartFailoverThreshold. Still no connection with the primary: it initiates the Failover. If NetTimeout is higher than the threshold, the Primary is reachable but not committing (read-only split).
   Former Primary Status: STALLED
   New Primary Status: REINSTATE_REQUIRED
4. The primary keeps stalling or shuts down after NetTimeout depending on FastStartFailoverPmyShutdown. A lower NetTimeout reduces the potential for a read-only split.
   Former Primary Status: STALLED or REINSTATE_REQUIRED
   New Primary Status: REINSTATE_REQUIRED

[Chart: SCN over TIME for Primary and Standby, PRIMARY disconnected at t0; the primary STALLs during NetTimeout; between FastStartFailoverThreshold (step 3) and NetTimeout (step 4) there is a Read-Only Split]


Multi-Instance Redo Apply (MIRA)



Single-Instance Redo Apply (SIRA)

[Diagram: at the Primary Site, Primary Instances 1, 2 and 3 each ship their redo thread (Thread 1, 2, 3) through NFS/TT to an RFS process at the Secondary Site, which writes the standby redo logs (SRL). A single MRP on Standby Instance 1 applies the redo.]


Single-Instance Redo Apply (SIRA)
• The MRP and its redo apply servers run on one node of a Physical Standby RAC.
• Single-Instance Redo apply performance generally meets all use cases.
• Before considering Multi-Instance apply, make sure you apply the best practices for Redo Apply.

[Chart: Standby Apply Rate in MB/sec (axis from 0 to 800) for an OLTP Workload]


Multi-Instance Redo Apply

[Diagram: at the Primary Site, Primary Instances 1, 2 and 3 each ship their redo thread (Thread 1, 2, 3) through NFS/TT to an RFS process at the Secondary Site, which writes the standby redo logs (SRL). Standby Instance 1 runs the coordinator MRP; Instance 2 and Instance 3 each run an MRP as well.]


Multi-Instance Redo Apply

• Utilizes all RAC nodes on the Standby database to parallelize recovery
• OLTP workloads on Exadata show great scalability
• Generally 30% improvement or more, depending on the workload

Standby Apply Rate (MB/sec)

Workload   1 Instance   2 Instances   4 Instances   8 Instances
OLTP       190          380           740           1480
Batch      700          1400          2752          5000
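To put the chart into perspective, the following Python sketch computes the speedup and the scaling efficiency relative to a single apply instance. The rates are the values read from the chart labels, and assigning them to the OLTP and Batch series is an interpretation of the slide, not measured data; the function name is hypothetical.

```python
# Apply rates in MB/sec as read from the chart labels (assumption: the two
# series are OLTP and Batch, keyed by the number of apply instances).
APPLY_RATE = {
    "OLTP": {1: 190, 2: 380, 4: 740, 8: 1480},
    "Batch": {1: 700, 2: 1400, 4: 2752, 8: 5000},
}


def scaling(workload: str) -> dict:
    """Return {instances: (speedup, efficiency)} relative to one instance."""
    rates = APPLY_RATE[workload]
    base = rates[1]
    return {n: (round(rate / base, 2), round(rate / base / n, 2))
            for n, rate in rates.items()}


for name in APPLY_RATE:
    print(name, scaling(name))
```

With these values, eight instances reach roughly 97% scaling efficiency for OLTP and roughly 89% for Batch.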


When to consider Multi-Instance Redo Apply

• IO bottlenecks and database wait events affecting SIRA will affect MIRA as well!

• Consider MIRA only if SIRA cannot meet the SLA

• We recommend Oracle Database 19.13 or higher (it includes critical fixes for MIRA)

• A CPU-bound apply coordinator or worker is the best indicator that MIRA is needed

• Oracle Data Guard Configuration Best Practices
  https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/configure-and-deploy-oracle-data-guard.html#GUID-97769612-4980-42C2-A28C-4C5E49FE2824

• Redo Apply Troubleshooting and Tuning
  https://docs.oracle.com/en/database/oracle/oracle-database/19/haovw/tune-and-troubleshoot-oracle-data-guard.html#GUID-E8C27979-9D37-4899-9306-A5AE2B5CF6C0


How to Enable Multi-Instance Redo Apply

With the Data Guard Broker:


DGMGRL> edit database <standby> set property ApplyInstances=<#|ALL>;

From SQL*Plus:
SQL> alter database recover managed standby database disconnect from session instances <#|ALL>;

Entries in the alert log:


ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION INSTANCES ALL
2018-05-23T11:37:09.937690+01:00
Attempt to start background Managed Standby Recovery process (<db_unique_name>)

2018-05-23T11:37:15.111518+01:00
Started logmerger process on instance id 1
Started logmerger process on instance id 2
Starting Multi Instance Redo Apply (MIRA) on 2 instances

2018-05-23T11:37:16.027775+01:00
Started 24 apply slaves on instance id 1
2018-05-23T11:37:16.545221+01:00
Started 24 apply slaves on instance id 2

Multi-Instance Redo Apply on Exadata

• Exadata prerequisites to enable MIRA

Exadata Systems | RDBMS version | Steps
With PMEM | 19.13 and higher | No additional steps
Without PMEM | 19.13 and higher | Set dynamic parameter on all instances: "_cache_fusion_pipelined_updates_enable"=FALSE (*)
Any Exadata System | 19.12 and lower | Apply Patch 31962730 and set dynamic parameter on all instances: "_cache_fusion_pipelined_updates_enable"=FALSE (*)

(*) MIRA can recover only redo generated with “_cache_fusion_pipelined_updates_enable” set to FALSE

• Using ExaWatcher Charts to Monitor Exadata Database Machine Performance
  https://docs.oracle.com/en/engineered-systems/exadata-database-machine/dbmmn/exadata-general-maintenance.html#GUID-5AEB3139-333D-453F-91D6-8EB09CB6E6EB


Tuning Multi-Instance Redo Apply

• Tune Redo Apply by evaluating Database wait events

• If recovery apply pending and/or recovery receive buffer free are among the top wait events:
  • Incrementally increase _mira_num_receive_buffers and _mira_num_local_buffers by 100
  • Additionally set “_mira_rcv_max_buffers”=10000
  • The additional memory requirements for each participating MIRA RAC instance:
    (_mira_num_receive_buffers + _mira_num_local_buffers) * (#instances * 2MB)

• If parallel recovery change buffer free is among the top wait events:
  • Increase _change_vector_buffers to 2 or 4.
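
The memory formula above can be checked with a short worked example. The Python sketch below is a direct transcription of the formula; the function name and the example buffer counts are hypothetical.

```python
def mira_extra_memory_mb(receive_buffers: int, local_buffers: int, instances: int) -> int:
    """(_mira_num_receive_buffers + _mira_num_local_buffers) * (#instances * 2MB).

    Returns the additional memory in MB for each participating MIRA RAC instance.
    """
    return (receive_buffers + local_buffers) * (instances * 2)


# Example values (assumptions): 4 apply instances, buffers raised in steps of 100.
for buffers in (100, 200, 300):
    extra = mira_extra_memory_mb(buffers, buffers, instances=4)
    print(f"{buffers} + {buffers} buffers on 4 instances: {extra} MB per instance")
```

Each step of 100 on both parameters adds 1600 MB per instance in a four-instance configuration, so the increments are worth sizing before applying them.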



Oracle Active Data Guard Overview



Oracle Active Data Guard (ADG)

• Advanced Disaster Recovery
• Active-active*
  - Queries, reports, backups
  - Occasional updates (19c)
  - Assurance of knowing system is operational
• Automatic block repair
• Application Continuity
• Zero data loss across any distance
• Many other features

[Diagram: Primary Site and Secondary Site managed by the Data Guard Broker (Enterprise Manager Cloud Control or DGMGRL). Labels: DML Redirection, Zero data loss at any distance, Rolling Database Upgrades, Automatic Block Repair, Offload read-mostly workload to open standby database, Offload fast incremental backups]

https://www.oracle.com/database/technologies/high-availability/dataguard-activedataguard-demos.html

ADG
Active Data Guard
Option of Oracle Database for Advanced Capabilities and Protection

Data Protection
• Zero data loss at any distance
• Real-time cascade

High Availability
• Automatic block repair
• Automated rolling database maintenance
• Application continuity
• Service management for replicated databases

Performance and ROI
• Extreme throughput - supports all workloads
• Dual-purpose standby for development and test
• Integrated management
• Offload network compression
• Intelligent load balancing for replicated databases
• DML redirection

[Icons: Automatic Block Repair, Rolling Upgrade, Active Standby]


Oracle Active Data Guard
Real-Time Query



ADG
Real-Time Query
Read-only Standby while Recovery is Active

Activation

With Data Guard Broker:
SQL> ALTER DATABASE OPEN;

Without Data Guard Broker:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE OPEN;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

[Diagram: Primary Site with read-write (RW) sessions, Secondary Site with read-only (RO) sessions]


ADG
Real-Time Query Apply Lag Limit
Full read consistency at the session level

The primary never waits

-- log transport to the standby must be synchronous for remote apply
DGMGRL> edit database prim set property LogXptMode='SYNC';

-- write on the primary
insert into emp values (...);
commit;

-- read on the standby
-- wait once until current SCN is applied
alter session sync with primary;

-- or always have READ COMMITTED in the session
alter session set standby_max_data_delay=0;

select first_name from emp where ...;

[Diagram: the Primary ships redo in SYNC mode to the standby redo logs (SRL) of the Standby]
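The effect of the session-level lag limit can be pictured with a tiny model. The Python sketch below assumes that a standby query is rejected when the apply lag exceeds the session's standby_max_data_delay, and that leaving the parameter unset means no limit; the function name is hypothetical and the sketch does not talk to a database.

```python
from typing import Optional


def query_allowed(apply_lag_seconds: float, standby_max_data_delay: Optional[float]) -> bool:
    """True if a standby query may run under the session's lag tolerance.

    standby_max_data_delay=None models a session without a limit.
    """
    if standby_max_data_delay is None:
        return True
    return apply_lag_seconds <= standby_max_data_delay


print(query_allowed(apply_lag_seconds=4, standby_max_data_delay=None))  # no limit set
print(query_allowed(apply_lag_seconds=4, standby_max_data_delay=10))    # within tolerance
print(query_allowed(apply_lag_seconds=4, standby_max_data_delay=0))     # zero tolerance
```

A tolerance of 0 only passes when the standby has no apply lag at all, which is why the slide pairs it with synchronous redo transport.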


ADG
Offload Read-Only Workloads
Increase Performance and ROI – Standby is a Production System

Any read-only workload

Data extracts and backups


Production Offload to
Active Data Guard Standby
EBS - Oracle Reports
PeopleSoft - PeopleTools
Siebel CRM
OBIEE, Hyperion, TopLink



ADG
Standby Offload Increases Performance for all Workloads
Bring Idle Capacity Online

• Double read-write throughput
• Increase read-only throughput by 70%
• Eliminate contention between read-write and read-only workloads

TPS                Primary Only   Primary and Standby
Read-only (R/O)    1,530          2,610 (on the standby)
Read-write (R/W)   290            630 (on the primary)

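The two headline claims can be checked against the chart values. The Python sketch below uses the TPS numbers read from the chart labels; mapping them to the read-only and read-write series is an interpretation of the slide.

```python
# TPS values as read from the chart labels (an interpretation, not new data).
primary_only = {"read_only": 1530, "read_write": 290}
primary_and_standby = {"read_only": 2610, "read_write": 630}

# Factor by which read-write throughput grows once reads move to the standby.
rw_factor = primary_and_standby["read_write"] / primary_only["read_write"]
# Percentage gain in read-only throughput.
ro_gain_pct = (primary_and_standby["read_only"] / primary_only["read_only"] - 1) * 100

print(f"read-write throughput: x{rw_factor:.2f}")
print(f"read-only throughput: +{ro_gain_pct:.1f}%")
```

Both results agree with the slide: read-write throughput slightly more than doubles and read-only throughput grows by about 70%.
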
ADG
Real-Time Query
Not just Selects for your Application Workloads!

• SQL Performance Analyzer
• Sequences
• Oracle Database In-Memory
• Updates on ADG (DML Redirect): NEW in 19c
• Global Temporary Tables
• Standby Result Cache preservation: NEW in 21c
• R/O Connections Preserved


Oracle Active Data Guard
DML Redirection



ADG
Bigger Footprint of ADG Applications
DML on Active Data Guard

DML redirection is automatically performed from an Active Data Guard standby to the primary without compromising ACID compliance

• New documented parameter ADG_REDIRECT_DML controls DML Redirection
• New alter system set ADG_REDIRECT_DML | alter session enable ADG_REDIRECT_DML
• New ADG_REDIRECT_PLSQL commands

Supported with Oracle Database 19c

Targeted for “Read-Mostly, Occasional Updates” applications


ADG
DML Redirection
Easy and ready to use

By default DMLs are not possible on the standby


SQL> update hr.employee set salary=salary+100 where employee_id=1;
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access

Enable DML redirection


SQL> alter session enable ADG_REDIRECT_DML;

DMLs work seamlessly


SQL> update hr.employee set salary=salary+100 where employee_id=1;
1 row updated.
SQL> commit;
Commit complete.



Oracle Active Data Guard
Automatic Block Repair



ADG
Oracle Active Data Guard Automatic Block Repair
Transparently repairs corrupted blocks

• Oracle detects if a block is corrupted when reading it

• The corruption is automatically repaired using a good copy


• From the standby when the corruption is on the primary
• From the primary when the corruption is on the standby



Oracle Active Data Guard
Fast Incremental Backup on Physical Standby



Fast Incremental Backup on Physical Standby
Enable the Block Change Tracking to speed up backups and avoid unnecessary I/O

No Block Change Tracking on the Standby:
Incremental backups require full data file reads

[Diagram: Primary with BCT; Standby without BCT]


ADG
Fast Incremental Backup on Physical Standby
Enable the Block Change Tracking to speed up backups and avoid unnecessary I/O

Block Change Tracking on the Standby:
Incremental backups read only the blocks modified since the last Level 0
Requires Active Data Guard

[Diagram: Primary with BCT; Standby with BCT]


Oracle Active Data Guard
Real-Time Cascade Standbys



Active Data Guard: up to 30 direct standbys and 253 total members
Far Sync and Cascading Standby open endless possibilities

• Disaster Recovery
• Query & Reporting Offload
• DML Redirection
• Snapshot Standby for tests
• Rolling Maintenance
• Rolling Upgrade
• Migration to new Hardware
• Source for thin clones
• Backup Offload
• GoldenGate Extract Offload
• Recovery Appliance
• Zero data loss at any distance

[Diagram: Primary surrounded by standbys used for Backup, Snapshot Standby, DML Redirect, Far Sync and Thin Clones]

ADG
Real-Time Cascade Standby
Offload multiple redo transports to a first-level standby

BOSTON
RedoRoutes=
(LOCAL : ( NASHUA SYNC PRIORITY=1, NEWYORK ASYNC PRIORITY=8, NEWARK ASYNC PRIORITY=8 ))
(NASHUA : NEWYORK ASYNC, NEWARK ASYNC)

NASHUA
RedoRoutes=
(LOCAL : ( BOSTON SYNC PRIORITY=1, NEWYORK ASYNC PRIORITY=8, NEWARK ASYNC PRIORITY=8 ))
(BOSTON : NEWYORK ASYNC, NEWARK ASYNC)

[Diagram: BOSTON, NASHUA, NEWYORK and NEWARK, with a route labeled ALTERNATE]

• Explicit “ASYNC” in the cascading member means “Real-Time Cascade”. Such a configuration requires Active Data Guard.
• If not specified, the redo is shipped at log switch.


ADG
Real-Time Cascade Standby
Offload multiple redo transports to a first-level standby

BOSTON
RedoRoutes=
(LOCAL : ( NASHUA SYNC PRIORITY=1, NEWYORK ASYNC PRIORITY=8, NEWARK ASYNC PRIORITY=8 ))
(NASHUA : NEWYORK ASYNC, NEWARK ASYNC)

NASHUA
RedoRoutes=
(LOCAL : ( BOSTON SYNC PRIORITY=1, NEWYORK ASYNC PRIORITY=8, NEWARK ASYNC PRIORITY=8 ))
(BOSTON : NEWYORK ASYNC, NEWARK ASYNC)

[Diagram: BOSTON, NASHUA, NEWYORK and NEWARK, with a route labeled ALTERNATE]

• RedoRoutes is an efficient way to route the redo properly in any situation.
• The Broker takes care of all the complex LOG_ARCHIVE_DEST_n modifications.


Oracle Active Data Guard Far Sync



ADG
The Zero Data Loss Challenge
Trade-off Performance for Protection

ASYNC

PRIMARY STANDBY

SYNC is not an option with high network latency



ADG
Active Data Guard Far Sync
Zero Data Loss Protection at Any Distance

[Diagram: the PRIMARY ships SYNC to the FAR SYNC instance, which forwards ASYNC to the STANDBY with optional RedoCompression (*)]

* Requires Advanced Compression Option


ADG
Active Data Guard Far Sync
Trade-off Performance for Protection
Far Sync
• Requires ADG Option on the DB nodes
• The Far Sync nodes do not require licenses *
• Special instance:
  • No datafiles
  • No Media Recovery
  • Only control files, archives and standby logs
• Up to 30 direct destinations
• Offload transport compression (Advanced Compression)
• Supports FSFO in MaxAvailability
• Supports FSFO in MaxPerformance (new in 21c)

Use different Datacenters or Availability Domains!
• Upon failover, the standby will fetch the very last redo from the Far Sync

[Diagram: Primary Site, Primary Nearby Site (Far Sync), Standby Site and Standby Nearby Site (Far Sync); the primary ships SYNC to a Far Sync instance, which forwards ASYNC to the standby]

* The Far Sync nodes do not require any license, provided that all the other nodes running databases in the configuration are licensed with Enterprise Edition and Active Data Guard


ADG
Active Data Guard Far Sync
Use RedoRoutes for Far Sync High Availability

BOSTON
RedoRoutes=
(LOCAL : ( FS1 SYNC PRIORITY=1, FS2 SYNC PRIORITY=1, LONDON ASYNC PRIORITY=2 ))

FS1
RedoRoutes=
(BOSTON : LONDON ASYNC)

FS2
RedoRoutes=
(BOSTON : LONDON ASYNC)

[Diagram: BOSTON ships SYNC to the Far Sync instances FS1 and FS2 (P1), which forward ASYNC to LONDON; the direct ASYNC route from BOSTON to LONDON is the ALTERNATE (P2)]

In the example: Prepare two additional Far Sync instances for LONDON!
The Data Guard broker supports Far Sync instance creation starting with 21c.


ADG
Benefits and Downsides of Far Sync
When to consider Far Sync?

Benefits
• Increased performance for existing Sync configurations
• Increased protection for existing Async configurations
• Zero Data Loss (Max Availability) across distant regions
• Fully integrated with the broker
• Automatic gap resolution through the Far Sync

Downsides
• Additional server(s) or VM(s) and components
• Local Far Sync instance might not prevent data loss in case of full site failure


Far Sync and Fast Start Failover
Which Fast Start Failover protection modes are compatible with Far Sync?

FSFO and FAR SYNC       Maximum Performance   Maximum Availability   Maximum Protection
ASYNC                   ✓ (21c+)              ✘                      ✘
FAST SYNC               ✘                     ✓                      ✘
SYNC                    ✘                     ✓                      ✘

FSFO without FAR SYNC   Maximum Performance   Maximum Availability   Maximum Protection
ASYNC                   ✓                     ✘                      ✘
FAST SYNC               ✘                     ✓                      ✘
SYNC                    ✘                     ✓                      ✓

FAR SYNC without FSFO   Maximum Performance   Maximum Availability   Maximum Protection
ASYNC                   ✓                     ✘                      ✘
FAST SYNC               ✘                     ✓                      ✘
SYNC                    ✘                     ✓                      ✘

https://docs.oracle.com/en/database/oracle/oracle-database/21/dgbkr/using-data-guard-broker-to-manage-switchovers-failovers.html#GUID-7423C774-27DF-49F9-BB43-7D547BCE7762

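The two FSFO tables can also be read as a lookup. The Python sketch below encodes only the combinations marked as supported in the tables; the function name, the string labels and the `release` argument are hypothetical, and the sketch is not a substitute for the broker's own validation.

```python
# (transport, protection mode) pairs marked as supported for FSFO in the tables.
WITH_FAR_SYNC = {
    ("ASYNC", "MaxPerformance"),
    ("FASTSYNC", "MaxAvailability"),
    ("SYNC", "MaxAvailability"),
}
WITHOUT_FAR_SYNC = WITH_FAR_SYNC | {("SYNC", "MaxProtection")}


def fsfo_supported(transport: str, mode: str, far_sync: bool, release: int = 21) -> bool:
    """Look up an FSFO combination; `release` is the major database release."""
    combo = (transport, mode)
    if not far_sync:
        return combo in WITHOUT_FAR_SYNC
    if combo == ("ASYNC", "MaxPerformance"):
        return release >= 21  # marked "21c+" in the table
    return combo in WITH_FAR_SYNC


print(fsfo_supported("SYNC", "MaxProtection", far_sync=False))
print(fsfo_supported("SYNC", "MaxProtection", far_sync=True))
print(fsfo_supported("ASYNC", "MaxPerformance", far_sync=True, release=19))
```
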
Oracle Active Data Guard
Rolling Maintenance and Upgrades



Solutions for Database Rolling Maintenance and Upgrades

Manual                       DBMS_ROLLING                    GoldenGate
Part of Enterprise Edition   Requires Active Data Guard      Requires GoldenGate
Source >= 11.1.0.7           Source >= 12.1.0.2              Source >= 11.2.0.4 (for OCI GG)
Manual approach              Automated                       Manual approach
Limited feature support      Comprehensive feature support   Best feature support

Fallback mechanism

Using SQL Apply to Upgrade the Oracle Database
https://docs.oracle.com/en/database/oracle/oracle-database/19/sbydb/using-sql-apply-to-perform-rolling-upgrade.html

Using DBMS_ROLLING to Perform a Rolling Upgrade
https://docs.oracle.com/en/database/oracle/oracle-database/19/sbydb/using-DBMS_ROLLING-to-perform-rolling-upgrade.html

Overview of Steps for Upgrading Oracle Database Using Oracle GoldenGate
https://docs.oracle.com/en/database/oracle/oracle-database/19/upgrd/converting-databases-upgrades.html#GUID-8E029631-8265-497C-983B-B8A4ACD47B98


ADG
Active Data Guard Rolling Maintenance and Upgrades
Using DBMS_ROLLING package

[Diagram: while the users access the PRIMARY, upgrade the TRANSIENT LOGICAL STANDBY, then switch over]

• Use a transient logical standby database to upgrade with very little downtime.
• The only downtime is the time it takes to perform a switchover.


The DBMS_ROLLING.INIT_PLAN phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- check DBA_ROLLING_UNSUPPORTED for incompatible data types
-- initialize the plan and set the future primary
DBMS_ROLLING.INIT_PLAN(future_primary=>'DB3');

-- add the required standbys to the TRAILING GROUP
DBMS_ROLLING.SET_PARAMETER('DB1','MEMBER','TRAILING');
-- add the required standbys to the LEADING GROUP
DBMS_ROLLING.SET_PARAMETER('DB4','MEMBER','LEADING');

[Diagram: Trailing Group with DB2 (Trailing Group Master, TGM) and DB1 (Trailing Group Standby, TGS); Leading Group with DB3 (Leading Group Master, LGM) and DB4 (Leading Group Standby, LGS); the members are connected by redo transport. Legend: User sessions, Upgraded (+1), Primary, Physical Standby, Logical Standby]


The DBMS_ROLLING parameters
INIT BUILD START UPGRADE SWITCHOVER FINISH

ACTIVE_SESSIONS_TIMEOUT MEMBER
ACTIVE_SESSIONS_WAIT READY_LGM_LAG_TIME
BACKUP_CONTROLFILE READY_LGM_LAG_TIMEOUT
DGBROKER READY_LGM_LAG_WAIT
DICTIONARY_LOAD_TIMEOUT SWITCH_LGM_LAG_TIME
DICTIONARY_LOAD_WAIT SWITCH_LGM_LAG_TIMEOUT
DICTIONARY_PLS_WAIT_INIT SWITCH_LGM_LAG_WAIT
DICTIONARY_PLS_WAIT_TIMEOUT SWITCH_LGS_LAG_TIME
EVENT_RECORDS SWITCH_LGS_LAG_TIMEOUT
FAILOVER SWITCH_LGS_LAG_WAIT
GRP_PREFIX UPDATED_LGS_TIMEOUT
IGNORE_BUILD_WARNINGS UPDATED_LGS_WAIT
IGNORE_LAST_ERROR UPDATED_TGS_TIMEOUT
LAD_ENABLED_TIMEOUT UPDATED_TGS_WAIT
LOG_LEVEL



The DBMS_ROLLING parameters
INIT BUILD START UPGRADE SWITCHOVER FINISH

Example:

-- Activate full logging


exec DBMS_ROLLING.SET_PARAMETER (scope=>null, name=>'LOG_LEVEL', value=>'FULL');

-- Wait for the SQL Apply Lag to go below 1 minute before initiating the switchover
exec DBMS_ROLLING.SET_PARAMETER('SWITCH_LGM_LAG_WAIT', '1');
exec DBMS_ROLLING.SET_PARAMETER('SWITCH_LGM_LAG_TIME', '60');



Final touches before starting
INIT BUILD START UPGRADE SWITCHOVER FINISH

$ # The standby must be mounted


$ srvctl stop database -d DB3
$ srvctl start database -d DB3 -o mount

SQL> -- The PDBs must be open


SQL> alter pluggable database all open;

DGMGRL> # no FSFO or MaxProtection


DGMGRL> disable fast_start failover
DGMGRL> edit configuration set protection mode as MaxAvailability;



The DBMS_ROLLING.BUILD_PLAN phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- build the plan
DBMS_ROLLING.BUILD_PLAN();

-- check for any errors or warnings
SELECT * FROM DBA_ROLLING_EVENTS;

-- review the plan
SELECT * FROM DBA_ROLLING_PLAN ORDER BY INSTID;

[Diagram: DB1, DB2, DB3 and DB4 linked by REDO transport. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


ADG
The DBMS_ROLLING.BUILD_PLAN phase
1 START Notify Data Guard broker that DBMS_ROLLING has started 44 START Log pre-switchover instructions to events table
2 START Notify Data Guard broker that DBMS_ROLLING has started 45 START Record start of user upgrade of DB3
3 START Verify database is a primary 46 SWITCH Verify database is in OPENRW mode
4 START Verify MAXIMUM PROTECTION is disabled 47 SWITCH Record completion of user upgrade of DB3
5 START Verify database is a physical standby 48 SWITCH Scan LADs for presence of DB2 destination
6 START Verify physical standby is mounted 49 SWITCH Test if DB2 is reachable using configured TNS service
7 START Verify future primary is configured with standby redo logs 50 SWITCH Call Data Guard broker to enable redo transport to DB3
8 START Verify server parameter file exists and is modifiable 51 SWITCH Archive all current online redo logs
9 START Verify server parameter file exists and is modifiable 52 SWITCH Archive all current online redo logs
10 START Verify Data Guard broker configuration is enabled 53 SWITCH Stop logical standby apply
11 START Verify Data Guard broker configuration is enabled 54 SWITCH Start logical standby apply
12 START Verify Fast-Start Failover is disabled 55 SWITCH Wait until apply lag has fallen below 600 seconds
13 START Verify Fast-Start Failover is disabled 56 SWITCH Notify Data Guard broker that switchover to logical standby database is starting
14 START Verify fast recovery area is configured 57 SWITCH Log post-switchover instructions to events table
15 START Verify available flashback restore points 58 SWITCH Switch database to a logical standby
16 START Verify fast recovery area is configured 59 SWITCH Notify Data Guard broker that switchover to logical standby database has completed
17 START Verify available flashback restore points 60 SWITCH Wait until end-of-redo has been applied
18 START Stop media recovery 61 SWITCH Archive all current online redo logs
19 START Drop guaranteed restore point DBMSRU_INITIAL 62 SWITCH Notify Data Guard broker that switchover to primary is starting
20 START Create guaranteed restore point DBMSRU_INITIAL 63 SWITCH Switch database to a primary
21 START Drop guaranteed restore point DBMSRU_INITIAL 64 SWITCH Notify Data Guard broker that switchover to primary has completed
22 START Create guaranteed restore point DBMSRU_INITIAL 65 SWITCH Enable compatibility advance despite presence of GRPs
23 START Start media recovery 66 SWITCH Synchronize plan with new primary
24 START Verify media recovery is running 67 FINISH Reduce to a single instance for FINISH
25 START Verify user_dump_dest has been specified 68 FINISH Verify only a single instance is active
26 START Backup control file to rolling_change_backup.f 69 FINISH Verify database is mounted
27 START Verify user_dump_dest has been specified 70 FINISH Flashback database
28 START Backup control file to rolling_change_backup.f 71 FINISH Convert into a physical standby
29 START Get current supplemental logging on the primary database 72 FINISH Verify database is open
30 START Get current redo branch of the primary database 73 FINISH Save the DBID of the new primary
31 START Wait until recovery is active on the primary's redo branch 74 FINISH Save the logminer session start scn
32 START Reduce to a single instance if database is a RAC 75 FINISH Wait until transient logical redo branch has been registered
33 START Verify only a single instance is active if future primary is RAC 76 FINISH Start media recovery
34 START Stop media recovery 77 FINISH Wait until apply/recovery has started on the transient branch
35 START Execute dbms_logstdby.build 78 FINISH Wait until upgrade redo has been fully recovered
36 START Convert into a transient logical standby 79 FINISH Prevent compatibility advance if GRPs are present
37 START Open database including instance-peers if RAC 80 FINISH Prevent compatibility advance if GRPs are present
38 START Verify logical standby is open read/write 81 FINISH Drop guaranteed restore point DBMSRU_INITIAL
39 START Get redo branch of transient logical standby 82 FINISH Drop guaranteed restore point DBMSRU_INITIAL
40 START Get reset scn of transient logical redo branch 83 FINISH Purge logical standby metadata from database if necessary
41 START Configure logical standby parameters 84 FINISH Notify Data Guard broker that DBMS_ROLLING has finished
42 START Start logical standby apply 85 FINISH Notify Data Guard broker that DBMS_ROLLING has finished
43 START Enable compatibility advance despite presence of GRPs 86 FINISH Restore Supplemental Logging
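
To get a feel for how the 86 instructions are distributed across phases, the two-column listing above can be parsed into records. This is only an illustrative Python sketch working on text copied from the slide; on a real system the same information would come from querying DBA_ROLLING_PLAN. The function names are hypothetical.

```python
import re
from collections import Counter

# One entry is "<number> <phase> <description>"; the slide prints two per line.
ENTRY = re.compile(
    r"(\d+) (START|SWITCH|FINISH) (.*?)(?= \d+ (?:START|SWITCH|FINISH) |$)"
)


def parse_plan_line(line):
    """Split one slide line into (instruction id, phase, description) tuples."""
    return [(int(n), phase, text) for n, phase, text in ENTRY.findall(line.strip())]


def count_by_phase(lines):
    """Count the instructions per phase over several slide lines."""
    counts = Counter()
    for line in lines:
        for _, phase, _ in parse_plan_line(line):
            counts[phase] += 1
    return dict(counts)


SAMPLE = [
    "3 START Verify database is a primary 46 SWITCH Verify database is in OPENRW mode",
    "24 START Verify media recovery is running 67 FINISH Reduce to a single instance for FINISH",
]

if __name__ == "__main__":
    for line in SAMPLE:
        print(parse_plan_line(line))
    print(count_by_phase(SAMPLE))
```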



The DBMS_ROLLING.START phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- start the plan
DBMS_ROLLING.START_PLAN();

• Creates the Guaranteed Restore Point (GRP)
• Builds the logical standby metadata (dbms_logstdby.build)

[Diagram: DB1, DB2, DB3 and DB4 each hold a GRP; a Logstdby marker shows the logical standby build; the databases are linked by REDO transport. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


The DBMS_ROLLING.START phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- start the plan
DBMS_ROLLING.START_PLAN();

• Creates the Guaranteed Restore Point (GRP)
• Builds the LogMiner dictionary (dbms_logstdby.build)
• Converts the LGM to Logical Standby
• Starts SQL Apply
• With a configuration composed of 4 databases, the LGM and TGM are still protected by a physical standby

[Diagram: DB1, DB2, DB3 and DB4 each hold a GRP; one of the REDO links is now labelled SQL (SQL Apply). Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


The DBMS_ROLLING.START phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

DGMGRL> show configuration;

Configuration - geneva

  Protection Mode: MaxAvailability
  Members:
  DB1 - Primary database
    DB3 - Physical standby database
      Warning: ORA-16854: apply lag could not be determined

Fast-Start Failover: DISABLED

Configuration Status:
ROLLING DATABASE MAINTENANCE IN PROGRESS

[Diagram: unchanged; DB1 to DB4 each hold a GRP, with REDO and SQL links between them.]


The DBMS_ROLLING.START phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

DGMGRL> show database DB3
...
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds (computed 0 seconds ago)
  Apply Lag:       3 minutes 18 seconds (computed 0 seconds ago)
...
  Database Warning(s):
    ORA-16866: database converted to transient logical standby database for rolling database maintenance

Database Status:
WARNING

[Diagram: unchanged; DB1 to DB4 each hold a GRP, with REDO and SQL links between them.]


The DBMS_ROLLING.START phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- check the status of SQL Apply
SQL> select * from V$LOGSTDBY_PROGRESS;

-- use the SQL Apply commands if needed
SQL> alter database start logical standby apply immediate;

-- check for logical standby error messages
SQL> select * from DBA_LOGSTDBY_EVENTS
  2> order by event_timestamp;

22-NOV-21 06.41.12 DML on "AUDSYS"."AUD$UNIFIED"
                   ORA-16129: unsupported DML encountered
22-NOV-21 06.41.13 truncate table wri$_adv_addm_pdbs
                   ORA-16247: DDL skipped on internal schema

[Diagram: unchanged; DB1 to DB4 each hold a GRP, with REDO and SQL links between them.]


The Upgrade/Maintenance phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

• Do the maintenance on the Leading Group Master
  -- e.g. upgrade to a major version with AutoUpgrade
  $ java -jar autoupgrade.jar -config CDB1.cfg -mode deploy
• This is out of DBMS_ROLLING scope (it is a manual step)
• Don't forget to align the Leading Group Standbys if necessary
• Use it for any major maintenance that requires longer downtimes (change of physical layout, structure changes, offline operations)

[Diagram: DB1 to DB4 each hold a GRP; DB3 and DB4 carry the +1 (upgraded) mark. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


The DBMS_ROLLING.SWITCHOVER phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- switchover to the upgraded database
DBMS_ROLLING.SWITCHOVER();

• Depending on the source version and HA configuration, the old connections get FAN notifications and drain automatically
• New connections go to the new primary. Application downtime is minimal.

[Diagram: DB1 to DB4 each hold a GRP; DB3 and DB4 carry the +1 (upgraded) mark. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


The DBMS_ROLLING.FINISH_PLAN phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

• Start the Trailing Group members with the new binaries (manual)

-- run the final part of the plan
DBMS_ROLLING.FINISH_PLAN();

• Flashes back the Trailing Group Master and Standby to the GRP

[Diagram: DB1 to DB4 each hold a GRP; DB3 and DB4 carry the +1 (upgraded) mark. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


The DBMS_ROLLING.FINISH_PLAN phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

• Start the Trailing Group members with the new binaries (manual)

-- run the final part of the plan
DBMS_ROLLING.FINISH_PLAN();

• Flashes back the Trailing Group Master and Standby to the GRP
• Converts the Trailing Group Master to a physical standby
• Starts redo apply and catches up with the primary
• Drops the guaranteed restore points and logical standby metadata

[Diagram: DB1 to DB4 each hold a GRP and all four now carry the +1 (upgraded) mark; the databases are linked by REDO transport. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]


The DBMS_ROLLING.DESTROY_PLAN phase
INIT BUILD START UPGRADE SWITCHOVER FINISH

-- destroy the plan to clean up everything
DBMS_ROLLING.DESTROY_PLAN();

[Diagram: DB1 to DB4 all carry the +1 (upgraded) mark and no longer hold a GRP; the databases are linked by REDO transport. Legend: user sessions, +1 upgraded, primary, physical standby, logical standby.]
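
The slides above walk through the calls one at a time. As a recap, here is a small Python sketch that prints them in order together with the two manual steps. It is only a checklist generator under the assumptions of the example (DB3 as future primary); the `STEPS` structure and `render_runbook` are hypothetical names, and nothing here connects to a database.

```python
# (phase, action, manual?) in the order shown on the slides.
STEPS = [
    ("INIT", "DBMS_ROLLING.INIT_PLAN(future_primary=>'DB3');", False),
    ("BUILD", "DBMS_ROLLING.BUILD_PLAN();", False),
    ("START", "DBMS_ROLLING.START_PLAN();", False),
    ("UPGRADE", "Upgrade or maintain the Leading Group Master", True),
    ("SWITCHOVER", "DBMS_ROLLING.SWITCHOVER();", False),
    ("FINISH", "Start the Trailing Group members with the new binaries", True),
    ("FINISH", "DBMS_ROLLING.FINISH_PLAN();", False),
    ("FINISH", "DBMS_ROLLING.DESTROY_PLAN();", False),
]


def render_runbook(steps=STEPS):
    """Return numbered checklist lines, flagging manual steps."""
    lines = []
    for number, (phase, action, manual) in enumerate(steps, start=1):
        kind = "manual" if manual else "exec"
        lines.append(f"{number}. [{phase}] ({kind}) {action}")
    return lines


if __name__ == "__main__":
    print("\n".join(render_runbook()))
```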


ADG
DBMS_ROLLING catalog views

Stage       View                      Notes
Evaluate    DBA_ROLLING_UNSUPPORTED   Check here for unsupported data types!
Initialize  DBA_ROLLING_PARAMETERS    Get the current parameters before building
Build       DBA_ROLLING_DATABASES
            DBA_ROLLING_PLAN          Verify the plan before and during the execution
Monitor     DBA_ROLLING_EVENTS        Warnings and errors are visible here
            DBA_ROLLING_STATISTICS
            DBA_ROLLING_STATUS


DBMS_ROLLING points of attention

Do not create the logical standby on the same server as the primary database

Supplemental logging is enabled automatically, which introduces an overhead and increases the amount of redo generated

When supplemental logging is enabled all DML cursors are invalidated

Not all data types and partitioning types are supported

For optimal performance all tables should have primary keys or unique keys



ADG
Important DBMS_ROLLING milestones
The driver is the SOURCE database!

Source version 12.1
• First version of DBMS_ROLLING for upgrades from 12.1 to higher versions

Source version 12.2
• Integration with the Data Guard broker
• Services, role changes, and instances are managed automatically by Clusterware
• FAN events for Clusterware-backed databases
• Support for Identity columns

Source version 21c
• FAN events without Clusterware
• Support for JSON datatype


ADG
DBMS_ROLLING and client failover

DBMS_ROLLING.SWITCHOVER   Broker + OCW           Broker Only
12.1                      Broker Not supported   Broker Not supported
12.2                      FAN events             No FAN events
19c                       FAN events             No FAN events
21c                       FAN events             FAN events

AC/TAC support is in the roadmap
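
The table can also be read as a simple lookup. The Python sketch below encodes exactly the four rows above so that a pre-upgrade checklist can state what to expect for a given source version; `fan_events_expected` is a hypothetical helper name and `None` stands for "broker not supported".

```python
# source version -> (FAN events with Broker + OCW, FAN events with broker only)
# None means the broker is not supported for that source version.
FAN_EVENTS = {
    "12.1": (None, None),
    "12.2": (True, False),
    "19c": (True, False),
    "21c": (True, True),
}


def fan_events_expected(source_version, clusterware):
    """Return True/False for FAN events, or None if the broker is unsupported."""
    with_ocw, broker_only = FAN_EVENTS[source_version]
    return with_ocw if clusterware else broker_only


if __name__ == "__main__":
    for version in FAN_EVENTS:
        print(version, fan_events_expected(version, clusterware=False))
```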



ADG
DBMS_ROLLING – Read More

Using DBMS_ROLLING to Perform a Rolling Upgrade
https://docs.oracle.com/en/database/oracle/oracle-database/19/sbydb/using-DBMS_ROLLING-to-perform-rolling-upgrade.html

DBMS_ROLLING - PL/SQL Packages and Types Reference
https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_ROLLING.html#GUID-097F1B39-E623-43B5-BA30-DF377BFE05CF

Automated Database Upgrades using Oracle Active Data Guard and DBMS_ROLLING
https://www.oracle.com/technetwork/database/availability/database-upgrade-dbms-rolling-4126957.pdf

Oracle Database Rolling Upgrades (without DBMS_ROLLING)
https://www.oracle.com/technetwork/database/availability/database-rolling-upgrade-3206539.pdf


DBMS_ROLLING – Read More

MOS Notes:
• Transient Rolling Upgrade Using DBMS_ROLLING - Beginners Guide
• Rolling upgrade using DBMS_ROLLING - Complete Reference (Doc ID 2086512.1)
• MAA Whitepaper: SQL Apply Best Practices (Doc ID 1672310.1)
• Step by Step How to Do Switchover/Failover on Logical Standby Environment (Doc ID 2535950.1)
• How To Skip A Complete Schema From Application on Logical Standby Database (Doc ID 741325.1)
• How to monitor the progress of the logical standby (Doc ID 1296954.1)
• How To Reduce The Performance Impact Of LogMiner Usage On A Production Database (Doc ID 1629300.1)
• Handling ORA-1403 ora-12801 on logical standby apply (Doc ID 1178284.1)
• Troubleshooting Example - Rolling Upgrade using DBMS_ROLLING (Doc ID 2535940.1)
• DBMS Rolling Upgrade Switchover Fails with ORA-45427: Logical Standby Redo Apply Process Was Not Running (Doc ID 2696017.1)
• SRDC - Collect Logical Standby Database Information (Doc ID 1910065.1)
• MRP fails with ORA-19906 after Flashback of Transient Logical Standby used for Rolling Upgrade (Doc ID 2069325.1)
• What Causes High Redo When Supplemental Logging is Enabled (Doc ID 1349037.1)



Features introduced in 19c



Recap of Data Guard 19c new features

• Dynamically Change Oracle Data Guard Broker Fast-Start Failover Target
• Simplified Database Parameter Management in Oracle Data Guard Broker
• Observe-only Mode for Oracle Data Guard Broker's Fast-Start Failover
• Propagate Restore Points from Primary to Standby Site
• Flashback Standby Database When Primary Database is Flashed Back
• Oracle Data Guard Multi-Instance Redo Apply Works with the In-Memory Column Store
• Active Data Guard DML Redirection

New 21c features for Data Guard and Broker



NEW IN 21c
ADG
Standby Result Cache preservation
Keep the Result Cache warm after a role transition

• Real-Time Query supports the Result Cache for queries run on the standby database (tables only)
• Result Cache improves query performance for recurring queries and reduces resource usage (CPU, I/O)

ALTER TABLE employee RESULT_CACHE (STANDBY ENABLE)

[Diagram: a READ/WRITE primary database and a READ ONLY standby database that holds a result cache.]


NEW IN 21c
ADG
Standby Result Cache preservation
Keep the Result Cache warm after a role transition

• Real-Time Query supports the Result Cache for queries run on the standby database (tables only)
• Result Cache improves query performance for recurring queries and reduces resource usage (CPU, I/O)
• In 21c, after a role transition (switchover or failover), the Result Cache is preserved
  • Query performance not impacted
  • No cache warm-up required

[Diagram: after a SWITCHOVER the old primary becomes the standby database and the old standby becomes the primary database; the connections and the result cache are preserved.]


NEW IN 21c
Data Guard Broker Client Side Standardized Directory Structure
A single environment variable to define all the locations

$DG_ADMIN/                           New environment variable
  admin/                             Shared across multiple configurations (observer.ora)
  config_ConfigurationSimpleName/    One subdirectory per configuration
    log/                             observer_hostname.log
    dat/                             fsfo_hostname.dat
    callout/                         fsfocallout.ora & callout files

ConfigurationSimpleName is a new configuration property. It defaults to the Configuration Name

Location of Client-side Broker Files
https://docs.oracle.com/en/database/oracle/oracle-database/21/dgbkr/using-data-guard-broker-to-manage-switchovers-failovers.html#GUID-0C8473F6-33B5-479F-9208-9CA651F1B483


NEW IN 21c
Fast-Start Failover callouts
Execute custom actions before and after the automatic failover occurs

$ cat $DG_ADMIN/config_ConfigName/callout/fsfocallout.ora
# The pre-callout script is run before failover
FastStartFailoverPreCallout=fsfo_precallout.sh
FastStartFailoverPreCalloutTimeout=1200
FastStartFailoverPreCalloutSucFileName=fsfo_precallout.suc
FastStartFailoverPreCalloutErrorFileName=precallout.err
FastStartFailoverActionOnPreCalloutFailure=STOP
# The post-callout script is run after failover succeeds
FastStartFailoverPostCallout=fsfo_postcallout.sh
$

[Diagram: Primary Site, Observer Site and Secondary Site; the observer runs 1 fsfo_precallout.sh, 2 Failover, 3 fsfo_postcallout.sh.]
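
Since the callout file is a plain list of key=value lines, it is easy to sanity-check before relying on it. The following Python sketch is an illustration only: the set of accepted keys is limited to the six names shown on the slide (the real file may accept others), and `parse_callout_config` is a hypothetical helper, not a broker API.

```python
# Keys taken from the slide only; this is not an exhaustive list.
KNOWN_KEYS = {
    "FastStartFailoverPreCallout",
    "FastStartFailoverPreCalloutTimeout",
    "FastStartFailoverPreCalloutSucFileName",
    "FastStartFailoverPreCalloutErrorFileName",
    "FastStartFailoverActionOnPreCalloutFailure",
    "FastStartFailoverPostCallout",
}


def parse_callout_config(text):
    """Return (settings, invalid_lines) for the content of a callout file."""
    settings, invalid = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() in KNOWN_KEYS:
            settings[key.strip()] = value.strip()
        else:
            invalid.append(line)
    return settings, invalid


SAMPLE = """\
# The pre-callout script is run before failover
FastStartFailoverPreCallout=fsfo_precallout.sh
FastStartFailoverPreCalloutTimeout=1200
FastStartFailoverActionOnPreCalloutFailure=STOP
# The post-callout script is run after failover succeeds
FastStartFailoverPostCallout=fsfo_postcallout.sh
foo=foo
"""

if __name__ == "__main__":
    settings, invalid = parse_callout_config(SAMPLE)
    print(settings)
    print("invalid lines:", invalid)
```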


NEW IN 21c
Fast Start Failover Configuration Validation
Ensure everything is configured properly for the automatic failover

DGMGRL> VALIDATE FAST_START FAILOVER;

  Fast-Start Failover:  Enabled in Potential Data Loss Mode
  Protection Mode:      MaxPerformance
  Primary:              North_Sales
  Active Target:        South_Sales

Fast-Start Failover Not Possible:
  Fast-Start Failover observer not started

Post Fast-Start Failover Issues:
  Flashback database disabled for database 'dgv1'

Other issues:
  FastStartFailoverThreshold may be too low for RAC databases.

Fast-start failover callout configuration file "fsfocallout.ora" has the following issues:
  Invalid lines
    foo=foo
  The specified file "./precallout" contains a path.

[Diagram: the Data Guard Broker runs VALIDATE against the observer and the databases.]


NEW IN 21c
ADG
Data Guard Broker Far Sync Instance Creation
One step further automated by the broker

DGMGRL> CREATE FAR_SYNC bostonfs
  AS CONNECT IDENTIFIER IS "bostonfs_conn_str"
  PARAMETER_VALUE_CONVERT "boston","bostonfs"
  SET LOG_FILE_NAME_CONVERT "boston","bostonfs"
  SET DB_RECOVERY_FILE_DEST "$ORACLE_HOME/dbs/"
  SET DB_RECOVERY_FILE_DEST_SIZE "100G"
  RESET UNDO_TABLESPACE;

• Automated SPFILE and controlfile creation
• The Far Sync is created, started and added to the configuration

[Diagram: the Data Guard Broker creates a FAR SYNC instance at a site near the PRIMARY; the STANDBY is at the standby site.]


NEW IN 21c
Fast Start Failover Lag Allowance in Max Availability Mode
Choose to failover in Max Availability mode even if the standby is lagging behind

[Diagram: two SCN-over-TIME charts starting at t0, each with a Primary and a Standby curve. Left, "Default behavior when primary goes ASYNC": NetTimeout, potential lag, STALL, No Failover. Right, "Optional behavior since 21c": NetTimeout, FastStartFailoverLagLimit, FastStartFailoverThreshold, potential lag, STALL, DATA LOSS.]
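
The difference between the two behaviors can be summarised as a small decision function. This Python sketch is a deliberately simplified reading of the slide, not the broker's actual algorithm: the function name, its arguments, and the use of `None` to mean "no lag allowance configured" are assumptions made for the illustration, and the real evaluation involves more conditions (observer state, FastStartFailoverThreshold, and so on).

```python
def failover_allowed(standby_synchronized, lag_seconds, lag_limit_seconds=None):
    """Simplified model of fast-start failover eligibility in Max Availability.

    standby_synchronized: True if the standby has all the primary's redo.
    lag_seconds:          current lag of the standby behind the primary.
    lag_limit_seconds:    assumed stand-in for FastStartFailoverLagLimit;
                          None models the default (no lag allowance).
    """
    if standby_synchronized:
        return True   # zero data loss failover is always acceptable
    if lag_limit_seconds is None:
        return False  # default behavior: the primary stalls, no failover
    # optional behavior: accept bounded data loss
    return lag_seconds <= lag_limit_seconds


if __name__ == "__main__":
    print(failover_allowed(False, lag_seconds=12))
    print(failover_allowed(False, lag_seconds=12, lag_limit_seconds=30))
```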


NEW IN 21c
Automatic Primary Database Preparation
Faster and easier creation of Data Guard environments

DGMGRL> PREPARE DATABASE FOR DATA GUARD
  WITH DB_UNIQUE_NAME IS boston
  DB_RECOVERY_FILE_DEST IS "+FRA"
  DB_RECOVERY_FILE_DEST_SIZE is "400G"
  DG_BROKER_CONFIG_FILE1 IS "+DATA/BOSTON/dg1.dat"
  DG_BROKER_CONFIG_FILE2 IS "+FRA/BOSTON/dg2.dat";

• If the current parameter values are already adequate, they are not modified
• It restarts the database for:
  • Changes to static parameters
  • Enabling the Archivelog mode

What the Data Guard Broker prepares on the BOSTON database:
SPFILE CREATION, ARCHIVELOG, FORCE LOGGING, FLASHBACK ON, DELETION POLICY, DB_UNIQUE_NAME, STANDBY LOGS, PARAMETERS

PARAMETERS
DB_FILES = 1024
LOG_BUFFER = 256M
DB_BLOCK_CHECKSUM = TYPICAL
DB_LOST_WRITE_PROTECT = TYPICAL
DB_FLASHBACK_RETENTION_TARGET = 120
PARALLEL_THREADS_PER_CPU = 1
STANDBY_FILE_MANAGEMENT = AUTO
DG_BROKER_START = TRUE


NEW IN 21c
ADG
Pluggable Database Recovery Isolation
Simplified PDB cloning in Data Guard configurations

Primary Site 12.1.0.1, Standby Site 12.1.0.1
• plug pdb on the primary: ? on the standby

[Diagram: redo transport chain. Primary database: SGA redo buffer, LGWR, online redo logs, NSS/TT. Standby database: RFS, standby redo logs, MRP.]


NEW IN 21c
ADG
Pluggable Database Recovery Isolation
Simplified PDB cloning in Data Guard configurations

Primary Site 12.1.0.2, Standby Site 12.1.0.2
• plug pdb standbys=none
• MANUAL RESTORE VIA SERVICE OR OTHER MEANS
• PDB RECOVERY REQUIRES TEMPORARY STOP OF CDB RECOVERY

[Diagram: redo transport chain. Primary database: SGA redo buffer, LGWR, online redo logs, NSS/TT. Standby database: RFS, standby redo logs, MRP.]


NEW IN 21c
ADG
Pluggable Database Recovery Isolation
Simplified PDB cloning in Data Guard configurations

Primary Site 21.0.0.0, Standby Site 21.0.0.0
• plug pdb
• MANUAL RESTORE VIA SERVICE OR OTHER MEANS
• PDB RECOVERY IS TRANSPARENT TO THE ONGOING RECOVERY

[Diagram: redo transport chain. Primary database: SGA redo buffer, LGWR, online redo logs, NSS/TT. Standby database: RFS, standby redo logs, MRP.]


NEW IN 21c
ORDS REST API for Data Guard management
Ready for modern DevOps deployment

New in ORDS 21.4 for 21c databases

Create the configuration
POST /database/dataguard/configuration/
{
  "primary_connection_identifier": "site1-scan:1521/mydb",
  "primary_database": "mydb_site1"
}

Add the standby databases
POST /database/dataguard/databases/
{
  "connection_identifier": "site2-scan:1521/mydb",
  "database_name": "mydb_site2"
}

Enable the configuration
PUT /database/dataguard/configuration/
{
  "operation": "ENABLE"
}

Oracle REST Data Services API - Data Guard REST Endpoints
https://docs.oracle.com/en/database/oracle/oracle-rest-data-services/21.4/orrst/api-data-guard.html
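
The three calls above could be prepared from a script along the following lines. This is a hedged Python sketch using only the standard library: the base URL is a placeholder for your own ORDS deployment, authentication is left out, and the requests are only built, not sent. The paths and JSON bodies are the ones shown on the slide.

```python
import json
from urllib.request import Request

# Hypothetical ORDS base URL; replace with the one of your deployment.
BASE_URL = "https://ords.example.com/ords/_/db-api/stable"


def build_request(method, path, body):
    """Build (but do not send) a JSON request against the ORDS endpoint."""
    return Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def build_dataguard_requests():
    """Return the create / add standby / enable requests, in order."""
    return [
        build_request("POST", "/database/dataguard/configuration/", {
            "primary_connection_identifier": "site1-scan:1521/mydb",
            "primary_database": "mydb_site1",
        }),
        build_request("POST", "/database/dataguard/databases/", {
            "connection_identifier": "site2-scan:1521/mydb",
            "database_name": "mydb_site2",
        }),
        build_request("PUT", "/database/dataguard/configuration/", {
            "operation": "ENABLE",
        }),
    ]


if __name__ == "__main__":
    for req in build_dataguard_requests():
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending each request (for example with `urllib.request.urlopen`) would additionally need credentials and TLS settings that depend on the environment.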


NEW IN 21c
Data Guard management from SQLcl
Everything under control with a single command-line tool
New in SQLcl 22.1 for 21c databases

SQL> help dg
DG
------
Run DG commands

DG ADD DATABASE "<database name>" AS CONNECT IDENTIFIER IS <connect identifier> [ INCLUDE CURRENT DESTINATIONS ];
DG CREATE CONFIGURATION "<config_name>" AS PRIMARY DATABASE IS <database name> CONNECT IDENTIFIER IS <connect_identifier> [ INCLUDE CURRENT DESTINATIONS ];
DG DISABLE CONFIGURATION;
DG DISABLE { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <member name>;
DG EDIT CONFIGURATION SET PROPERTY <property name> = '<property value>';
DG EDIT { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <member name> SET PROPERTY <property name> = '<property value>';
DG ENABLE CONFIGURATION;
DG ENABLE { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <member name>;
DG FAILOVER TO <database name> [IMMEDIATE];
DG REINSTATE DATABASE <database name>;
DG REMOVE CONFIGURATION [PRESERVE DESTINATIONS];
DG REMOVE { DATABASE | RECOVERY_APPLIANCE | FAR_SYNC | MEMBER } <name> [PRESERVE DESTINATIONS];
DG SHOW CONFIGURATION [<property name>];
DG SHOW DATABASE <database name> [<property name>];
DG SWITCHOVER TO <database name> [WAIT [<timeout in seconds>]];


NEW IN 21c
Other changes in Oracle Data Guard 21c

• Far Sync can now be used with Fast-Start Failover in Max Performance mode (ADG)
  Primary can send redo asynchronously to Far Sync.
• The broker configuration now supports up to four observers
  Before 21c, the limit was three observers.
• The PreferredObserverHosts property now supports priorities
  Example: PreferredObserverHosts='host-a:1, host-b:2'

• Properties deprecated in 19c are now desupported


ArchiveLagTarget DbFileNameConvert LsbyPreserveCommitOrder
DataGuardSyncLatency LogArchiveFormat LsbyRecordAppliedDdl
LogArchiveMaxProcesses LogFileNameConvert LsbyRecordSkipDdl
LogArchiveMinSucceedDest LsbyMaxEventsRecorded LsbyRecordSkipErrors
LogArchiveTrace LsbyMaxServers LsbyParameters
StandbyFileManagement LsbyMaxSga
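
For the PreferredObserverHosts example above, the host:priority format is simple enough to check in a script before applying it. The Python sketch below only splits the string shown on the slide into (host, priority) pairs; the helper name is hypothetical and it makes no claim about how the broker orders or validates the priorities.

```python
def parse_preferred_observer_hosts(value):
    """Split 'host-a:1, host-b:2' into [('host-a', 1), ('host-b', 2)].

    A host without ':priority' gets None as its priority.
    """
    hosts = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, priority = item.partition(":")
        hosts.append((host.strip(), int(priority) if sep else None))
    return hosts


if __name__ == "__main__":
    print(parse_preferred_observer_hosts("host-a:1, host-b:2"))
```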



Questions & Answers



Thank you
