Advanced Replication Monitoring Presentation

This document discusses monitoring replication in MySQL databases. It begins with an introduction to replication concepts and then covers specific metrics and tools for monitoring replication status and health, including SHOW SLAVE STATUS, replication lag measured in seconds behind master, estimating replication capacity using the replication capacity index, and using tools like Maatkit's mk-heartbeat to monitor replication heartbeat. It emphasizes that there is no single best approach and monitoring needs to be tailored to each individual database environment and workload.

Uploaded by

Oleksiy Kovyrin


Advanced Replication Monitoring
Gerardo “Gerry” Narvaja
@seattlegaucho
Agenda
 Short Introduction
- Make sure we all speak the same language
 Scenarios
- What can go wrong and why it may be OK
 What To Look For / At
- What the variables mean
- Some pretty pictures
 Conclusion
Introduction
 What happens in the master …

B1 B2 B3 B4

C1 C2

D1 D2 D3 D4 D5 D6 D7 D8

TIME

 … in the slave it becomes …

D1 B1 D2 C1 D3 ...

 Replication is single-threaded
- IO Thread + SQL Thread
- No contention in the slave, so it should run faster
Most Basic Monitoring
 SHOW SLAVE STATUS
- IO Thread
- Usually flags communication issues
- SQL Thread
- Usually flags data related issues
 Application code
- Maatkit: mk-heartbeat
- Simple monitoring can be implemented at the shell
- Implement your own heartbeat table
- Can be used to measure quality of data on the slaves

 Not having this basic monitoring in place is like taking backups and never testing restores.
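The "simple monitoring at the shell" idea can be sketched as a small filter over the SHOW SLAVE STATUS output. This is only an illustration: the function name `check_slave_threads`, the message wording, and the exit code 2 are assumptions, not part of the original slides.

```shell
# check_slave_threads: read SHOW SLAVE STATUS\G text on stdin and report
# whether both replication threads are running.
# Exit status: 0 = both running, 2 = at least one stopped or status missing.
check_slave_threads() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END {
            if (io == "Yes" && sql == "Yes") {
                print "OK: IO and SQL threads running"
                exit 0
            }
            printf "CRITICAL: Slave_IO_Running=%s Slave_SQL_Running=%s\n", \
                (io == "" ? "unknown" : io), (sql == "" ? "unknown" : sql)
            exit 2
        }
    '
}
```

On a slave it could be fed with `mysql -e 'SHOW SLAVE STATUS\G' | check_slave_threads`, with the exit status handed to whatever monitoring system is in use.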
Replication Status
 SHOW SLAVE STATUS\G
- Slide callouts: IO thread health status, SQL thread health status, general health status
Slave_IO_State: Waiting for master to send event
Master_Host: 10.55.197.108
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000447
Read_Master_Log_Pos: 673847271
Relay_Log_File: relay-bin.005771
Relay_Log_Pos: 673847416
Relay_Master_Log_File: mysql-bin.000447
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: mysql.user,mysql.columns_priv,mysql.tables_priv,mysql.db,mysql.procs_priv,mysql.host
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 673847271
Relay_Log_Space: 673847506
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Seconds Behind Master
 What happens when storing BLOBs and loading them in batches

B1 B2 B3 B4

C1 C2 C3 C3

 SBM (Seconds_Behind_Master) is based on the timestamp of the transaction

- You can get crazy values based on the actual traffic
- Is this a bad situation?
- What do Master_Log_File and Read_Master_Log_Pos look like?
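One way to answer that last question is to compare what the IO thread has read with what the SQL thread has executed. The sketch below is a hypothetical helper (`sql_thread_backlog` is not from the slides) that works only on the SHOW SLAVE STATUS text. It assumes both threads are on the same master binlog file and gives up otherwise.

```shell
# sql_thread_backlog: read SHOW SLAVE STATUS\G text on stdin and print how many
# binlog bytes the IO thread has fetched that the SQL thread has not applied.
# Prints "unknown" when the two threads are on different binlog files, because
# the sizes of the files in between are not part of this output.
sql_thread_backlog() {
    awk '
        $1 == "Master_Log_File:"       { read_file = $2 }
        $1 == "Relay_Master_Log_File:" { exec_file = $2 }
        $1 == "Read_Master_Log_Pos:"   { read_pos  = $2 }
        $1 == "Exec_Master_Log_Pos:"   { exec_pos  = $2 }
        END {
            if (read_file != "" && read_file == exec_file)
                printf "%.0f\n", read_pos - exec_pos
            else
                print "unknown"
        }
    '
}
```

A large Seconds_Behind_Master together with a small backlog here suggests the slave already holds the events locally and is only behind in applying them.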
Bytes Behind Master
 Not provided directly
- On the master: SHOW MASTER STATUS, SHOW BINARY LOGS
show master status; show binary logs;
+------------------+-----------+--------------+------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+-----------+--------------+------------------+
| mysql-bin.009734 | 153545495 |              |                  |
+------------------+-----------+--------------+------------------+

+------------------+------------+
| Log_name         | File_size  |
+------------------+------------+
....
| mysql-bin.009730 | 1073764076 |
| mysql-bin.009731 | 1073772807 |
| mysql-bin.009732 | 1073761932 |
| mysql-bin.009733 | 1073756776 |
| mysql-bin.009734 |  153545495 |
+------------------+------------+

- On the slave: SHOW SLAVE STATUS

 Challenges
- No easy way to get this information from the master, although only the sizes of past binlog files are needed
- Master position is a moving target
- ROW vs STATEMENT vs MIXED replication
- Example: Data purges → DELETE … FROM table WHERE ...
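A rough sketch of how the pieces above could be combined: add up the sizes of the binlog files the slave has not finished, add the master's current position, and subtract the slave's executed position. The function name `bytes_behind_master` and its argument order are assumptions made for illustration. The slave coordinates would come from Relay_Master_Log_File and Exec_Master_Log_Pos, and file order is taken from the numeric suffix of the binlog name.

```shell
# bytes_behind_master SLAVE_FILE SLAVE_POS MASTER_FILE MASTER_POS
# Reads SHOW BINARY LOGS output from the master on stdin (boxed table or
# tab-separated batch format) and prints the number of binlog bytes between
# the slave executed position and the master current position.
bytes_behind_master() {
    awk -v sfile="$1" -v spos="$2" -v mfile="$3" -v mpos="$4" '
        # Numeric suffix of a binlog name, e.g. mysql-bin.009734 -> 9734
        function seq(name,    parts, n) {
            n = split(name, parts, /[.]/)
            return parts[n] + 0
        }
        BEGIN { first = seq(sfile); last = seq(mfile) }
        {
            gsub(/[|]/, " ")                          # drop table borders, if any
            if (NF != 2 || $2 !~ /^[0-9]+$/) next     # skip rulers and header
            s = seq($1)
            # Whole files between the slave file and the master file
            if (s >= first && s < last) total += $2
        }
        END { printf "%.0f\n", total + mpos - spos }
    '
}
```

With the file sizes listed above and a hypothetical slave position of 1073000000 in mysql-bin.009732, `bytes_behind_master mysql-bin.009732 1073000000 mysql-bin.009734 153545495` prints 1228064203. Because the master position keeps moving, the result is a snapshot, and under ROW versus STATEMENT logging the same number of bytes can represent very different amounts of work.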
Replication Capacity Index
 Based on the "Estimating Replication Capacity" blog post by Percona
- Estimate the capacity of the slave to keep up with the master load
 Some bash scripts and real data
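- Script 1: pause replication on the slave for 10 minutes
#!/bin/bash
# Test RCI (Replication Capacity Index)
echo "$(date +%Y%m%d-%H%M%S) - Starting test"
mysql -e "STOP SLAVE"
sleep 600   # let the slave fall 10 minutes behind the master
mysql -e "START SLAVE"
- Script 2: log Seconds_Behind_Master every 10 seconds
#!/bin/bash
while true; do
    sbm=$(mysql -e 'SHOW SLAVE STATUS\G' | awk '/Seconds_Behind_Master/ {print $1, $2}')
    echo "$(date +%Y%m%d-%H%M%S) - $sbm" >> test.log
    sleep 10
done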
RCI (cont)
 (CONT.)
- 20100729-205134 - Seconds_Behind_Master: 0
20100729-205140 - Starting test --> Initial timestamp
20100729-205144 - Seconds_Behind_Master: NULL
…
20100729-210134 - Seconds_Behind_Master: NULL
20100729-210144 - Seconds_Behind_Master: 161
20100729-210154 - Seconds_Behind_Master: 0 --> Last timestamp

Test  Pause     Start TS  1st TS    SBM  2nd TS    Diff 1    Diff 2    RCI
044   00:10:00  20:51:40  21:01:44  161  21:01:54  00:10:04  00:10:14  43.9
045   00:10:00  17:32:13  17:42:17  320  17:42:27  00:10:04  00:10:14  43.9
005   00:10:00  15:37:12  15:47:21  441  15:47:41  00:10:09  00:10:29  21.7
001   00:10:00  18:54:28  19:04:33  520  19:04:53  00:10:05  00:10:25  25.0
002   00:10:00  18:02:32  18:12:39  389  18:12:49  00:10:07  00:10:17  36.3
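The RCI column can be reproduced from the other columns: every row matches Diff 2 divided by (Diff 2 minus Pause), that is, total elapsed time over the time spent catching up. This relationship is inferred from the tabulated numbers rather than stated on the slide, and the helper name `rci` is made up for this sketch.

```shell
# rci PAUSE_SECONDS START_TS CAUGHT_UP_TS
# START_TS and CAUGHT_UP_TS are HH:MM:SS on the same day (no midnight rollover).
rci() {
    awk -v pause="$1" -v start="$2" -v finish="$3" '
        # Seconds since midnight for HH:MM:SS
        function secs(hms,    p) {
            split(hms, p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }
        BEGIN {
            elapsed  = secs(finish) - secs(start)   # the "Diff 2" column
            catch_up = elapsed - pause              # time spent replaying the backlog
            if (catch_up <= 0) { print "n/a"; exit }
            printf "%.1f\n", elapsed / catch_up
        }
    '
}
```

For test 044, `rci 600 20:51:40 21:01:54` gives 614 / 14 and prints 43.9. Since the log is sampled every 10 seconds, the catch-up time is only known to that resolution, so the index is a coarse estimate.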

RCI (cont)
 Revisiting the replication delay chart
- Lt: Time while replication falls behind
- Rt: Time it takes for replication to catch up
- RCI = (Lt + Rt)/Rt, which matches the tabulated values, e.g. (600 s + 14 s)/14 s ≈ 43.9 for test 044
Replication Heartbeat
 Using Maatkit's mk-heartbeat
- Run on the active master with the --update option
- Run on the slaves with the --monitor or --check option
- Output similar to Linux's uptime

mk-heartbeat --monitor --host localhost --database maatkit

18s [ 2.85s, 0.57s, 0.19s ]
19s [ 3.17s, 0.63s, 0.21s ]
20s [ 3.50s, 0.70s, 0.23s ]
18s [ 3.80s, 0.76s, 0.25s ]
16s [ 4.07s, 0.81s, 0.27s ]

 Issues
- Highly sensitive to clocks in the master and slave(s) being in sync
- It has to run on the active master in master-to-master setups
- Better than seconds behind master
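To feed the monitor output shown above into other tools, the lines can be reduced to plain numbers. The filter below is a sketch: the name `heartbeat_delays` is hypothetical, and it assumes the line format stays as shown, with the current delay followed by three moving averages in brackets.

```shell
# heartbeat_delays: read mk-heartbeat --monitor lines on stdin, such as
#   18s [ 2.85s, 0.57s, 0.19s ]
# and print "current avg1 avg2 avg3" as bare numbers, one line per sample.
heartbeat_delays() {
    awk '{
        gsub(/s|,|\[|\]/, "")      # strip unit suffixes, commas and brackets
        if (NF == 4 && $1 ~ /^[0-9.]+$/) print $1, $2, $3, $4
    }'
}
```

For example, `mk-heartbeat --monitor --host localhost --database maatkit | heartbeat_delays` would emit lines like `18 2.85 0.57 0.19` that are easy to graph or compare against a threshold.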
How To Monitor?
 There is no silver bullet
- Avoid noise alerts
 Know your monitoring system
- Tools: OpenNMS (SNMP), MONyog, MySQL Enterprise, home grown
- Don't rely on just one
 Alarms
- Thresholds and hysteresis
- Number of incidents until it alarms
- Sampling intervals
 Know your load
- Low / High traffic? Bursts?
- Small / big transactions? Concurrency?
 Replication type
- Row / Statement / Mixed
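The alarm ideas above (thresholds, hysteresis, number of incidents until it alarms) can be sketched as a filter over a stream of lag samples. Everything here is illustrative: the name `lag_alarm`, its three parameters, and the message format are assumptions, and the right values depend on the load and sampling interval.

```shell
# lag_alarm RAISE CLEAR COUNT
# Reads one lag sample (seconds) per line on stdin.
# Raises the alarm after COUNT consecutive samples >= RAISE and clears it
# only once a sample drops to CLEAR or below (hysteresis).
lag_alarm() {
    awk -v raise="$1" -v clear="$2" -v need="$3" '
        $1 !~ /^[0-9]+(\.[0-9]+)?$/ { next }    # skip NULL or malformed samples
        {
            n++
            lag = $1 + 0
            if (!alarmed) {
                streak = (lag >= raise + 0) ? streak + 1 : 0
                if (streak >= need + 0) {
                    alarmed = 1
                    print "ALARM at sample " n " (lag " $1 "s)"
                }
            } else if (lag <= clear + 0) {
                alarmed = 0
                streak = 0
                print "CLEAR at sample " n " (lag " $1 "s)"
            }
        }
    '
}
```

With `lag_alarm 60 30 3`, a single spike above 60 seconds is ignored, three in a row raise the alarm, and the alarm stays up until the lag falls to 30 seconds or less, which helps avoid noise alerts from bursty traffic. Non-numeric samples such as NULL are skipped in this sketch; a real check would probably treat NULL as a separate condition.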
Thank you very much
