This article summarizes methods for handling MySQL replication errors on production systems. It is shared here as a reference.

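Preface

The most common problem after a failover is a replication error. When the database is small, dumping it and importing it again solves the problem easily. Our production databases, however, are 150 GB to 200 GB, and relying on that method alone is too costly. After some time exploring the problem, I have summarized several ways to handle it.

Production Architecture

The current production architecture keeps two copies of the data in a high-availability cluster built on asynchronous replication, with two machines serving external traffic. When a failure occurs, service switches to the slave, which becomes the new master, and the failed machine then replicates in the reverse direction from the new master. The problem met most often while handling failures is a master-slave replication error. Below are the error messages I have collected.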

Common Errors

The Three Most Common Cases

These three cases occur during an HA switchover. Because replication is asynchronous and sync_binlog=0, a small part of the binlog may not have been received by the slave, which leads to replication errors.

Case 1: a record is deleted on the master but cannot be found on the slave.

Last_SQL_Error: Could not execute Delete_rows event on table hcy.t1;
  Can't find record in 't1',
  Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
  the event's master log mysql-bin.000006, end_log_pos 254

Case 2: duplicate primary key. The record already exists on the slave, and the same record is inserted on the master.

Last_SQL_Error: Could not execute Write_rows event on table hcy.t1;
  Duplicate entry '2' for key 'PRIMARY',
  Error_code: 1062;
  handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000006, end_log_pos 924

Case 3: a record is updated on the master but cannot be found on the slave, so data has been lost.

Last_SQL_Error: Could not execute Update_rows event on table hcy.t1;
  Can't find record in 't1',
  Error_code: 1032;
  handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000010, end_log_pos 263

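Asynchronous vs. Semi-Synchronous Replication

Asynchronous replication
Put simply, the master sends the binlog and considers the operation finished, regardless of whether the slave has received it or executed it.

Semi-synchronous replication
Put simply, the master sends the binlog, and the slave confirms that it has received it (regardless of whether it has executed it) by signalling the master. Only then is the operation finished. (The code was written by Google and officially adopted in 5.5.)

Drawback of asynchronous replication
When the master is busy with writes, its current position may be, say, 10 while the slave's IO_THREAD has only received up to 3. If the master crashes at that moment, the difference of 7 that was never sent to the slave is lost.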
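The arithmetic in the example above can be sketched in shell. This is a minimal illustration, not part of the original procedure: the function name `replication_gap` is hypothetical, and its inputs are assumed to come from `SHOW MASTER STATUS` on the master and `Read_Master_Log_Pos` on the slave. Binlog positions are byte offsets, so the result is the amount of binlog not yet received, not a count of statements.

```shell
#!/bin/sh
# replication_gap MASTER_FILE MASTER_POS SLAVE_FILE SLAVE_READ_POS
# Prints how far the slave's I/O thread is behind the master, in binlog bytes.
# The subtraction is only meaningful when both sides are on the same binlog file.
replication_gap() {
    master_file=$1
    master_pos=$2
    slave_file=$3
    slave_pos=$4
    if [ "$master_file" != "$slave_file" ]; then
        echo "unknown"      # different files: a plain subtraction would be wrong
        return 1
    fi
    echo $((master_pos - slave_pos))
}

# Example with the numbers used in the text:
# replication_gap mysql-bin.000010 10 mysql-bin.000010 3
```

Anything inside that gap when the master dies is what the later sections try to repair on the slave.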

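Special Cases

The slave's relay log (relay-bin) is corrupted.

Last_SQL_Error: Error initializing relay log position: I/O error reading the header from the binary log
Last_SQL_Error: Error initializing relay log position: Binlog has bad magic number;
  It's not a binary log file that can be used by this version of MySQL

This happens when the slave crashes or is shut down improperly, for example through a power failure or a burned-out motherboard. The relay log is corrupted and replication stops.

Human error, handle with care: several slaves share the same server-id
In this case replication lags forever and never catches up, and the following two lines keep appearing in the error log. The fix is to give every slave a different server-id.

Slave: received end packet from server, apparent master shutdown:
Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000012' at postion 106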
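A duplicate server-id is easy to check for before it causes trouble. The sketch below is an illustration only: the function name `find_duplicate_server_ids` is hypothetical, and it assumes you have already collected one `<host> <server_id>` pair per line (for example by running `select @@server_id` against each slave).

```shell
#!/bin/sh
# Reads lines of "<host> <server_id>" on stdin and prints every server_id
# that is used by more than one host, followed by the hosts that share it.
find_duplicate_server_ids() {
    awk '
        { count[$2]++; hosts[$2] = hosts[$2] " " $1 }
        END {
            for (id in count)
                if (count[id] > 1)
                    print id ":" hosts[id]
        }' | sort
}

# Example input and the kind of line it produces:
# printf 'vm01 22\nvm02 23\nvm03 23\n' | find_duplicate_server_ids
```

An empty result means every host in the list has its own server-id.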

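Handling the Problems

Delete Failure

A record is deleted on the master but cannot be found on the slave.

Last_SQL_Error: Could not execute Delete_rows event on table hcy.t1;
  Can't find record in 't1',
  Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
  the event's master log mysql-bin.000006, end_log_pos 254

Solution:

The master wants to delete a record that the slave cannot find, so the slave reports an error. Since the master has already deleted the record, the slave can simply skip the event. Use:

stop slave;
set global sql_slave_skip_counter=1;
start slave;

If this happens often, you can use a script I wrote, skip_error_replcation.sh. By default it skips 10 errors, and it skips only this case; for any other case it prints the error and waits for manual handling. The script is written in shell and modeled on mk-slave-restart from the maatkit toolkit, with some behavior of my own added, so it does not blindly skip every error.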
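The author's skip_error_replcation.sh is not reproduced in the article, so the following is only a sketch of the behavior described above, not the actual script. The names `decide_action` and `skip_loop` are hypothetical, and `MYSQL_CMD` is assumed to hold a working client invocation such as `mysql -uroot`.

```shell
#!/bin/sh
# Decide what to do with a replication error.
# Only error 1032 raised by a Delete_rows event (row already gone) is skipped.
decide_action() {
    errno=$1
    message=$2
    if [ "$errno" = "1032" ]; then
        case "$message" in
            *Delete_rows*) echo skip; return 0 ;;
        esac
    fi
    echo stop
}

# Hypothetical driver: skip at most $1 errors (10 by default), stop on anything else.
# Assumes $MYSQL_CMD is set by the caller.
skip_loop() {
    max_skips=${1:-10}
    i=0
    while [ "$i" -lt "$max_skips" ]; do
        status=$($MYSQL_CMD -e 'show slave status\G')
        errno=$(printf '%s\n' "$status" | awk '$1 == "Last_SQL_Errno:" { print $2 }')
        message=$(printf '%s\n' "$status" | grep 'Last_SQL_Error:')
        [ "$errno" = "0" ] && break                  # replication is healthy
        if [ "$(decide_action "$errno" "$message")" != "skip" ]; then
            printf '%s\n' "$message"                 # leave it for manual handling
            return 1
        fi
        $MYSQL_CMD -e 'stop slave; set global sql_slave_skip_counter=1; start slave;'
        sleep 1
        i=$((i + 1))
    done
}
```

Keeping the decision in its own function makes the skip rule easy to review and to test without a database.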

Duplicate Primary Key

The record already exists on the slave, and the same record is inserted on the master.

Last_SQL_Error: Could not execute Write_rows event on table hcy.t1;
  Duplicate entry '2' for key 'PRIMARY',
  Error_code: 1062;
  handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000006, end_log_pos 924

Solution:

On the slave, first look at the table structure with desc hcy.t1;

mysql> desc hcy.t1;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | NO   | PRI | 0       |       |
| name  | char(4) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+

Delete the row with the duplicate primary key on the slave:

mysql> delete from t1 where id=2;
Query OK, 1 row affected (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G
……
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
……
mysql> select * from t1 where id=2;

Then verify the row on both the master and the slave.
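When this error shows up repeatedly, it helps to pull the table name and the conflicting key value out of the error text instead of reading it by eye. The sketch below is an assumption-laden helper, not something from the original article: `parse_dup_entry` is a hypothetical name, and it expects the `Last_SQL_Error` message on a single line, in the format shown above.

```shell
#!/bin/sh
# Reads a Last_SQL_Error line on stdin and prints "<db.table> <key value>"
# for a duplicate-entry (1062) error. Prints nothing for other messages.
parse_dup_entry() {
    sed -n "s/.*event on table \([^;]*\);.*Duplicate entry '\([^']*\)' for key.*/\1 \2/p"
}

# Example:
# echo "Last_SQL_Error: Could not execute Write_rows event on table hcy.t1; Duplicate entry '2' for key 'PRIMARY', Error_code: 1062" | parse_dup_entry
```

The output still needs a human decision: compare the row on both servers before deleting anything on the slave.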

Lost Update

A record is updated on the master but cannot be found on the slave, so data has been lost.

Last_SQL_Error: Could not execute Update_rows event on table hcy.t1;
  Can't find record in 't1',
  Error_code: 1032;
  handler error HA_ERR_KEY_NOT_FOUND;
  the event's master log mysql-bin.000010, end_log_pos 794

Solution:

On the master, use mysqlbinlog to see what the failing binlog event was doing.

/usr/local/mysql/bin/mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS mysql-bin.000010 | grep -A '10' 794

#120302 12:08:36 server id 22 end_log_pos 794 Update_rows: table id 33 flags: STMT_END_F
### UPDATE hcy.t1
### WHERE
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2='bbc' /* STRING(4) meta=65028 nullable=1 is_null=0 */
### SET
###   @1=2 /* INT meta=0 nullable=0 is_null=0 */
###   @2='BTV' /* STRING(4) meta=65028 nullable=1 is_null=0 */
# at 794
#120302 12:08:36 server id 22 end_log_pos 821 Xid = 60
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
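In the decoded output, the block after `### SET` holds the values the row has after the update, which is what has to be restored on the slave. As a rough aid, the following sketch extracts just those values. The function name `extract_new_values` is hypothetical, and it assumes input in the `-v -v --base64-output=DECODE-ROWS` format shown above.

```shell
#!/bin/sh
# Reads decoded mysqlbinlog output on stdin and prints the column values
# listed under each "### SET" section, one per line, without the type comments.
extract_new_values() {
    awk '
        /^### SET/                          { in_set = 1; next }
        /^### (UPDATE|WHERE|INSERT|DELETE)/ { in_set = 0; next }
        /^###/ && in_set {
            sub(/^###[ \t]+/, "")       # drop the "###" prefix
            sub(/[ \t]*\/\*.*$/, "")    # drop the trailing /* ... */ comment
            print
            next
        }
        { in_set = 0 }'
}

# Example (piping the same command used above):
# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS mysql-bin.000010 | grep -A 10 794 | extract_new_values
```

The columns are numbered (`@1`, `@2`) in table order, so they still have to be matched against the `desc` output by hand.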

On the slave, look up the record that was being updated. It should not exist.

mysql> select * from t1 where id=2;
Empty set (0.00 sec)

Then check on the master:

mysql> select * from t1 where id=2;
+----+------+
| id | name |
+----+------+
|  2 | BTV  |
+----+------+
1 row in set (0.00 sec)

Fill in the missing data on the slave, then skip the error.

mysql> insert into t1 values (2,'BTV');
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1 where id=2;
+----+------+
| id | name |
+----+------+
|  2 | BTV  |
+----+------+
1 row in set (0.00 sec)

mysql> stop slave ;set global sql_slave_skip_counter=1;start slave;
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G
……
 Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
……

Corrupted Relay Log

The slave's relay log (relay-bin) is corrupted.

Last_SQL_Error: Error initializing relay log position: I/O error reading the header from the binary log
Last_SQL_Error: Error initializing relay log position: Binlog has bad magic number;
  It's not a binary log file that can be used by this version of MySQL

Manual Repair

Solution: find the binlog file and position that replication has reached, then set up replication again from that point. This produces a fresh relay log.

Example:

mysql> show slave status\G
*************************** 1. row ***************************
              Master_Log_File: mysql-bin.000010
          Read_Master_Log_Pos: 1191
               Relay_Log_File: vm02-relay-bin.000005
                Relay_Log_Pos: 253
        Relay_Master_Log_File: mysql-bin.000010
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1593
                   Last_Error: Error initializing relay log position: I/O error reading the header from the binary log
                 Skip_Counter: 1
          Exec_Master_Log_Pos: 821

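Slave_IO_Running: the I/O thread, which receives the master's binlog. Its progress is recorded in:

    Master_Log_File
    Read_Master_Log_Pos

Slave_SQL_Running: the SQL thread, which applies the writes. Its progress is recorded in:

    Relay_Master_Log_File
    Exec_Master_Log_Pos

Use the binlog file and position of the executed writes (the SQL thread's values) as the restart point.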
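The rule above can be sketched as a small helper that reads `show slave status\G` output and prints the matching `CHANGE MASTER TO` statement. This is an illustration under assumptions: `change_master_from_status` is a hypothetical name, and the statement is only printed, so it can be reviewed before being run by hand.

```shell
#!/bin/sh
# Reads "show slave status\G" text on stdin and prints a CHANGE MASTER TO
# statement built from Relay_Master_Log_File and Exec_Master_Log_Pos.
# Exits non-zero if either field is missing.
change_master_from_status() {
    awk -F': *' '
        { sub(/^[ \t]+/, "", $1) }                       # strip the left padding
        $1 == "Relay_Master_Log_File" { file = $2 }
        $1 == "Exec_Master_Log_Pos"   { pos = $2 }
        END {
            if (file == "" || pos == "") exit 1
            printf "CHANGE MASTER TO MASTER_LOG_FILE=\047%s\047, MASTER_LOG_POS=%s;\n", file, pos
        }'
}

# Example (the mysql invocation is an assumption about your environment):
# mysql -e 'show slave status\G' | change_master_from_status
```

Using the SQL thread's values rather than `Read_Master_Log_Pos` matters here: events that were received but never applied sit in the damaged relay log and must be fetched again.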

Relay_Master_Log_File: mysql-bin.000010
Exec_Master_Log_Pos: 821

mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000010',MASTER_LOG_POS=821;
Query OK, 0 rows affected (0.01 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.8.22
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysql-bin.000010
          Read_Master_Log_Pos: 1191
               Relay_Log_File: vm02-relay-bin.000002
                Relay_Log_Pos: 623
        Relay_Master_Log_File: mysql-bin.000010
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1191
              Relay_Log_Space: 778
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:

Ibbackup

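When every trick above has been tried and the slave has simply lost too much data, it is time for ibbackup (which costs money).

Ibbackup is a paid hot backup tool. xtrabackup is free and offers the same functionality.

Ibbackup does not lock tables during a backup. It opens a transaction when the backup starts (in effect a snapshot) and records a point. Changes made to the data after that point are saved in the ibbackup_logfile file, and on restore the changes in ibbackup_logfile are written back into ibdata.

Ibbackup backs up only the data (ibdata, .ibd). The table definition files (.frm) are not backed up.

Here is a demonstration:

Backup: ibbackup /bak/etc/my_local.cnf /bak/etc/my_bak.cnf

Restore: ibbackup --apply-log /bak/etc/my_bak.cnf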

[root@vm01 etc]# more my_local.cnf

datadir =/usr/local/mysql/data
innodb_data_home_dir = /usr/local/mysql/data
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /usr/local/mysql/data
innodb_buffer_pool_size = 100M
innodb_log_file_size = 5M
innodb_log_files_in_group=2

[root@vm01 etc]# ibbackup /bak/etc/my_local.cnf /bak/etc/my_bak.cnf

InnoDB Hot Backup version 3.0.0; Copyright 2002-2005 Innobase Oy
License A21488 is granted to vm01 (chunyang_he@126.com)
(--apply-log works in any computer regardless of the hostname)
Licensed for use in a computer whose hostname is 'vm01'
Expires 2012-5-1 (year-month-day) at 00:00
See http://www.innodb.com for further information
Type ibbackup --license for detailed license terms, --help for help

Contents of /bak/etc/my_local.cnf:
innodb_data_home_dir got value /usr/local/mysql/data
innodb_data_file_path got value ibdata1:10M:autoextend
datadir got value /usr/local/mysql/data
innodb_log_group_home_dir got value /usr/local/mysql/data
innodb_log_files_in_group got value 2
innodb_log_file_size got value 5242880

Contents of /bak/etc/my_bak.cnf:
innodb_data_home_dir got value /bak/data
innodb_data_file_path got value ibdata1:10M:autoextend

datadir got value /bak/data
innodb_log_group_home_dir got value /bak/data
innodb_log_files_in_group got value 2
innodb_log_file_size got value 5242880

ibbackup: Found checkpoint at lsn 0 1636898
ibbackup: Starting log scan from lsn 0 1636864
120302 16:47:43 ibbackup: Copying log...
120302 16:47:43 ibbackup: Log copied, lsn 0 1636898
ibbackup: We wait 1 second before starting copying the data files...
120302 16:47:44 ibbackup: Copying /usr/local/mysql/data/ibdata1
ibbackup: A copied database page was modified at 0 1636898
ibbackup: Scanned log up to lsn 0 1636898
ibbackup: Was able to parse the log up to lsn 0 1636898
ibbackup: Maximum page number for a log record 0
120302 16:47:46 ibbackup: Full backup completed!

[root@vm01 etc]#
[root@vm01 etc]# cd /bak/data/
[root@vm01 data]# ls
ibbackup_logfile ibdata1

[root@vm01 data]# ibbackup --apply-log /bak/etc/my_bak.cnf

InnoDB Hot Backup version 3.0.0; Copyright 2002-2005 Innobase Oy
License A21488 is granted to vm01 (chunyang_he@126.com)
(--apply-log works in any computer regardless of the hostname)
Licensed for use in a computer whose hostname is 'vm01'
Expires 2012-5-1 (year-month-day) at 00:00
See http://www.innodb.com for further information
Type ibbackup --license for detailed license terms, --help for help

Contents of /bak/etc/my_bak.cnf:
innodb_data_home_dir got value /bak/data
innodb_data_file_path got value ibdata1:10M:autoextend
datadir got value /bak/data
innodb_log_group_home_dir got value /bak/data
innodb_log_files_in_group got value 2
innodb_log_file_size got value 5242880

120302 16:48:38 ibbackup: ibbackup_logfile's creation parameters:
ibbackup: start lsn 0 1636864, end lsn 0 1636898,
ibbackup: start checkpoint 0 1636898

ibbackup: start checkpoint 0 1636898
InnoDB: Doing recovery: scanned up to log sequence number 0 1636898
InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 .....99
Setting log file size to 0 5242880
ibbackup: We were able to parse ibbackup_logfile up to
ibbackup: lsn 0 1636898
ibbackup: Last MySQL binlog file position 0 1191, file name ./mysql-bin.000010
ibbackup: The first data file is '/bak/data/ibdata1'
ibbackup: and the new created log files are at '/bak/data/'
120302 16:48:38 ibbackup: Full backup prepared for recovery successfully!

[root@vm01 data]# ls
ibbackup_logfile ibdata1 ib_logfile0 ib_logfile1

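Copy ibdata1, ib_logfile0, and ib_logfile1 to the slave, copy the .frm files over as well, start MySQL, and then set up replication. The position to use is the one printed in the output above:

ibbackup: Last MySQL binlog file position 0 1191, file name ./mysql-bin.000010
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000010',MASTER_LOG_POS=1191;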
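Copying the file name and position by hand is error-prone, so the mapping from the ibbackup log line to the statement can be sketched as a filter. This is only an illustration: `change_master_from_ibbackup` is a hypothetical name, and it assumes the log line has exactly the format shown above.

```shell
#!/bin/sh
# Reads ibbackup --apply-log output on stdin and prints a CHANGE MASTER TO
# statement built from the "Last MySQL binlog file position" line.
change_master_from_ibbackup() {
    sed -n "s|.*Last MySQL binlog file position [0-9]* \([0-9]*\), file name \./\(.*\)\$|CHANGE MASTER TO MASTER_LOG_FILE='\2', MASTER_LOG_POS=\1;|p"
}

# Example (apply.log is a hypothetical file holding the saved output):
# change_master_from_ibbackup < apply.log
```

The statement is printed rather than executed, so it can be checked against the log before it is run on the slave.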

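The Maatkit Toolkit

Introduction

maatkit is an open-source toolkit that helps with day-to-day MySQL administration. It has since been acquired by Percona, which now maintains it. Among its tools:

mk-table-checksum checks whether table structure and data are consistent between the master and the slave.

mk-table-sync repairs the data when the master and the slave are inconsistent.

I have no production experience with these two tools. This section is only an exploration of the technique for discussion, and the following shows how to use them.

[root@vm02]# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy -t t1
Cannot connect to MySQL because the Perl DBI module is not installed or not found.
Run 'perl -MDBI' to see the directories that Perl searches for DBI.
If DBI is not installed, try:
  Debian/Ubuntu  apt-get install libdbi-perl
  RHEL/CentOS    yum install perl-DBI
  OpenSolaris    pgk install pkg:/SUNWpmdbi

The message says the perl-DBI module is missing, so simply run yum install perl-DBI.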

[root@vm02 bin]# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy -t t1
DATABASE TABLE CHUNK HOST ENGINE      COUNT         CHECKSUM TIME WAIT STAT  LAG
hcy      t1        0 vm02 InnoDB       NULL       1957752020    0    0 NULL NULL
hcy      t1        0 vm01 InnoDB       NULL       1957752020    0    0 NULL NULL

If the table data is inconsistent, the CHECKSUM values differ.

What the columns mean:

DATABASE: database name
TABLE: table name
CHUNK: approximate chunk number used for the checksum
HOST: address of the MySQL server
ENGINE: storage engine of the table
COUNT: number of rows in the table
CHECKSUM: checksum value
TIME: time taken
WAIT: time spent waiting
STAT: return value of MASTER_POS_WAIT()
LAG: replication delay of the slave

To list only the tables that differ, use the mk-checksum-filter tool. Just append it after a pipe.

[root@vm02 ~]# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy | mk-checksum-filter
hcy      t2        0 vm01 InnoDB       NULL       1957752020    0    0 NULL NULL
hcy      t2        0 vm02 InnoDB       NULL       1068689114    0    0 NULL NULL
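To make the comparison concrete, here is a rough shell approximation of that kind of filtering. It is not how mk-checksum-filter is implemented. The name `mismatched_tables` is hypothetical, and the sketch assumes the column layout shown above, with the checksum in the seventh column.

```shell
#!/bin/sh
# Reads mk-table-checksum output on stdin and prints "<db>.<table>" for every
# table chunk whose CHECKSUM differs between hosts.
mismatched_tables() {
    awk '
        NR == 1 && $1 == "DATABASE" { next }          # skip the header row
        {
            key = $1 "." $2 " chunk " $3
            if ((key in sum) && sum[key] != $7 && !(key in reported)) {
                print $1 "." $2
                reported[key] = 1
            }
            if (!(key in sum)) sum[key] = $7          # remember the first host seen
        }'
}

# Example:
# mk-table-checksum h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456 -d hcy | mismatched_tables
```

Each later host is compared against the first one seen for the same table chunk, which is enough for a single master and slave pair.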

Once you know which tables are inconsistent, you can use mk-table-sync to fix them.

Note: mk-table-checksum locks the table while it runs, and how long it takes depends on the size of the table.

Data in table t2 on the MASTER:

mysql> select * from t2;
+----+------+
| id | name |
+----+------+
|  1 | a    |
|  2 | b    |
|  3 | ss   |
|  4 | asd  |
|  5 | ss   |
+----+------+
5 rows in set (0.00 sec)

mysql> \! hostname;
vm01

Data in table t2 on the SLAVE:

mysql> select * from t2;
+----+------+
| id | name |
+----+------+
|  1 | a    |
|  2 | b    |
|  3 | ss   |
|  4 | asd  |
+----+------+
4 rows in set (0.00 sec)

mysql> \! hostname;
vm02

[root@vm02 ~]# mk-table-sync --execute --print --no-check-slave --transaction --databases hcy h=vm01,u=admin,p=123456 h=vm02,u=admin,p=123456
INSERT INTO `hcy`.`t2`(`id`, `name`) VALUES ('5', 'ss') /*maatkit src_db:hcy src_tbl:t2 src_dsn:h=vm01,p=...,u=admin dst_db:hcy dst_tbl:t2 dst_dsn:h=vm02,p=...,u=admin lock:0 transaction:1 changing_src:0 replicate:0 bidirectional:0 pid:3246 user:root host:vm02*/;

How it works: it first compares the master and slave tables row by row, and wherever they differ it runs delete, update, or insert operations until the two are consistent. How long it takes depends on the size of the table.
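The row-by-row idea can be illustrated with a toy comparison of two table dumps. This is a sketch of the principle only and not how mk-table-sync is actually implemented. The name `diff_rows` is hypothetical, and each input file is assumed to hold one `<id> <name>` pair per line, with the master file non-empty.

```shell
#!/bin/sh
# diff_rows MASTER_DUMP SLAVE_DUMP
# Prints the operations that would make the slave dump match the master dump:
#   INSERT <id> <name>   row exists only on the master
#   UPDATE <id> <name>   row exists on both, but the name differs
#   DELETE <id>          row exists only on the slave
diff_rows() {
    awk '
        NR == FNR { master[$1] = $2; next }     # first file: master rows
        { slave[$1] = $2 }                      # second file: slave rows
        END {
            for (id in master) {
                if (!(id in slave))               printf "INSERT %s %s\n", id, master[id]
                else if (slave[id] != master[id]) printf "UPDATE %s %s\n", id, master[id]
            }
            for (id in slave)
                if (!(id in master)) printf "DELETE %s\n", id
        }' "$1" "$2" | sort
}

# Example with the t2 data shown above saved to two files:
# diff_rows master_t2.txt slave_t2.txt
```

For the t2 tables above, the only difference is row 5, which matches the single INSERT printed by mk-table-sync.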

If --transaction is specified, LOCK TABLES is not used. Instead, lock and unlock are implemented by beginning and committing transactions.
The exception is if --lock is 3.
If --no-transaction is specified, then LOCK TABLES is used for any value of --lock. See --[no]transaction.
When enabled, either explicitly or implicitly, the transaction isolation level is set REPEATABLE READ and transactions are started WITH CONSISTENT SNAPSHOT.

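MySQL Replication Monitoring

Common MySQL Error Codes

1005: failed to create table
1006: failed to create database
1007: database already exists, failed to create database
1008: database does not exist, failed to drop database
1009: failed to drop database because the database files could not be deleted
1010: failed to drop database because the data directory could not be deleted
1011: failed to delete database file
1012: cannot read a record in the system table
1020: record has been modified by another user
1021: not enough free disk space, increase the available disk space
1022: duplicate key, failed to change the record
1023: error on close
1024: error reading file
1025: error on rename
1026: error writing file
1032: record does not exist
1036: table is read-only and cannot be modified
1037: out of system memory, restart the database or the server
1038: out of sort memory, increase the sort buffer
1040: maximum number of database connections reached, increase the available connections
1041: out of system memory
1042: invalid host name
1043: invalid connection
1044: current user has no permission to access the database
1045: cannot connect to the database, wrong user name or password
1048: column cannot be null
1049: database does not exist
1050: table already exists
1051: table does not exist
1054: column does not exist
1065: invalid SQL statement, the statement is empty
1081: cannot create socket connection
1114: table is full and cannot hold any more records
1116: too many tables opened
1129: database in an abnormal state, restart the database
1130: failed to connect to the database, no permission to connect
1133: database user does not exist
1141: current user has no permission to access the database
1142: current user has no permission to access the table
1143: current user has no permission to access the column
1146: table does not exist
1147: no table access permission defined for the user
1149: SQL syntax error
1158: network error, read error, check the network connection
1159: network error, read timeout, check the network connection
1160: network error, write error, check the network connection
1161: network error, write timeout, check the network connection
1062: duplicate column value, insert failed
1169: duplicate column value, update failed
1177: failed to open table
1180: failed to commit transaction
1181: failed to roll back transaction
1203: the current user's connections have reached the maximum allowed, increase the available connections or restart the database
1205: lock wait timeout
1211: current user has no permission to create users
1216: foreign key constraint check failed, failed to update child table record
1217: foreign key constraint check failed, failed to delete or modify parent table record
1226: current user has exceeded the allowed resources, restart the database or the server
1227: insufficient privileges, you are not allowed to perform this operation
1235: MySQL version too old, this feature is not available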

Replication Monitoring Script

Modified from a reference script.

Original Script

#!/bin/bash
#
#check_mysql_slave_replication_status
#
#
#
parasum=2
help_msg(){
# [...]
cat  $SlaveStatusFile
echo "" >> $SlaveStatusFile

#get slave status
${MYSQL_CMD} -e "show slave status\G" >> $SlaveStatusFile #get the status of the slave threads

#get io_thread_status,sql_thread_status,last_errno

IOStatus=$(cat $SlaveStatusFile|grep Slave_IO_Running|awk '{print $2}')
SQLStatus=$(cat $SlaveStatusFile|grep Slave_SQL_Running |awk '{print $2}')
  Errno=$(cat $SlaveStatusFile|grep Last_Errno | awk '{print $2}')
  Behind=$(cat $SlaveStatusFile|grep Seconds_Behind_Master | awk '{print $2}')

echo "" >> $SlaveStatusFile

if [ "$IOStatus" == "No" ] || [ "$SQLStatus" == "No" ];then  #determine the type of error
    if [ "$Errno" -eq 0 ];then  #the slave threads may simply not be started
      $MYSQL_CMD -e "start slave io_thread;start slave sql_thread;"
      echo "Cause slave threads doesnot's running,trying start slave io_thread;start slave sql_thread;" >> $SlaveStatusFile
      MailTitle="[Warning] Slave threads stoped on $HOST_IP $HOST_PORT"
    elif [ "$Errno" -eq 1007 ] || [ "$Errno" -eq 1053 ] || [ "$Errno" -eq 1062 ] || [ "$Errno" -eq 1213 ] || [ "$Errno" -eq 1032 ] \
      || [ "$Errno" -eq 1158 ] || [ "$Errno" -eq 1159 ] || [ "$Errno" -eq 1008 ];then #ignore these errors
      $MYSQL_CMD -e "stop slave;set global sql_slave_skip_counter=1;start slave;"
      echo "Cause slave replication catch errors,trying skip counter and restart slave;stop slave ;set global sql_slave_skip_counter=1;slave start;" >> $SlaveStatusFile
      MailTitle="[Warning] Slave error on $HOST_IP $HOST_PORT! ErrNum: $Errno"
    else
      echo "Slave $HOST_IP $HOST_PORT is down!" >> $SlaveStatusFile
      MailTitle="[ERROR]Slave replication is down on $HOST_IP $HOST_PORT ! ErrNum:$Errno"
    fi
fi
if [ -n "$Behind" ];then
    Behind=0
fi
echo "$Behind" >> $SlaveStatusFile

#delay behind master: check the replication delay
if [ $Behind -gt 300 ];then
  echo `date +"%Y-%m-%d %H:%M:%S"` "slave is behind master $Behind seconds!" >> $SlaveStatusFile
  MailTitle="[Warning]Slave delay $Behind seconds,from $HOST_IP $HOST_PORT"
fi

if [ -n "$MailTitle" ];then #send mail on error or when the delay exceeds 300s
    cat ${SlaveStatusFile} | /bin/mail -s "$MailTitle" $Mail_Address_MysqlStatus
fi

#del tmpfile:SlaveStatusFile
> $SlaveStatusFile

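Modified Script

Only simple cleanup was done, plus a fix to the check for Behind being NULL. None of it has been tested.

Further improvements worth considering:

Check the result of each repair, and handle multiple errors (repair, check, repair again?).

Remove the SlaveStatusFile temporary file.

Send separate mails for the Errno and Behind alerts, and include the raw show slave status output in the alert body.

Set PATH so the script can be added to crontab.

Plan for periodic runs from crontab (locking to avoid overlapping runs, choice of interval).

Add an execution log?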

#!/bin/sh
# check_mysql_slave_replication_status
# Reference: http://www.tianfeiyu.com/?p=2062

Usage(){
  echo Usage:
  echo "$0 HOST PORT USER PASS"
}

[ -z "$1" -o -z "$2" -o -z "$3" -o -z "$4" ] && Usage && exit 1
HOST=$1
PORT=$2
USER=$3
PASS=$4

MYSQL_CMD="mysql -h$HOST -P$PORT -u$USER -p$PASS"

MailTitle=""        # mail subject
Mail_Address_MysqlStatus="root@localhost.localdomain"  # recipient address

time1=$(date +"%Y%m%d%H%M%S")
time2=$(date +"%Y-%m-%d %H:%M:%S")

SlaveStatusFile=/tmp/slave_status_${HOST}_${PORT}.${time1}  # file holding the mail body
echo "--------------------Begin at: "$time2 > $SlaveStatusFile
echo "" >> $SlaveStatusFile

# get slave status
${MYSQL_CMD} -e "show slave status\G" >> $SlaveStatusFile

# get io_thread_status, sql_thread_status, last_errno and the delay
 IOStatus=$(cat $SlaveStatusFile|grep Slave_IO_Running|awk '{print $2}')
SQLStatus=$(cat $SlaveStatusFile|grep Slave_SQL_Running |awk '{print $2}')
    Errno=$(cat $SlaveStatusFile|grep Last_Errno | awk '{print $2}')
   Behind=$(cat $SlaveStatusFile|grep Seconds_Behind_Master | awk '{print $2}')

echo "" >> $SlaveStatusFile

if [ "$IOStatus" = "No" -o "$SQLStatus" = "No" ];then
  case "$Errno" in
  0)
    # the slave threads may simply not be started
    $MYSQL_CMD -e "start slave io_thread;start slave sql_thread;"
    echo "Cause slave threads doesnot's running,trying start slave io_thread;start slave sql_thread;" >> $SlaveStatusFile
    ;;
  1007|1053|1062|1213|1032|1158|1159|1008)
    # ignore these errors
    $MYSQL_CMD -e "stop slave;set global sql_slave_skip_counter=1;start slave;"
    echo "Cause slave replication catch errors,trying skip counter and restart slave;stop slave ;set global sql_slave_skip_counter=1;slave start;" >> $SlaveStatusFile
    MailTitle="[Warning] Slave error on $HOST:$PORT! ErrNum: $Errno"
    ;;
  *)
    echo "Slave $HOST:$PORT is down!" >> $SlaveStatusFile
    MailTitle="[ERROR]Slave replication is down on $HOST:$PORT! Errno:$Errno"
    ;;
  esac
fi

# Seconds_Behind_Master is NULL when the SQL thread is not running
if [ "$Behind" = "NULL" -o -z "$Behind" ];then
  Behind=0
fi
echo "Behind:$Behind" >> $SlaveStatusFile

# delay behind master: check the replication delay
if [ $Behind -gt 300 ];then
  echo `date +"%Y-%m-%d %H:%M:%S"` "slave is behind master $Behind seconds!" >> $SlaveStatusFile
  MailTitle="[Warning]Slave delay $Behind seconds,from $HOST $PORT"
fi

if [ -n "$MailTitle" ];then # send mail on error or when the delay exceeds 300s
  cat ${SlaveStatusFile} | /bin/mail -s "$MailTitle" $Mail_Address_MysqlStatus
fi

# delete the temporary file
rm -f "$SlaveStatusFile"
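Two of the improvements listed earlier, dropping the temporary file and locking against overlapping crontab runs, can be sketched as small helpers. Like the script above, this is untested against a live server. The names `status_field`, `acquire_lock`, and `release_lock`, and the lock path in the usage comment, are hypothetical.

```shell
#!/bin/sh
# status_field STATUS_TEXT FIELD_NAME
# Prints the value of one field from "show slave status\G" text held in a
# variable, so no temporary file is needed. The match is exact, so
# Slave_SQL_Running does not also match Slave_SQL_Running_State.
status_field() {
    printf '%s\n' "$1" | awk -v name="$2:" '$1 == name { print $2; exit }'
}

# mkdir either creates the directory or fails, atomically, so only one run
# at a time can hold the lock.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1"
}

# Hypothetical use inside the monitoring script:
#   LOCK=/tmp/check_slave.lock
#   acquire_lock "$LOCK" || exit 0            # a previous run is still going
#   trap 'release_lock "$LOCK"' EXIT
#   status=$($MYSQL_CMD -e 'show slave status\G')
#   Behind=$(status_field "$status" Seconds_Behind_Master)
```

A lock directory left behind by a killed run has to be removed by hand, which is the usual trade-off of this simple scheme.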
