MySQL 實例無法啟動的問題分析及解決

前言

前幾天,有位朋友微信聯系我,告知一個生產數據庫,在機器宕機恢復後,實例啟動失敗,而且該實例沒有做任何的高可用、容災、備份等,對業務影響非常大,希望能夠協助排查一下,我也在第一時間就加入到排查中。

場景分析

(1)首先查看錯誤日志,報錯很清晰”Could not open log file”,無法打開日志文件

2021-01-06 13:23:51 20464 [ERROR] Failed to open log (file 'something is definitely wrong and this may fail.', errno 2)
2021-01-06 13:23:51 20464 [ERROR] Could not open log file
2021-01-06 13:23:51 20464 [ERROR] Can't init tc log
2021-01-06 13:23:51 20464 [ERROR] Aborting

(2)看到上述報錯後,當然就應該想到,檢查下my.cnf配置是否正確、日志目錄和權限是否正確,但排查並未發現問題

# less my.cnf
datadir=/var/lib/mysql
log-bin=mysql-bin
relay-log=relay-bin

# ls -lrt
-rw-rw---- 1 mysql mysql 1073761373 Jan 4 06:18 mysql-bin.007351
-rw-rw---- 1 mysql mysql 1073755587 Jan 4 09:26 mysql-bin.007352
-rw-rw---- 1 mysql mysql 1073777045 Jan 4 12:07 mysql-bin.007353
-rw-rw---- 1 mysql mysql 1073742801 Jan 4 15:12 mysql-bin.007354
-rw-rw---- 1 mysql mysql 1074087344 Jan 4 18:13 mysql-bin.007355
-rw-rw---- 1 mysql mysql 1073869414 Jan 4 21:32 mysql-bin.007356
-rw-rw---- 1 mysql mysql 1073771900 Jan 5 00:16 mysql-bin.007357
-rw-rw---- 1 mysql mysql 213063247 Jan 5 01:00 mysql-bin.007358
-rw-rw---- 1 mysql mysql 1073753668 Jan 5 02:11 mysql-bin.007359
-rw-rw---- 1 mysql mysql 671219722 Jan 5 03:31 mysql-bin.007360
-rw-rw---- 1 mysql mysql 1073774928 Jan 5 07:34 mysql-bin.007361
-rw-rw---- 1 mysql mysql 1073845285 Jan 5 11:33 mysql-bin.007362
-rw-rw---- 1 mysql mysql 1073756444 Jan 5 15:37 mysql-bin.007363
-rw-rw---- 1 mysql mysql 1073790555 Jan 5 19:37 mysql-bin.007364
-rw-rw---- 1 mysql mysql 1073768027 Jan 5 23:59 mysql-bin.007365
-rw-rw---- 1 mysql mysql 311398643 Jan 6 01:00 mysql-bin.007366
-rw-rw---- 1 mysql mysql 1071242043 Jan 6 03:31 mysql-bin.007367
-rw-rw---- 1 mysql mysql 1010516229 Jan 6 07:27 mysql-bin.007368
-rw-rw---- 1 mysql mysql 1651 Jan 6 07:27 mysql-bin.index
-rw-rw---- 1 mysql mysql 1073741824 Jan 6 12:08 ib_logfile1
-rw-r--r-- 1 mysql mysql 183 Jan 6 13:23 VM_58_10_centos-slow.log
-rw-rw---- 1 mysql mysql 1073741824 Jan 6 13:23 ib_logfile0
-rw-rw---- 1 mysql mysql 7492941 Jan 6 13:23 VM_58_10_centos.err

(3)報錯有一個非常奇怪的點,file ‘something is definitely wrong and this may fail.’ ,為什麼日志文件名會那麼奇怪呢?這裡需要知道的是,mysql-bin.index記錄瞭binlog相關信息,MySQL實例啟動時需要讀取該文件獲取信息;那接下來就檢查一下該文件,發現果然有問題,mysql-bin.index後半部分錯誤寫入瞭錯誤日志的內容,從而導致實例啟動時讀取到錯誤內容(當作binlog日志文件進行處理)報錯失敗

# cat mysql-bin.index 
./mysql-bin.007351
./mysql-bin.007352
./mysql-bin.007353
./mysql-bin.007354
./mysql-bin.007355
./mysql-bin.007356
./mysql-bin.007357
./mysql-bin.007358
./mysql-bin.007359
./mysql-bin.007360
./mysql-bin.007361
./mysql-bin.007362
./mysql-bin.007363
./mysql-bin.007364
./mysql-bin.007365
./mysql-bin.007366
./mysql-bin.007367
./mysql-bin.007368
23:27:31 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=16777216
read_buffer_size=3145728
max_used_connections=523
max_threads=800
thread_count=522
connection_count=522
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9037821 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

(4)原因定位後,解決辦法就是將mysql-bin.index文件備份後,手動進行修復,隨後啟動實例成功

# ./mysql start
Starting MySQL.... SUCCESS! 
Checking mysql connection: connection ok!

# ps -ef | grep mysqld
root 22955 1 0 13:30 pts/5 00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/lib/mysql/VM_58_10_centos.pid
mysql 23733 22955 24 13:30 pts/5 00:00:05 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/lib/mysql/VM_58_10_centos.err --open-files-limit=20000 --pid-file=/var/lib/mysql/VM_58_10_centos.pid --socket=/var/lib/mysql/mysql.sock --port=3306
root 32075 14929 0 13:30 pts/5 00:00:00 grep mysqld

總結

至此,問題得到解決,至於為什麼錯誤日志內容會寫入到mysql-bin.index,個人懷疑是宕機導致文件錯亂(該宿主機上的其他虛擬機有出現文件系統損壞的情況);最後,還是得強調一下,生產系統還是得重視起來,備份、高可用、容災都是不可或缺的。

以上就是MySQL 實例無法啟動的問題分析及解決的詳細內容,更多關於MySQL 實例無法啟動的資料請關註WalkonNet其它相關文章!

推薦閱讀: