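XLOG generation and cleanup logic in PostgreSQL
0 Preface
Parts 1 and 2 analyze how XLOG (WAL) files are generated and cleaned up. If you just need to deal with runaway XLOG growth, skip straight to Part 3.
1 WAL archiving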
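# Maximum number of log file segments between automatic WAL checkpoints
# (PostgreSQL 9.5 and later replace this with max_wal_size / min_wal_size)
checkpoint_segments =
# Maximum time between automatic WAL checkpoints
checkpoint_timeout =
# Spreads checkpoint writes out to relieve I/O pressure
checkpoint_completion_target =
# Minimum number of log file segments to retain, so that standbys have more segments available
wal_keep_segments =
# Completed WAL segments are sent to archive storage via archive_command
archive_mode =
# Force a switch to a new WAL segment file after this timeout
archive_timeout =
max_wal_size =
min_wal_size =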
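1.1 Without archiving
The number of segment files is governed by the parameters above and normally does not exceed
(2 + checkpoint_completion_target) * checkpoint_segments + 1
or
checkpoint_segments + wal_keep_segments + 1
files.
When an old segment file is no longer needed, it is renamed and then overwritten (recycled). If a short-term peak in log output has pushed the count above
3 * checkpoint_segments + 1
files, the unneeded files are deleted outright instead of being recycled.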
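To make the two limits concrete, here is a minimal sketch of the arithmetic. It assumes that "or" in the rule above means the larger of the two expressions applies. The function name and the sample values are made up for illustration and are not recommendations.

```python
def wal_file_bounds(checkpoint_segments, checkpoint_completion_target,
                    wal_keep_segments):
    """Return (normal_upper_bound, delete_threshold), both in segment files."""
    by_checkpoints = (2 + checkpoint_completion_target) * checkpoint_segments + 1
    by_keep = checkpoint_segments + wal_keep_segments + 1
    # Assumption: whichever expression is larger is the effective bound.
    normal_upper_bound = max(by_checkpoints, by_keep)
    # Above this count, unneeded segments are deleted rather than recycled.
    delete_threshold = 3 * checkpoint_segments + 1
    return normal_upper_bound, delete_threshold


# Illustrative values only.
normal, threshold = wal_file_bounds(16, 0.5, 0)
print("normal upper bound:", normal, "files")
print("delete threshold:", threshold, "files")
```

Multiplying either number by the 16 MB segment size gives a rough idea of how much disk space pg_xlog can be expected to use.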
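1.2 With archiving
File count: segment files are removed once they have been archived successfully.
Viewed abstractly, a running PG produces an indefinitely long sequence of WAL records. The sequence is split into 16 MB segment files whose numeric names reflect their position in the WAL sequence. When WAL archiving is not in use, the system normally creates just a few segment files and then recycles them by renaming segment files that are no longer needed to higher segment numbers.
The archive command must return zero if and only if it succeeded. After getting a zero result, PostgreSQL assumes the WAL segment file has been archived successfully and will remove or recycle it later. A nonzero status tells PostgreSQL that the file was not archived, and it retries periodically until the command succeeds.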
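The zero / nonzero contract can be sketched as a small helper that an archive script might wrap. This is only an illustration: the function name `archive_segment` and the "copy to a temporary name, then rename" approach are my own assumptions, not something PostgreSQL ships.

```python
import os
import shutil


def archive_segment(src_path, archive_dir):
    """Copy one WAL segment into archive_dir and return an exit status.

    0 means "archived", anything else means "not archived, retry later".
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Refuse to overwrite a file that is already in the archive.
        return 1
    tmp = dest + ".tmp"
    try:
        shutil.copyfile(src_path, tmp)
        # Publish the file under its final name only once the copy is complete.
        os.rename(tmp, dest)
    except OSError:
        return 1
    return 0
```

A wrapper script would pass this return value to `sys.exit()`, so that PostgreSQL only treats the segment as archived when the copy really finished.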
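2 PG source code analysis
2.1 Removal logic
What triggers removal
RemoveOldXlogFiles is called from CreateCheckPoint and from CreateRestartPoint.
The wal_keep_segments check (KeepLogSeg is called to adjust _logSegNo, which is then passed to RemoveOldXlogFiles)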
static void
KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
{
    XLogSegNo   segno;
    XLogRecPtr  keep;

    XLByteToSeg(recptr, segno);
    keep = XLogGetReplicationSlotMinimumLSN();

    /* compute limit for wal_keep_segments first */
    if (wal_keep_segments > 0)
    {
        /* avoid underflow, don't go below 1 */
        if (segno <= wal_keep_segments)
            segno = 1;
        else
            segno = segno - wal_keep_segments;
    }

    /* then check whether slots limit removal further */
    if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
    {
        XLogSegNo   slotSegNo;

        XLByteToSeg(keep, slotSegNo);

        if (slotSegNo <= 0)
            segno = 1;
        else if (slotSegNo < segno)
            segno = slotSegNo;
    }

    /* don't delete WAL segments newer than the calculated segment */
    if (segno < *logSegNo)
        *logSegNo = segno;
}
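To see how the two limits interact, here is a rough Python port of the arithmetic in KeepLogSeg. It is a sketch, not the server code: it assumes 16 MB segments, models "no slot" as `None` (or 0, matching InvalidXLogRecPtr), and leaves out the `max_replication_slots > 0` check.

```python
XLOG_SEG_SIZE = 16 * 1024 * 1024  # assumed segment size: 16 MB


def keep_log_seg(recptr, log_seg_no, wal_keep_segments, slot_min_lsn=None):
    """Return the segment number up to which old WAL may be removed."""
    segno = recptr // XLOG_SEG_SIZE  # same idea as XLByteToSeg

    # Limit imposed by wal_keep_segments; never go below segment 1.
    if wal_keep_segments > 0:
        if segno <= wal_keep_segments:
            segno = 1
        else:
            segno = segno - wal_keep_segments

    # A replication slot may hold the cutoff back even further.
    if slot_min_lsn:  # None or 0 means "no slot restricts removal"
        slot_seg = slot_min_lsn // XLOG_SEG_SIZE
        if slot_seg <= 0:
            segno = 1
        elif slot_seg < segno:
            segno = slot_seg

    # Never move the cutoff forward, only backward.
    return min(segno, log_seg_no)
```

For example, with the current position in segment 100, a checkpoint cutoff of 95, and `wal_keep_segments = 10`, the cutoff moves back to 90. A slot whose minimum LSN sits in segment 50 moves it back to 50.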
Removal logic
static void
RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr)
{
    ... ...
    while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL)
    {
        /* Ignore files that are not XLOG segments */
        if (strlen(xlde->d_name) != 24 ||
            strspn(xlde->d_name, "0123456789ABCDEF") != 24)
            continue;

        /*
         * We ignore the timeline part of the XLOG segment identifiers in
         * deciding whether a segment is still needed. This ensures that we
         * won't prematurely remove a segment from a parent timeline. We could
         * probably be a little more proactive about removing segments of
         * non-parent timelines, but that would be a whole lot more
         * complicated.
         *
         * We use the alphanumeric sorting property of the filenames to decide
         * which ones are earlier than the lastoff segment.
         */
        if (strcmp(xlde->d_name + 8, lastoff + 8) <= 0)
        {
            /*
             * Author's notes on XLogArchiveCheckDone:
             *   archiving is off              -> returns true
             *   a .done file exists           -> returns true
             *   a .ready file exists          -> returns false
             *   recheck: a .done file exists  -> returns true
             *   otherwise recreate the .ready file and return false
             */
            if (XLogArchiveCheckDone(xlde->d_name))
            {
                /* Update the last removed location in shared memory first */
                UpdateLastRemovedPtr(xlde->d_name);

                /* Recycle or delete outright; clean up the .done and .ready files */
                RemoveXlogFile(xlde->d_name, endptr);
            }
        }
    }
    ... ...
}
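The filename filter and the timeline-blind comparison in the loop above can be tried out in isolation. The sketch below is only an approximation of that part of RemoveOldXlogFiles. The helper names are mine, and it assumes the usual 24-character name made of three 8-digit upper-case hex fields (timeline, log, segment).

```python
HEX_DIGITS = set("0123456789ABCDEF")


def seg_name(timeline, log, seg):
    """Build a 24-character segment file name from its three fields."""
    return "%08X%08X%08X" % (timeline, log, seg)


def is_xlog_segment(name):
    """Mirror the strlen/strspn test: exactly 24 upper-case hex digits."""
    return len(name) == 24 and all(c in HEX_DIGITS for c in name)


def removal_candidates(filenames, lastoff):
    """Segments at or before lastoff, ignoring the 8-character timeline prefix."""
    return sorted(
        name for name in filenames
        if is_xlog_segment(name) and name[8:] <= lastoff[8:]
    )
```

Because only the last 16 characters are compared, a segment on timeline 2 is judged by the same cutoff as one on timeline 1. Whether a candidate is actually removed still depends on the XLogArchiveCheckDone test.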
2.2 Archiving logic
static void
pgarch_ArchiverCopyLoop(void)
{
    char        xlog[MAX_XFN_CHARS + 1];

    /* Fetch the name of the oldest xlog file that has not been archived yet */
    while (pgarch_readyXlog(xlog))
    {
        int         failures = 0;

        for (;;)
        {
            /*
             * Do not initiate any more archive commands after receiving
             * SIGTERM, nor after the postmaster has died unexpectedly. The
             * first condition is to try to keep from having init SIGKILL the
             * command, and the second is to avoid conflicts with another
             * archiver spawned by a newer postmaster.
             */
            if (got_SIGTERM || !PostmasterIsAlive())
                return;

            /*
             * Check for config update. This is so that we'll adopt a new
             * setting for archive_command as soon as possible, even if there
             * is a backlog of files to be archived.
             */
            if (got_SIGHUP)
            {
                got_SIGHUP = false;
                ProcessConfigFile(PGC_SIGHUP);
            }

            /* If archive_command is not set, do nothing further. */
            /* Our archive_command is not set, so this is the branch we take. */
            if (!XLogArchiveCommandSet())
            {
                /*
                 * Change WARNING to DEBUG1, since we will left archive_command empty to
                 * let external tools to manage archive
                 */
                ereport(DEBUG1,
                        (errmsg("archive_mode enabled, yet archive_command is not set")));
                return;
            }

            /* Run the archive command! */
            if (pgarch_archiveXlog(xlog))
            {
                /* Success: rename .ready to .done */
                pgarch_archiveDone(xlog);

                /*
                 * Tell the collector about the WAL file that we successfully
                 * archived
                 */
                pgstat_send_archiver(xlog, false);

                break;          /* out of inner retry loop */
            }
            else
            {
                /*
                 * Tell the collector about the WAL file that we failed to
                 * archive
                 */
                pgstat_send_archiver(xlog, true);

                if (++failures >= NUM_ARCHIVE_RETRIES)
                {
                    ereport(WARNING,
                            (errmsg("archiving transaction log file \"%s\" failed too many times, will try again later",
                                    xlog)));
                    return;     /* give up archiving for now */
                }
                pg_usleep(1000000L);    /* wait a bit before retrying */
            }
        }
    }
}
2.3 How .ready files are generated
static void
XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
{
    ...
    if (finishing_seg)
    {
        issue_xlog_fsync(openLogFile, openLogSegNo);

        /* signal that we need to wakeup walsenders later */
        WalSndWakeupRequest();

        LogwrtResult.Flush = LogwrtResult.Write;    /* end of page */

        /* Archiving is on && wal_level >= archive */
        if (XLogArchivingActive())
            /* Create the .ready file */
            XLogArchiveNotifySeg(openLogSegNo);

        XLogCtl->lastSegSwitchTime = (pg_time_t) time(NULL);
        ...
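2.4 Summary
As long as archive_mode=on and wal_level>=archive, a .ready file is always generated (XLogWrite makes the call that creates it).
Because archive_command is set to empty in our setup, consumption of the .ready files is entirely up to an external program.
.done files are handled by PG itself. Two events trigger that handling: checkpoints and restartpoints.
How many .done files get processed is governed by wal_keep_segments and replication slots (the KeepLogSeg function).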
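Since an external program has to consume the .ready files in this setup, here is a rough sketch of what such a consumer could look like. It is not the author's tool. `consume_ready_files` and the `archive_one` callback are hypothetical names, and the sketch assumes that renaming `.ready` to `.done` (what pgarch_archiveDone does in the code above) is how the external program reports success.

```python
import os


def consume_ready_files(archive_status_dir, archive_one):
    """Archive every segment that has a .ready marker, oldest name first.

    archive_one(segment_name) is a caller-supplied function that returns
    True when the segment was archived. Returns the list of archived names.
    """
    suffix = ".ready"
    markers = sorted(
        f for f in os.listdir(archive_status_dir) if f.endswith(suffix)
    )
    archived = []
    for marker in markers:
        segment = marker[:-len(suffix)]
        if not archive_one(segment):
            break  # stop at the first failure; the next run retries it
        # Mark the segment as archived so a later checkpoint can clean it up.
        os.rename(
            os.path.join(archive_status_dir, marker),
            os.path.join(archive_status_dir, segment + ".done"),
        )
        archived.append(segment)
    return archived
```

If this consumer stops running, .ready markers pile up and the matching segments are never cleaned up, which is the failure mode discussed in Part 3.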
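3 Why WAL segments accumulate (the TL;DR)
Note: whatever happens, do not delete xlog files by hand.
Note: the log written by a checkpoint does not get a .ready file immediately. It is created later, together with the one for the next xlog.
3.1 ReplicationSlot
A streaming replication slot has been enabled.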
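-- Streaming replication slots
-- If restart_lsn is a very large number of bytes behind the current XLOG position,
-- check whether the slot's subscriber is able to receive XLOG normally,
-- or whether the subscriber itself is healthy.
-- If a slot's data is not consumed for a long time, the pg_xlog directory may fill up.
select pg_xlog_location_diff(pg_current_xlog_location(),restart_lsn), * from pg_replication_slots;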
Dropping the slot
select pg_drop_replication_slot('xxx');
After the slot is dropped, PG cleans up the xlog at the next checkpoint.
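The byte distance that pg_xlog_location_diff reports for a slot can also be worked out by hand from two LSN strings, which helps when all you have is a log line or a screenshot. A minimal sketch, assuming the usual `hi/lo` hexadecimal text form of an LSN and 16 MB segments. The function names are mine.

```python
def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' (two hex numbers) to a byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff(current, restart):
    """Bytes of WAL between two LSNs."""
    return parse_lsn(current) - parse_lsn(restart)


def segments_behind(current, restart, seg_size=16 * 1024 * 1024):
    """Roughly how many whole segments the slot is holding back."""
    return lsn_diff(current, restart) // seg_size
```

For instance, a slot with restart_lsn `0/1000000` while the server is at `0/3000000` is 33554432 bytes behind, which is two 16 MB segments.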
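3.2 A large wal_keep_segments
Check the parameter setting. Note that turning this parameter on makes xlog and .ready files lag by some amount.
3.3 Problems with recycling
If PG's automatic recycling mechanism is not used and the database depends on an external program to modify the .ready files, you need to monitor that recycling process.
(archive_mode=on archive_command='')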
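One simple way to watch the external recycling process is to count the status files it is supposed to be working through. The sketch below assumes the status files live in an `archive_status` directory under pg_xlog. The function names and the `max_ready` threshold are made up for illustration.

```python
import os


def ready_backlog(archive_status_dir):
    """Return (ready_count, done_count) for an archive_status directory."""
    names = os.listdir(archive_status_dir)
    ready = sum(1 for n in names if n.endswith(".ready"))
    done = sum(1 for n in names if n.endswith(".done"))
    return ready, done


def backlog_alarm(archive_status_dir, max_ready):
    """True when more segments are waiting than the chosen threshold allows."""
    ready, _ = ready_backlog(archive_status_dir)
    return ready > max_ready
```

A steadily growing `.ready` count suggests the external program has stalled. A growing `.done` count points at checkpoints, wal_keep_segments, or a replication slot instead.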
3.4 Checkpoint interval too long
Check the parameter settings.