- 論壇徽章:
- 0
|
PostgreSQL的PITR技術(shù)(Point-In-Time-Recovery)
--Seamus Dean 2005-04-11(at PostgreSQL-8.0.2 release)
為什么要寫這篇文章?
因?yàn)槲铱戳艘幌,國?nèi)所有的PostgreSQL教程都沒有很詳細(xì)的介紹該功能,而相反,國內(nèi)的Oracle
文章對這塊非常的看重。雖然,PostgreSQL的官方文檔有一個章節(jié)是介紹這塊內(nèi)容的,
但是寫得太過‘文學(xué)’化。
的確,一個數(shù)據(jù)庫的可靠性和完整性是非常重要的,否則,很難叫人們所接受它。
本文假設(shè)讀者對PostgreSQL已經(jīng)有基本的認(rèn)識,如果你對PostgreSQL還完全不熟悉的話,建議你先
去http://www.postgresql.org看看它的Documentation.
作為最強(qiáng)大的開源數(shù)據(jù)庫,PostgreSQL擁有一切商業(yè)數(shù)據(jù)庫所擁有的功能,甚至比商業(yè)數(shù)據(jù)庫更好。
在以前的版本中,它在數(shù)據(jù)恢復(fù),可靠性方面做的不太好,但經(jīng)過最近幾年的發(fā)展,已經(jīng)可以和Oracle
媲美了。
在PostgreSQL7的時候就引入了WAL(Write Ahead Logging)的概念,即預(yù)寫日志,所有對數(shù)據(jù)庫的更改,
在更改之前必須寫到該LOG中,這樣,就算機(jī)器斷電,PostgreSQL也可以從該LOG中知道數(shù)據(jù)庫在斷電前做
了什么操作,已經(jīng)做到第幾步了,這樣保證了所有事務(wù)的完整性,但PostgreSQL7沒有提供很好的災(zāi)難恢復(fù)
機(jī)制,一旦數(shù)據(jù)庫崩潰,除非你曾經(jīng)對數(shù)據(jù)庫作過pg_dump或者file system level backup,否則,你的數(shù)據(jù)
將全部丟失,并且,就算你曾經(jīng)對數(shù)據(jù)庫做過備份,也只能恢復(fù)到你備份的那一刻的數(shù)據(jù),這對一個生產(chǎn)數(shù)據(jù)庫
(特別是24*7生產(chǎn)庫)來說,是無法容忍的。
PostgreSQL8的推出,使PostgreSQL的穩(wěn)定性和可靠性又邁出了劃時代的一步。
除了提供對tablespace的支持外,PostgreSQL8提供了支持時間點(diǎn)的恢復(fù)---PITR.
其基本原理和Oracle的熱備份完全一樣:
首先,對數(shù)據(jù)庫在file system level做一個backup(PostgreSQL是首先用pg_start_backup('label')命令,
然后用tar直接tar整個data目錄,假設(shè)命名為base.tar,然后pg_stop_backup();結(jié)束熱備。
Oracle首先是用alter tablespace xxx begin backup,然后直接cp數(shù)據(jù)文件);
然后,備份相關(guān)的配置文件(PostgreSQL只需備份postgresql.conf,pg_hba.conf,pg_ident.conf就可以了,其實(shí),
前面的tar已經(jīng)將這些文件備份了,Oracle需要alter database backup control file......);
最后,備份WAL(
可以設(shè)置postgresql.conf中的archive_command,
該命令可以讓PostgreSQL8自動將需要的歸檔的日志文件備份的其他地方中。
但是注意:如果你是讓PostgreSQL8調(diào)用archive_command來備份WAL的話,
可能根本就做不到PITR,我做過實(shí)驗(yàn),如果依靠base.tar和archive_command產(chǎn)生的WAL其實(shí)只能恢復(fù)到最后一個
archive_command保存的WAL的數(shù)據(jù),pg_xlog/下面可能還有數(shù)據(jù),如果PostgreSQL8的數(shù)據(jù)目錄徹底損壞的話,還是會
丟失數(shù)據(jù),所以,我建議,在寫數(shù)據(jù)備份腳本的時候,最好將pg_xlog/下面的WAL也一起備份,見下面的cpArch.sh。
)。
如果數(shù)據(jù)庫崩潰,我們就可以使用熱備產(chǎn)生的base.tar和archive_command產(chǎn)生的WAL和我們自己備份的WAL(pg_xlog)來進(jìn)行數(shù)據(jù)庫的
recovery.
下面舉例來說明:
我的PostgreSQL運(yùn)行在:/home/pgsql/下面
數(shù)據(jù)目錄在:/home/pgsql/database/
將熱備數(shù)據(jù)文件備份到/disk3/PostgreSQL/base/下面
將WAL備份到/disk3/PostgreSQL/archives/下面
postgresql.conf中定義了如下的archive_command:
archive_command = 'cp -f %p /disk3/PostgreSQL/archives/%f'
該命令會將PostgreSQL產(chǎn)生的WAL cp到/disk3/PostgreSQL/archives/中。
我的熱備腳本如下:
(1)為了使丟失的數(shù)據(jù)在一分鐘之內(nèi),在crontab中每分鐘將pg_xlog/下面的WAL
backup到/disk3/PostgreSQL/archives/。
crontab:
*/1 * * * * /home/pgsql/bin/cpArch.sh
cpArch.sh:
#!/bin/sh
cp -f /home/pgsql/database/pg_xlog/[0-9]* /disk3/PostgreSQL/archives/
(2)編寫熱備腳本hotBackup.pl(我用perl):
#!/usr/bin/perl
#############################################################
# hotBackup.pl
# Use to hot backup the PostgreSQL database.
# Author:Seamus Dean
# Date:2005-04-11
##############################################################
my($datadir) ="/home/pgsql/database";
my($bindir) ="/home/pgsql/bin";
my($backupdir) ="/disk3/PostgreSQL/base";
my($receiver) ="ljh13\@sina.com.cn";
sub begin_backup()
{
open(PSQL,"$bindir/psql" or mail_user("begin backup error." && exit(100);
print PSQL "select pg_start_backup('backupnow');\n";
close(PSQL);
}
sub end_backup()
{
open(PSQL,"$bindir/psql" or mail_user("end backup error." && exit(100);
print PSQL "select pg_end_backup();\n";
close(PSQL);
}
sub do_backup()
{
system("/bin/tar cvf base.tar $datadir" ;
system("/bin/mv -f base.tar $backupdir/" ;
}
sub mail_user()
{
my($msg) =@_;
open(MAIL,"|/bin/mail -s backup-result $receiver" or die("can not talk to:mail command.\n" ;
print MAIL $msg;
close(MAIL);
}
###################################
# tell psql begin our backup
###################################
&begin_backup();
###################################
# do tar
###################################
&do_backup();
####################################
# tell psql end backup
####################################
&end_backup();
####################################
# mail the user about the result
####################################
&mail_user(" ostgreSQL backup successfully." ;
到這里,備份腳本基本上就完了,你可以將hotBackup.pl放在crontab中周期性的執(zhí)行。
就算/home/pgsql/database目錄徹底崩潰,我們可以像下面這樣迅速恢復(fù)到1分鐘內(nèi)的數(shù)據(jù):
#cp /disk3/PostgreSQL/base/base.tar ./
#tar xvf base.tar
#cd database/
#vi recovery.conf
輸入如下內(nèi)容:
restore_command='cp /disk3/PostgreSQL/archives/%f "%p"'
然后將/home/pgsql/database/pg_xlog/下面的WAL清空。
啟動PostgreSQL,我們可以看到如下的LOG信息:
LOG: could not create IPv6 socket: Address family not supported by protocol
LOG: database system was interrupted at 2005-04-11 23:13:28 PDT
LOG: starting archive recovery
LOG: restore_command = "cp /disk3/PostgreSQL/archives/%f "%p""
cp: cannot stat `/disk3/PostgreSQL/archives/00000001.history': No such file or directory
LOG: restored log file "00000001000000000000002E.008EFCAC.backup" from archive
LOG: restored log file "00000001000000000000002E" from archive
LOG: checkpoint record is at 0/2E8EFCAC
LOG: redo record is at 0/2E8EFCAC; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 5271; next OID: 6351357
LOG: automatic recovery in progress
LOG: redo starts at 0/2E8EFCE8
LOG: restored log file "00000001000000000000002F" from archive
LOG: restored log file "000000010000000000000030" from archive
LOG: restored log file "000000010000000000000031" from archive
LOG: restored log file "000000010000000000000032" from archive
LOG: restored log file "000000010000000000000033" from archive
LOG: restored log file "000000010000000000000034" from archive
LOG: restored log file "000000010000000000000035" from archive
LOG: restored log file "000000010000000000000036" from archive
LOG: restored log file "000000010000000000000037" from archive
LOG: restored log file "000000010000000000000038" from archive
LOG: restored log file "000000010000000000000039" from archive
LOG: restored log file "00000001000000000000003A" from archive
LOG: restored log file "00000001000000000000003B" from archive
LOG: restored log file "00000001000000000000003C" from archive
LOG: restored log file "00000001000000000000003D" from archive
LOG: restored log file "00000001000000000000003E" from archive
LOG: restored log file "00000001000000000000003F" from archive
LOG: restored log file "000000010000000000000040" from archive
LOG: restored log file "000000010000000000000041" from archive
LOG: restored log file "000000010000000000000042" from archive
LOG: restored log file "000000010000000000000043" from archive
LOG: restored log file "000000010000000000000044" from archive
LOG: restored log file "000000010000000000000045" from archive
LOG: restored log file "000000010000000000000046" from archive
LOG: restored log file "000000010000000000000047" from archive
LOG: restored log file "000000010000000000000048" from archive
LOG: restored log file "000000010000000000000049" from archive
LOG: restored log file "00000001000000000000004A" from archive
LOG: restored log file "00000001000000000000004B" from archive
LOG: restored log file "00000001000000000000004C" from archive
LOG: record with zero length at 0/4C2BABE4
LOG: redo done at 0/4C2BABA8
LOG: restored log file "00000001000000000000004C" from archive
LOG: archive recovery complete
LOG: database system is ready
顯示數(shù)據(jù)已經(jīng)成功恢復(fù)。
/home/pgsql/database/下面的recovery.conf會變?yōu)椋簉ecovery.done.
結(jié)論:
PostgreSQL8的PITR已經(jīng)做得非常的成功,完全有可能替代Oracle,Sqlserver
而成為企業(yè)的首選。所以,我們玩PostgreSQL的兄弟們,一定要對它有信心! |
|