當(dāng)前位置:首頁(yè) >  站長(zhǎng) >  數(shù)據(jù)庫(kù) >  正文

Postgresql 如何選擇正確的關(guān)閉模式

 2021-06-03 17:45  來(lái)源: 腳本之家   我來(lái)投稿 撤稿糾錯(cuò)

  域名預(yù)訂/競(jìng)價(jià),好“米”不錯(cuò)過(guò)

停止數(shù)據(jù)庫(kù)的命令:

1pg_ctl stop -D $PGDATA [-m shutdown-mode]

shutdown-mode有如下幾種模式:

1. smart: 等所有的連接中止后,關(guān)閉數(shù)據(jù)庫(kù)。如果客戶(hù)端連接不終止, 則無(wú)法關(guān)閉數(shù)據(jù)庫(kù)。

開(kāi)啟一個(gè)空會(huì)話:

[root@localhost ~]# su - postgres
[postgres@localhost ~]$ psql
psql (9.4.4)
Type "help" for help.
 
postgres=#

 

用smart關(guān)閉數(shù)據(jù)庫(kù):

[postgres@localhost ~]$ pg_ctl stop -D $PGDATA -m smart
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions rather than
waiting for session-initiated disconnection

2. fast: 快速關(guān)閉數(shù)據(jù)庫(kù), 斷開(kāi)客戶(hù)端的連接,讓已有的事務(wù)回滾,然后正常關(guān)閉數(shù)據(jù)庫(kù)。

[postgres@localhost ~]$ pg_ctl stop -D $PGDATA -m fast
waiting for server to shut down.... done
server stopped

 

查看關(guān)閉日志:

LOG: received fast shutdown request
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
LOG: shutting down
LOG: database system is shut down

 

會(huì)話被強(qiáng)制中斷,然后關(guān)閉數(shù)據(jù)庫(kù)。

起一個(gè)事務(wù),然后測(cè)試關(guān)閉:

postgres=# create table t(id int primary key, name varchar(9));
CREATE TABLE
postgres=# begin;
BEGIN
postgres=# insert into t values(1,'a')
postgres-# ;
INSERT 0 1

 

不提交, 然后用FAST MODE去關(guān)閉數(shù)據(jù)庫(kù):

[postgres@localhost ~]$ pg_ctl stop -D $PGDATA -m fast
waiting for server to shut down.... done
server stopped

 

查看日志:

LOG: received fast shutdown request
LOG: aborting any active transactions
LOG: autovacuum launcher shutting down
FATAL: terminating connection due to administrator command
LOG: shutting down
LOG: database system is shut down

 

同樣是直接中斷會(huì)話, 而不去管事務(wù)有沒(méi)有提交。

postgres=# select * from t;
id | name
----+------
(0 rows)

 

沒(méi)有提交的數(shù)據(jù), 在重啟之后并不能查到。

3. immediate: 立即關(guān)閉數(shù)據(jù)庫(kù),立即停止數(shù)據(jù)庫(kù)進(jìn)程,直接退出,下次啟動(dòng)時(shí)會(huì)進(jìn)行實(shí)例恢復(fù)。

postgres=# insert into t values(1,'a')
;
INSERT 0 1
postgres=# select * from t;
id | name
----+------
1 | a
(1 row)

 

關(guān)閉數(shù)據(jù)庫(kù):

[postgres@localhost ~]$ pg_ctl stop -D $PGDATA -m immediate
waiting for server to shut down.... done
server stopped

 

查看日志:

LOG: received immediate shutdown request
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.

 

啟動(dòng)數(shù)據(jù)庫(kù):

[postgres@localhost ~]$ pg_ctl -D /apps/pgsql/pgdata -l 1.log start
server starting

 

查看日志:

LOG: database system was interrupted; last known up at 2017-04-27 18:56:47 PDT
LOG: database system was not properly shut down; automatic recovery in progress #提示非正常關(guān)機(jī),自動(dòng)開(kāi)啟恢復(fù)。
LOG: redo starts at 0/181F910
LOG: record with zero length at 0/181FA90
LOG: redo done at 0/181FA60
LOG: last completed transaction was at log time 2017-04-27 18:59:13.727213-07
LOG: MultiXact member wraparound protections are now enabled
LOG: autovacuum launcher started
LOG: database system is ready to accept connections

 

查看數(shù)據(jù):

[postgres@localhost ~]$ psql
psql (9.4.4)
Type "help" for help.
 
postgres=# select * from t;
id | name
----+------
1 | a
(1 row)

 

提交的數(shù)據(jù)已通過(guò)實(shí)例恢復(fù)。

小結(jié):

對(duì)比以上三種關(guān)庫(kù)模式:

smart最為安全,但最慢, 需要將所有連接都斷開(kāi)后,才會(huì)關(guān)庫(kù),默認(rèn)關(guān)庫(kù)模式。

fast強(qiáng)制中斷會(huì)話,而不管有操作有沒(méi)有提交,在做系統(tǒng)維護(hù)(系統(tǒng)維護(hù)時(shí)一般應(yīng)用都正常關(guān)閉了,或者不再會(huì)有事務(wù)操作。)時(shí),需要這種模式來(lái)關(guān)閉數(shù)據(jù)庫(kù)。

immediate最暴力的方式,不管數(shù)據(jù)有沒(méi)有落盤(pán)(POSGRE是遵循WAL機(jī)制),就直接關(guān)掉, 待啟動(dòng)時(shí)進(jìn)行實(shí)例恢復(fù), 如果在關(guān)庫(kù)前有大量的事務(wù)沒(méi)有寫(xiě)入磁盤(pán), 那這個(gè)恢復(fù)過(guò)程可能會(huì)非常的漫長(zhǎng)。

補(bǔ)充:postgresql 異步 stream replication 環(huán)境關(guān)閉 master 的驗(yàn)證

os: ubuntu 16.04

db: postgresql 9.6.8

驗(yàn)證在異步 stream replication環(huán)境下,主動(dòng)關(guān)閉master時(shí),數(shù)據(jù)是否有丟失,能丟失多少。

版本

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial
# su - postgres -c "psql -c \"select version();\""
                 version                
----------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.8 on x86_64-pc-linux-gnu (Ubuntu 9.6.8-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609, 64-bit
(1 row)

 

用 pgbench 模擬數(shù)據(jù)庫(kù)的大量數(shù)據(jù)操作

postgres=# create database pgbenchdb;
# su - postgres
$ pgbench -i -s 20 pgbenchdb
$ pgbench -r -j2 -c4 -T300 pgbenchdb

 

關(guān)閉 master

# su - postgres
$ /usr/lib/postgresql/9.6/bin/pg_ctl stop -m fast -D "/data/pg9.6/main"

 

提升 slave

# su - postgres
$ /usr/lib/postgresql/9.6/bin/pg_ctl promote -D "/data/pg9.6/main"

 

查看 old master 的 xlog location

$ /usr/lib/postgresql/9.6/bin/pg_xlogdump 000000010000000000000016
rmgr: XLOG  len (rec/tot): 106/ 106, tx:   0, lsn: 0/16000028, prev 0/152C9A10, desc: CHECKPOINT_SHUTDOWN redo 0/16000028; tli 1; prev tli 1; fpw true; xid 0:118746; oid 16432; multi 1; offset 0; oldest xid 543 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 550/118745; oldest running xid 0; shutdown
pg_xlogdump: FATAL: error in WAL record at 0/16000028: invalid record length at 0/16000098: wanted 24, got 0

 

可以看到 lsn: 0/16000028, prev 0/152C9A10, desc: CHECKPOINT_SHUTDOWN redo 0/16000028;

查看 new master 的 .history文件

$ ls -lt|more
total 360456
-rw------- 1 postgres postgres 16777216 Nov 30 10:32 000000020000000000000016
drwx------ 2 postgres postgres  4096 Nov 30 10:16 archive_status
-rw------- 1 postgres postgres  42 Nov 30 10:16 00000002.history
-rw------- 1 postgres postgres 16777216 Nov 30 10:16 000000010000000000000016.partial
-rw------- 1 postgres postgres 16777216 Nov 30 10:16 000000010000000000000015
-rw------- 1 postgres postgres 16777216 Nov 30 10:16 000000010000000000000014
-rw------- 1 postgres postgres 16777216 Nov 30 10:05 000000010000000000000013
$ cat 00000002.history
1 0/16000098 no recovery target specified
$ /usr/lib/postgresql/9.6/bin/pg_xlogdump 000000010000000000000016
rmgr: XLOG  len (rec/tot): 106/ 106, tx:   0, lsn: 0/16000028, prev 0/152C9A10, desc: CHECKPOINT_SHUTDOWN redo 0/16000028; tli 1; prev tli 1; fpw true; xid 0:118746; oid 16432; multi 1; offset 0; oldest xid 543 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 550/118745; oldest running xid 0; shutdown
rmgr: XLOG  len (rec/tot):  42/ 42, tx:   0, lsn: 0/16000098, prev 0/16000028, desc: END_OF_RECOVERY tli 2; prev tli 1; time 2018-11-30 10:16:57.249408 CST
rmgr: Standby  len (rec/tot):  50/ 50, tx:   0, lsn: 0/160000C8, prev 0/16000098, desc: RUNNING_XACTS nextXid 118746 latestCompletedXid 118745 oldestRunningXid 118746
rmgr: XLOG  len (rec/tot):  51/ 312, tx:   0, lsn: 0/16000100, prev 0/160000C8, desc: FPI_FOR_HINT , blkref #0: rel 1664/0/1260 blk 0 FPW
rmgr: Standby  len (rec/tot):  50/ 50, tx:   0, lsn: 0/16000238, prev 0/16000100, desc: RUNNING_XACTS nextXid 118746 latestCompletedXid 118745 oldestRunningXid 118746
pg_xlogdump: FATAL: error in WAL record at 0/16000238: invalid record length at 0/16000270: wanted 24, got 0

 

可以看到關(guān)鍵記錄

lsn: 0/16000028, prev 0/152C9A10, desc: CHECKPOINT_SHUTDOWN redo 0/16000028
lsn: 0/16000098, prev 0/16000028, desc: END_OF_RECOVERY

 

而 END_OF_RECOVERY 對(duì)應(yīng)的 lsn 為 0/16000098,和 00000002.history 時(shí)間線文件的內(nèi)容完全一致。

所以在異步 stream replication 環(huán)境下,主動(dòng)關(guān)閉master時(shí),會(huì)將最后一條記錄(CHECKPOINT_SHUTDOWN)發(fā)送給slave,不會(huì)造成數(shù)據(jù)的丟失。

而 synchronous_commit = on 保證事務(wù)有兩份持久化的落盤(pán)數(shù)據(jù)。

分析 pg_log 日志

old master 上的最后幾條日志

2018-11-30 10:16:40.986 CST,"postgres","pgbenchdb",7559,"[local]",5c009d79.1d87,4,"UPDATE waiting",2018-11-30 10:16:25 CST,,0,LOG,00000,"disconnection: session time: 0:00:15.723 user=postgres database=pgbenchdb host=[local]",,,,,,,,,"pgbench"
2018-11-30 10:16:40.993 CST,"postgres","pgbenchdb",7558,"[local]",5c009d79.1d86,3,"idle",2018-11-30 10:16:25 CST,4/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"pgbench"
2018-11-30 10:16:40.994 CST,"postgres","pgbenchdb",7560,"[local]",5c009d79.1d88,3,"idle",2018-11-30 10:16:25 CST,5/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"pgbench"
2018-11-30 10:16:40.994 CST,"postgres","pgbenchdb",7558,"[local]",5c009d79.1d86,4,"idle",2018-11-30 10:16:25 CST,,0,LOG,00000,"disconnection: session time: 0:00:15.729 user=postgres database=pgbenchdb host=[local]",,,,,,,,,"pgbench"
2018-11-30 10:16:40.994 CST,"postgres","pgbenchdb",7560,"[local]",5c009d79.1d88,4,"idle",2018-11-30 10:16:25 CST,,0,LOG,00000,"disconnection: session time: 0:00:15.725 user=postgres database=pgbenchdb host=[local]",,,,,,,,,"pgbench"
2018-11-30 10:16:40.999 CST,"postgres","pgbenchdb",7561,"[local]",5c009d79.1d89,3,"idle",2018-11-30 10:16:25 CST,6/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"pgbench"
2018-11-30 10:16:41.001 CST,"postgres","pgbenchdb",7561,"[local]",5c009d79.1d89,4,"idle",2018-11-30 10:16:25 CST,,0,LOG,00000,"disconnection: session time: 0:00:15.731 user=postgres database=pgbenchdb host=[local]",,,,,,,,,"pgbench"
2018-11-30 10:16:41.010 CST,,,7156,,5c009735.1bf4,7,,2018-11-30 09:49:41 CST,,0,LOG,00000,"shutting down",,,,,,,,,""
2018-11-30 10:16:41.209 CST,,,7156,,5c009735.1bf4,8,,2018-11-30 09:49:41 CST,,0,LOG,00000,"checkpoint starting: shutdown immediate",,,,,,,,,""
2018-11-30 10:16:47.623 CST,,,7156,,5c009735.1bf4,9,,2018-11-30 09:49:41 CST,,0,LOG,00000,"checkpoint complete: wrote 29357 buffers (89.6%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=5.931 s, sync=0.399 s, total=6.418 s; sync files=53, longest=0.179 s, average=0.007 s; distance=311294 kB, estimate=311294 kB",,,,,,,,,""
2018-11-30 10:16:47.683 CST,"repl","",7227,"192.168.56.90:52556",5c009795.1c3b,3,"streaming 0/16000098",2018-11-30 09:51:17 CST,,0,LOG,00000,"disconnection: session time: 0:25:30.149 user=repl database= host=192.168.56.90 port=52556",,,,,,,,,"walreceiver"
2018-11-30 10:16:47.730 CST,,,7153,,5c009735.1bf1,5,,2018-11-30 09:49:41 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""

 

注意倒數(shù)第二條信息 streaming 0/16000098 ,說(shuō)明當(dāng)時(shí)的master關(guān)閉時(shí),已經(jīng)和salve溝通過(guò),確認(rèn)已經(jīng)接收到 END_OF_RECOVERY 之前所有的數(shù)據(jù)了。

old slave 日志

2018-11-30 10:16:47.660 CST,,,7256,,5c009795.1c58,2,,2018-11-30 09:51:17 CST,,0,LOG,00000,"replication terminated by primary server","End of WAL reached on timeline 1 at 0/16000098.",,,,,,,,""
2018-11-30 10:16:47.660 CST,,,7256,,5c009795.1c58,3,,2018-11-30 09:51:17 CST,,0,FATAL,XX000,"could not send end-of-streaming message to primary: no COPY in progress
",,,,,,,,,""
2018-11-30 10:16:47.660 CST,,,7255,,5c009795.1c57,5,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"invalid record length at 0/16000098: wanted 24, got 0",,,,,,,,,""
2018-11-30 10:16:47.854 CST,,,7443,,5c009d8f.1d13,1,,2018-11-30 10:16:47 CST,,0,FATAL,XX000,"could not connect to the primary server: could not connect to server: Connection refused
 Is the server running on host ""192.168.56.119"" and accepting
 TCP/IP connections on port 5432?
",,,,,,,,,""
2018-11-30 10:16:52.668 CST,,,7444,,5c009d94.1d14,1,,2018-11-30 10:16:52 CST,,0,FATAL,XX000,"could not connect to the primary server: could not connect to server: Connection refused
 Is the server running on host ""192.168.56.119"" and accepting
 TCP/IP connections on port 5432?
",,,,,,,,,""
2018-11-30 10:16:56.875 CST,,,7255,,5c009795.1c57,6,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"received promote request",,,,,,,,,""
2018-11-30 10:16:56.875 CST,,,7255,,5c009795.1c57,7,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"redo done at 0/16000028",,,,,,,,,""
2018-11-30 10:16:56.875 CST,,,7255,,5c009795.1c57,8,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"last completed transaction was at log time 2018-11-30 10:16:40.986869+08",,,,,,,,,""
2018-11-30 10:16:56.888 CST,,,7255,,5c009795.1c57,9,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"selected new timeline ID: 2",,,,,,,,,""
2018-11-30 10:16:57.166 CST,,,7255,,5c009795.1c57,10,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"archive recovery complete",,,,,,,,,""
2018-11-30 10:16:57.267 CST,,,7255,,5c009795.1c57,11,,2018-11-30 09:51:17 CST,1/0,0,LOG,00000,"MultiXact member wraparound protections are now enabled",,,,,,,,,""
2018-11-30 10:16:57.267 CST,,,7257,,5c009795.1c59,1,,2018-11-30 09:51:17 CST,,0,LOG,00000,"checkpoint starting: force",,,,,,,,,""
2018-11-30 10:16:57.275 CST,,,7253,,5c009795.1c55,3,,2018-11-30 09:51:17 CST,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,""
2018-11-30 10:16:57.276 CST,,,7447,,5c009d99.1d17,1,,2018-11-30 10:16:57 CST,,0,LOG,00000,"autovacuum launcher started",,,,,,,,,""

信息也是相當(dāng)?shù)那逦?/p>

wal_retrieve_retry_interval = 5s 控制 salve 到 master 失敗時(shí),再次重試的等待時(shí)間。

文章來(lái)源:腳本之家

來(lái)源地址:https://www.jb51.net/article/204307.htm

申請(qǐng)創(chuàng)業(yè)報(bào)道,分享創(chuàng)業(yè)好點(diǎn)子。點(diǎn)擊此處,共同探討創(chuàng)業(yè)新機(jī)遇!

相關(guān)文章

熱門(mén)排行

信息推薦