123,123

B-tree索引適合用于存儲排序的數(shù)據(jù)。對于這種數(shù)據(jù)類型需要定義大于、大于等于、小于、小于等于操作符。

通常情況下，B-tree的索引記錄存儲在數(shù)據(jù)頁中。葉子頁中的記錄包含索引數(shù)據(jù)（keys）以及指向heap tuple記錄（即表的行記錄TIDs）的指針。內(nèi)部頁中的記錄包含指向索引子頁的指針和子頁中最小值。

B-tree有幾點重要的特性：

1、B-tree是平衡樹，即每個葉子頁到root頁中間有相同個數(shù)的內(nèi)部頁。因此查詢?nèi)魏我粋€值的時間是相同的。

2、B-tree中一個節(jié)點有多個分支，即每頁（通常8KB）具有許多TIDs。因此B-tree的高度比較低，通常4到5層就可以存儲大量行記錄。

3、索引中的數(shù)據(jù)以非遞減的順序存儲（頁之間以及頁內(nèi)都是這種順序），同級的數(shù)據(jù)頁由雙向鏈表連接。因此不需要每次都返回root，通過遍歷鏈表就可以獲取一個有序的數(shù)據(jù)集。

下面是一個索引的簡單例子，該索引存儲的記錄為整型并只有一個字段：

該索引最頂層的頁是元數(shù)據(jù)頁，該數(shù)據(jù)頁存儲索引root頁的相關(guān)信息。內(nèi)部節(jié)點位于root下面，葉子頁位于最下面一層。向下的箭頭表示由葉子節(jié)點指向表記錄（TIDs）。

等值查詢

例如通過"indexed-field = expression"形式的條件查詢49這個值。

root節(jié)點有三個記錄：(4,32,64)。從root節(jié)點開始進行搜索，由于32≤ 49 < 64，所以選擇32這個值進入其子節(jié)點。通過同樣的方法繼續(xù)向下進行搜索一直到葉子節(jié)點，最后查詢到49這個值。

實際上，查詢算法遠不止看上去的這么簡單。比如，該索引是非唯一索引時，允許存在許多相同值的記錄，并且這些相同的記錄不止存放在一個頁中。此時該如何查詢？我們返回到上面的的例子，定位到第二層節(jié)點(32,43,49)。如果選擇49這個值并向下進入其子節(jié)點搜索，就會跳過前一個葉子頁中的49這個值。因此，在內(nèi)部節(jié)點進行等值查詢49時，定位到49這個值，然后選擇49的前一個值43，向下進入其子節(jié)點進行搜索。最后，在底層節(jié)點中從左到右進行搜索。

(另外一個復(fù)雜的地方是，查詢的過程中樹結(jié)構(gòu)可能會改變，比如分裂)

非等值查詢

通過"indexed-field ≤ expression" (or "indexed-field ≥ expression")查詢時，首先通過"indexed-field = expression"形式進行等值（如果存在該值）查詢，定位到葉子節(jié)點后，再向左或向右進行遍歷檢索。

下圖是查詢 n ≤ 35的示意圖：

大于和小于可以通過同樣的方法進行查詢。查詢時需要排除等值查詢出的值。

范圍查詢

范圍查詢"expression1 ≤ indexed-field ≤ expression2"時，需要通過 "expression1 ≤ indexed-field =expression2"找到一匹配值，然后在葉子節(jié)點從左到右進行檢索，一直到不滿足"indexed-field ≤ expression2" 的條件為止；或者反過來，首先通過第二個表達式進行檢索，在葉子節(jié)點定位到該值后，再從右向左進行檢索，一直到不滿足第一個表達式的條件為止。

下圖是23 ≤ n ≤ 64的查詢示意圖:

案例

下面是一個查詢計劃的實例。通過demo database中的aircraft表進行介紹。該表有9行數(shù)據(jù)，由于整個表只有一個數(shù)據(jù)頁，所以執(zhí)行計劃不會使用索引。為了解釋說明問題，我們使用整個表進行說明。

（更準確的方式：create index on aircrafts using btree(range)，創(chuàng)建索引時默認構(gòu)建B-tree索引。）

等值查詢的執(zhí)行計劃：

demo=# explain(costs off) select * from aircrafts where range = 3000;
QUERY PLAN
---------------------------------------------------
Index Scan using aircrafts_range_idx on aircrafts
Index Cond: (range = 3000)
(2 rows)

非等值查詢的執(zhí)行計劃：

demo=# explain(costs off) select * from aircrafts where range < 3000;
QUERY PLAN
---------------------------------------------------
Index Scan using aircrafts_range_idx on aircrafts
Index Cond: (range < 3000)
(2 rows)

范圍查詢的執(zhí)行計劃：

demo=# explain(costs off) select * from aircrafts
where range between 3000 and 5000;
QUERY PLAN
-----------------------------------------------------
Index Scan using aircrafts_range_idx on aircrafts
Index Cond: ((range >= 3000) AND (range <= 5000))
(2 rows)

排序

再次強調(diào)，通過index、index-only或bitmap掃描，btree訪問方法可以返回有序的數(shù)據(jù)。因此如果表的排序條件上有索引，優(yōu)化器會考慮以下方式：表的索引掃描；表的順序掃描然后對結(jié)果集進行排序。

排序順序

當(dāng)創(chuàng)建索引時可以明確指定排序順序。如下所示，在range列上建立一個索引，并且排序順序為降序：

1demo=# create index on aircrafts(range desc);

本案例中，大值會出現(xiàn)在樹的左邊，小值出現(xiàn)在右邊。為什么有這樣的需求？這樣做是為了多列索引。創(chuàng)建aircraft的一個視圖，通過range分成3部分：

然后創(chuàng)建一個索引（使用下面表達式）：

demo=# create index on aircrafts(
(case when range < 4000 then 1 when range < 10000 then 2 else 3 end),
model);

現(xiàn)在，可以通過索引以升序的方式獲取排序的數(shù)據(jù)：

demo=# select class, model from aircrafts_v order by class, model;
class | model
-------+---------------------
1 | Bombardier CRJ-200
1 | Cessna 208 Caravan
1 | Sukhoi SuperJet-100
2 | Airbus A319-100
2 | Airbus A320-200
2 | Airbus A321-200
2 | Boeing 737-300
2 | Boeing 767-300
3 | Boeing 777-300
(9 rows)

demo=# explain(costs off)
select class, model from aircrafts_v order by class, model;
QUERY PLAN
--------------------------------------------------------
Index Scan using aircrafts_case_model_idx on aircrafts
(1 row)

同樣，可以以降序的方式獲取排序的數(shù)據(jù)：

demo=# select class, model from aircrafts_v order by class desc, model desc;
class | model
-------+---------------------
3 | Boeing 777-300
2 | Boeing 767-300
2 | Boeing 737-300
2 | Airbus A321-200
2 | Airbus A320-200
2 | Airbus A319-100
1 | Sukhoi SuperJet-100
1 | Cessna 208 Caravan
1 | Bombardier CRJ-200
(9 rows)
demo=# explain(costs off)
select class, model from aircrafts_v order by class desc, model desc;
QUERY PLAN
-----------------------------------------------------------------
Index Scan BACKWARD using aircrafts_case_model_idx on aircrafts
(1 row)

然而，如果一列以升序一列以降序的方式獲取排序的數(shù)據(jù)的話，就不能使用索引，只能單獨排序：

demo=# explain(costs off)
select class, model from aircrafts_v order by class ASC, model DESC;
QUERY PLAN
-------------------------------------------------
Sort
Sort Key: (CASE ... END), aircrafts.model DESC
-> Seq Scan on aircrafts
(3 rows)

（注意，最終執(zhí)行計劃會選擇順序掃描，忽略之前設(shè)置的enable_seqscan = off。因為這個設(shè)置并不會放棄表掃描，只是設(shè)置他的成本----查看costs on的執(zhí)行計劃）

若有使用索引，創(chuàng)建索引時指定排序的方向：

demo=# create index aircrafts_case_asc_model_desc_idx on aircrafts(
(case
when range < 4000 then 1
when range < 10000 then 2
else 3
end) ASC,
model DESC);

demo=# explain(costs off)
select class, model from aircrafts_v order by class ASC, model DESC;
QUERY PLAN
-----------------------------------------------------------------
Index Scan using aircrafts_case_asc_model_desc_idx on aircrafts
(1 row)

列的順序

當(dāng)使用多列索引時與列的順序有關(guān)的問題會顯示出來。對于B-tree，這個順序非常重要：頁中的數(shù)據(jù)先以第一個字段進行排序，然后再第二個字段，以此類推。

下圖是在range和model列上構(gòu)建的索引：

當(dāng)然，上圖這么小的索引在一個root頁足以存放。但是為了清晰起見，特意將其分成幾頁。

從圖中可見，通過類似的謂詞class = 3（僅按第一個字段進行搜索）或者class = 3 and model = 'Boeing 777-300'（按兩個字段進行搜索）將非常高效。

然而，通過謂詞model = 'Boeing 777-300'進行搜索的效率將大大降低：從root開始，判斷不出選擇哪個子節(jié)點進行向下搜索，因此會遍歷所有子節(jié)點向下進行搜索。這并不意味著永遠無法使用這樣的索引----它的效率有問題。例如，如果aircraft有3個classes值，每個class類中有許多model值，此時不得不掃描索引1/3的數(shù)據(jù)，這可能比全表掃描更有效。

但是，當(dāng)創(chuàng)建如下索引時：

demo=# create index on aircrafts( model, (case when range < 4000 then 1 when range < 10000 then 2 else 3 end));

索引字段的順序會改變：

通過這個索引，model = 'Boeing 777-300'將會很有效，但class = 3則沒這么高效。

NULLs

PostgreSQL的B-tree支持在NULLs上創(chuàng)建索引，可以通過IS NULL或者IS NOT NULL的條件進行查詢。

考慮flights表，允許NULLs：

demo=# create index on flights(actual_arrival);
demo=# explain(costs off) select * from flights where actual_arrival is null;
QUERY PLAN
-------------------------------------------------------
Bitmap Heap Scan on flights
Recheck Cond: (actual_arrival IS NULL)
-> Bitmap Index Scan on flights_actual_arrival_idx
Index Cond: (actual_arrival IS NULL)
(4 rows)

下面的例子中，他們的順序相同，因此可以使用索引：

demo=# explain(costs off)
select * from flights order by actual_arrival NULLS LAST;
QUERY PLAN
--------------------------------------------------------
Index Scan using flights_actual_arrival_idx on flights
(1 row)

下面的例子，順序不同，優(yōu)化器選擇順序掃描然后進行排序：

demo=# explain(costs off)
select * from flights order by actual_arrival NULLS FIRST;
QUERY PLAN
----------------------------------------
Sort
Sort Key: actual_arrival NULLS FIRST
-> Seq Scan on flights
(3 rows)

NULLs必須位于開頭才能使用索引：

demo=# create index flights_nulls_first_idx on flights(actual_arrival NULLS FIRST);
demo=# explain(costs off)
select * from flights order by actual_arrival NULLS FIRST;
QUERY PLAN
-----------------------------------------------------
Index Scan using flights_nulls_first_idx on flights
(1 row)

像這樣的問題是由NULLs引起的而不是無法排序，也就是說NULL和其他這比較的結(jié)果無法預(yù)知：

demo=# \pset null NULL
demo=# select null < 42;
?column?
----------
NULL
(1 row)

這和B-tree的概念背道而馳并且不符合一般的模式。然而NULLs在數(shù)據(jù)庫中扮演者很重要的角色，因此不得不為NULL做特殊設(shè)置。

由于NULLs可以被索引，因此即使表上沒有任何標記也可以使用索引。（因為這個索引包含表航記錄的所有信息）。如果查詢需要排序的數(shù)據(jù)，而且索引確保了所需的順序，那么這可能是由意義的。這種情況下，查詢計劃更傾向于通過索引獲取數(shù)據(jù)。

屬性

下面介紹btree訪問方法的特性。

可以看到，B-tree能夠排序數(shù)據(jù)并且支持唯一性。同時還支持多列索引，但是其他訪問方法也支持這種索引。我們將在下次討論EXCLUDE條件。

Btree訪問方法可以通過以下兩種方式獲取數(shù)據(jù)：index scan以及bitmap scan?？梢钥吹?，通過tree可以向前和向后進行遍歷。

前四種特性指定了特定列如何精確的排序。本案例中，值以升序（asc）進行排序并且NULLs在后面（nulls_last）。也可以有其他組合。

search_array的特性支持向這樣的表達式：

demo=# explain(costs off)
select * from aircrafts where aircraft_code in ('733','763','773');
QUERY PLAN
-----------------------------------------------------------------
Index Scan using aircrafts_pkey on aircrafts
Index Cond: (aircraft_code = ANY ('{733,763,773}'::bpchar[]))
(2 rows)

returnable屬性支持index-only scan，由于索引本身也存儲索引值所以這是合理的。下面簡單介紹基于B-tree的覆蓋索引。

具有額外列的唯一索引

前面討論了：覆蓋索引包含查詢所需的所有值，需不要再回表。唯一索引可以成為覆蓋索引。

假設(shè)我們查詢所需要的列添加到唯一索引，新的組合唯一鍵可能不再唯一，同一列上將需要2個索引：一個唯一，支持完整性約束；另一個是非唯一，為了覆蓋索引。這當(dāng)然是低效的。

在我們公司 Anastasiya Lubennikova @ lubennikovaav 改進了btree，額外的非唯一列可以包含在唯一索引中。我們希望這個補丁可以被社區(qū)采納。實際上PostgreSQL11已經(jīng)合了該補丁。

考慮表bookings：d

demo=# \d bookings
    Table "bookings.bookings"
Column |   Type   | Modifiers
--------------+--------------------------+-----------
book_ref | character(6)    | not null
book_date | timestamp with time zone | not null
total_amount | numeric(10,2)   | not null
Indexes:
"bookings_pkey" PRIMARY KEY, btree (book_ref)
Referenced by:
TABLE "tickets" CONSTRAINT "tickets_book_ref_fkey" FOREIGN KEY (book_ref) REFERENCES bookings(book_ref)

這個表中，主鍵（book_ref,booking code）通過常規(guī)的btree索引提供，下面創(chuàng)建一個由額外列的唯一索引：

1demo=# create unique index bookings_pkey2 on bookings(book_ref) INCLUDE (book_date);

然后使用新索引替代現(xiàn)有索引：

demo=# begin;
demo=# alter table bookings drop constraint bookings_pkey cascade;
demo=# alter table bookings add primary key using index bookings_pkey2;
demo=# alter table tickets add foreign key (book_ref) references bookings (book_ref);
demo=# commit;

然后表結(jié)構(gòu)：

demo=# \d bookings
    Table "bookings.bookings"
Column |   Type   | Modifiers
--------------+--------------------------+-----------
book_ref | character(6)    | not null
book_date | timestamp with time zone | not null
total_amount | numeric(10,2)   | not null
Indexes:
"bookings_pkey2" PRIMARY KEY, btree (book_ref) INCLUDE (book_date)
Referenced by:
TABLE "tickets" CONSTRAINT "tickets_book_ref_fkey" FOREIGN KEY (book_ref) REFERENCES bookings(book_ref)

此時，這個索引可以作為唯一索引工作也可以作為覆蓋索引：

demo=# explain(costs off)
select book_ref, book_date from bookings where book_ref = '059FC4';
QUERY PLAN
--------------------------------------------------
Index Only Scan using bookings_pkey2 on bookings
Index Cond: (book_ref = '059FC4'::bpchar)
(2 rows)

創(chuàng)建索引

眾所周知，對于大表，加載數(shù)據(jù)時最好不要帶索引；加載完成后再創(chuàng)建索引。這樣做不僅提升效率還能節(jié)省空間。

創(chuàng)建B-tree索引比向索引中插入數(shù)據(jù)更高效。所有的數(shù)據(jù)大致上都已排序，并且數(shù)據(jù)的葉子頁已創(chuàng)建好，然后只需構(gòu)建內(nèi)部頁直到root頁構(gòu)建成一個完整的B-tree。

這種方法的速度依賴于RAM的大小，受限于參數(shù)maintenance_work_mem。因此增大該參數(shù)值可以提升速度。對于唯一索引，除了分配maintenance_work_mem的內(nèi)存外，還分配了work_mem的大小的內(nèi)存。

比較

前面，提到PG需要知道對于不同類型的值調(diào)用哪個函數(shù)，并且這個關(guān)聯(lián)方法存儲在哈希訪問方法中。同樣，系統(tǒng)必須找出如何排序。這在排序、分組（有時）、merge join中會涉及。PG不會將自身綁定到操作符名稱，因為用戶可以自定義他們的數(shù)據(jù)類型并給出對應(yīng)不同的操作符名稱。

例如bool_ops操作符集中的比較操作符：

postgres=# select amop.amopopr::regoperator as opfamily_operator,
   amop.amopstrategy
from pg_am am,
   pg_opfamily opf,
   pg_amop amop
where opf.opfmethod = am.oid
and amop.amopfamily = opf.oid
and am.amname = 'btree'
and opf.opfname = 'bool_ops'
order by amopstrategy;
opfamily_operator | amopstrategy
---------------------+--------------
<(boolean,boolean) |   1
<=(boolean,boolean) |   2
=(boolean,boolean) |   3
>=(boolean,boolean) |   4
>(boolean,boolean) |   5
(5 rows)

這里可以看到有5種操作符，但是不應(yīng)該依賴于他們的名字。為了指定哪種操作符做什么操作，引入策略的概念。為了描述操作符語義，定義了5種策略：

1 — less

2 — less or equal

3 — equal

4 — greater or equal

5 — greater

postgres=# select amop.amopopr::regoperator as opfamily_operator
from pg_am am,
pg_opfamily opf,
pg_amop amop
where opf.opfmethod = am.oid
and amop.amopfamily = opf.oid
and am.amname = 'btree'
and opf.opfname = 'integer_ops'
and amop.amopstrategy = 1
order by opfamily_operator;
pfamily_operator
----------------------
<(integer,bigint)
<(smallint,smallint)
<(integer,integer)
<(bigint,bigint)
<(bigint,integer)
<(smallint,integer)
<(integer,smallint)
<(smallint,bigint)
<(bigint,smallint)
(9 rows)

一些操作符族可以包含幾種操作符，例如integer_ops包含策略1的幾種操作符：

正因如此，當(dāng)比較類型在一個操作符族中時，不同類型值的比較，優(yōu)化器可以避免類型轉(zhuǎn)換。

索引支持的新數(shù)據(jù)類型

文檔中提供了一個創(chuàng)建符合數(shù)值的新數(shù)據(jù)類型，以及對這種類型數(shù)據(jù)進行排序的操作符類。該案例使用C語言完成。但不妨礙我們使用純SQL進行對比試驗。

創(chuàng)建一個新的組合類型：包含real和imaginary兩個字段

1postgres=# create type complex as (re float, im float);

創(chuàng)建一個包含該新組合類型字段的表：

postgres=# create table numbers(x complex);postgres=# insert into numbers values ((0.0, 10.0)), ((1.0, 3.0)), ((1.0, 1.0));

現(xiàn)在有個疑問，如果在數(shù)學(xué)上沒有為他們定義順序關(guān)系，如何進行排序？

已經(jīng)定義好了比較運算符：

postgres=# select * from numbers order by x;
x
--------
(0,10)
(1,1)
(1,3)
(3 rows)

默認情況下，對于組合類型排序是分開的：首先比較第一個字段然后第二個字段，與文本字符串比較方法大致相同。但是我們也可以定義其他的排序方式，例如組合數(shù)字可以當(dāng)做一個向量，通過模值進行排序。為了定義這樣的順序，我們需要創(chuàng)建一個函數(shù)：

postgres=# create operator class complex_ops
default for type complex
using btree as
operator 1 #<#,
operator 2 #<=#,
operator 3 #=#,
operator 4 #>=#,
operator 5 #>#,
function 1 complex_cmp(complex,complex);

//排序結(jié)果：
postgres=# select * from numbers order by x;
x
--------
(1,1)
(1,3)
(0,10)
(3 rows)

//可以使用此查詢獲取支持的函數(shù)：

postgres=# select amp.amprocnum,
amp.amproc,
amp.amproclefttype::regtype,
amp.amprocrighttype::regtype
from pg_opfamily opf,
pg_am am,
pg_amproc amp
where opf.opfname = 'complex_ops'
and opf.opfmethod = am.oid
and am.amname = 'btree'
and amp.amprocfamily = opf.oid;
amprocnum | amproc | amproclefttype | amprocrighttype
-----------+-------------+----------------+-----------------
1 | complex_cmp | complex | complex
(1 row)

//排序結(jié)果：postgres=# select * from numbers order by x; x -------- (1,1) (1,3) (0,10)(3 rows)

//可以使用此查詢獲取支持的函數(shù)：

postgres=# select amp.amprocnum, amp.amproc, amp.amproclefttype::regtype, amp.amprocrighttype::regtypefrom pg_opfamily opf, pg_am am, pg_amproc ampwhere opf.opfname = 'complex_ops'and opf.opfmethod = am.oidand am.amname = 'btree'and amp.amprocfamily = opf.oid; amprocnum | amproc | amproclefttype | amprocrighttype-----------+-------------+----------------+----------------- 1 | complex_cmp | complex | complex(1 row)

內(nèi)部結(jié)構(gòu)

使用pageinspect插件觀察B-tree結(jié)構(gòu)：

1demo=# create extension pageinspect;

索引的元數(shù)據(jù)頁：

demo=# select * from bt_metap('ticket_flights_pkey');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 164 | 2 | 164 | 2
(1 row)

值得關(guān)注的是索引level：不包括root，有一百萬行記錄的表其索引只需要2層就可以了。

Root頁，即164號頁面的統(tǒng)計信息：

demo=# select type, live_items, dead_items, avg_item_size, page_size, free_size
from bt_page_stats('ticket_flights_pkey',164);
type | live_items | dead_items | avg_item_size | page_size | free_size
------+------------+------------+---------------+-----------+-----------
r | 33 | 0 | 31 | 8192 | 6984
(1 row)

該頁中數(shù)據(jù)：

demo=# select itemoffset, ctid, itemlen, left(data,56) as data
from bt_page_items('ticket_flights_pkey',164) limit 5;
itemoffset | ctid | itemlen |       data
------------+---------+---------+----------------------------------------------------------
   1 | (3,1) | 8 |
   2 | (163,1) | 32 | 1d 30 30 30 35 34 33 32 33 30 35 37 37 31 00 00 ff 5f 00
   3 | (323,1) | 32 | 1d 30 30 30 35 34 33 32 34 32 33 36 36 32 00 00 4f 78 00
   4 | (482,1) | 32 | 1d 30 30 30 35 34 33 32 35 33 30 38 39 33 00 00 4d 1e 00
   5 | (641,1) | 32 | 1d 30 30 30 35 34 33 32 36 35 35 37 38 35 00 00 2b 09 00
(5 rows)

第一個tuple指定該頁的最大值，真正的數(shù)據(jù)從第二個tuple開始。很明顯最左邊子節(jié)點的頁號是163，然后是323。反過來，可以使用相同的函數(shù)搜索。

PG10版本提供了"amcheck"插件，該插件可以檢測B-tree數(shù)據(jù)的邏輯一致性，使我們提前探知故障。

文章來源：腳本之家

來源地址：https://www.jb51.net/article/204311.htm

申請創(chuàng)業(yè)報道，分享創(chuàng)業(yè)好點子。點擊此處，共同探討創(chuàng)業(yè)新機遇！

當(dāng)前位置：首頁 > 站長 > 數(shù)據(jù)庫 > 正文

PostgreSQL的B-tree索引用法詳解

相關(guān)文章

熱門排行

信息推薦

編輯推薦

MongoDB通配符索引的用法實例

PostgreSQL技巧如何獲取當(dāng)前日期時間

熱門標簽