PostgreSQL Indexes: A Detailed Guide to Using Hash Indexes

OS: Ubuntu 16.04

PostgreSQL: 9.6.8

IP plan

192.168.56.102 node2 postgresql

help create index

postgres=# \h create index
Command:   CREATE INDEX
Description: define a new index
Syntax:
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
  ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
  [ WITH ( storage_parameter = value [, ... ] ) ]
  [ TABLESPACE tablespace_name ]
  [ WHERE predicate ]

[ USING method ]

method

The name of the index method to use. Choices are btree, hash, gist, spgist, gin, and brin. The default method is btree.

hash

A hash index can only handle simple equality comparisons (the = operator).

postgres=# drop table tmp_t0;
DROP TABLE
postgres=# create table tmp_t0(c0 varchar(100),c1 varchar(100));
CREATE TABLE
postgres=# insert into tmp_t0(c0,c1) select md5(id::varchar),md5((id+id)::varchar) from generate_series(1,100000) as id;
INSERT 0 100000
postgres=# create index idx_tmp_t0_1 on tmp_t0 using hash(c0);
CREATE INDEX
postgres=# \d+ tmp_t0
                                          Table "public.tmp_t0"
 Column |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
 c0     | character varying(100) |           |          |         | extended |              | 
 c1     | character varying(100) |           |          |         | extended |              | 
Indexes:
    "idx_tmp_t0_1" hash (c0)
postgres=# explain select * from tmp_t0 where c0 = 'd3d9446802a44259755d38e6d163e820';
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Index Scan using idx_tmp_t0_1 on tmp_t0  (cost=0.00..8.02 rows=1 width=66)
   Index Cond: ((c0)::text = 'd3d9446802a44259755d38e6d163e820'::text)
(2 rows)

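Note that the official documentation for this release (9.6) gives a specific warning:

Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes.

Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them.

For these reasons, hash index use is presently discouraged.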

Supplement: An Introduction to PostgreSQL Hash Indexes

Structure of a hash index

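When a value is inserted into the index, a hash function is computed for the index key. Hash functions in PostgreSQL always return the integer type, whose range is 2^32, roughly 4 billion values. The number of buckets is initially two and then grows dynamically to fit the data size. The bucket number can be computed from the hash code with bit arithmetic. The TID is stored in that bucket.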
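The bit arithmetic can be sketched in Python. This is a minimal illustration of a mask-based mapping in the style PostgreSQL uses, not its actual C code; the function name `bucket_for` and the mask values in the example are assumptions chosen for an index that currently has four buckets.

```python
def bucket_for(hash_code: int, max_bucket: int, high_mask: int, low_mask: int) -> int:
    """Map a 32-bit hash code to a bucket number using bit masks only."""
    # SQL displays hash codes as signed integers; work with the unsigned form.
    unsigned = hash_code & 0xFFFFFFFF
    # Try the wider mask first; it also covers buckets that may not exist yet.
    bucket = unsigned & high_mask
    if bucket > max_bucket:
        # That bucket has not been created yet, so fall back to the narrower mask.
        bucket &= low_mask
    return bucket


# Assumed state: 4 buckets (numbers 0..3), so low_mask = 0b011 and high_mask = 0b111.
print(bucket_for(-1547814713, max_bucket=3, high_mask=0b111, low_mask=0b011))
```

With a scheme like this, adding a bucket only changes the masks and the highest bucket number, so only the entries of the bucket being split need to move.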

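This alone is not enough, because TIDs belonging to different index keys can land in the same bucket. The bucket could store the original key value alongside each TID, but that would increase the index size. To save space, the bucket stores only the hash code of the key, not the key itself.

When we search through the index, we compute the hash function of the key and obtain the bucket number. We still have to go through the contents of the bucket and return only the TIDs whose hash codes match. Because the stored "hash code – TID" pairs are ordered, this can be done efficiently.

However, two different keys may not only fall into the same bucket but also share the same four-byte hash code. The access method therefore asks the indexing engine to verify each TID by rechecking the condition against the table row.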
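The lookup and recheck steps can be sketched in Python as follows. Everything here is a toy model: `toy_hash` is a hypothetical stand-in with a deliberately tiny range so that collisions occur, and the `heap` dictionary stands in for the table.

```python
from bisect import bisect_left


def toy_hash(key: str) -> int:
    """Hypothetical stand-in for a real hash function; tiny range to force collisions."""
    return sum(key.encode()) % 16


# Toy "table": TID -> row value. TIDs are (page, item) tuples.
heap = {(0, 1): "abc", (0, 2): "cab", (0, 3): "xyz"}

# One bucket page: (hash code, TID) pairs kept sorted by hash code.
bucket_page = sorted((toy_hash(value), tid) for tid, value in heap.items())


def lookup(key: str):
    """Return (candidate TIDs from the index, TIDs that survive the recheck)."""
    code = toy_hash(key)
    # Binary search for the first pair carrying this hash code.
    pos = bisect_left(bucket_page, (code,))
    candidates = []
    while pos < len(bucket_page) and bucket_page[pos][0] == code:
        candidates.append(bucket_page[pos][1])
        pos += 1
    # Recheck: the index stores no keys, so compare against the table rows.
    matches = [tid for tid in candidates if heap[tid] == key]
    return candidates, matches


candidates, matches = lookup("abc")
print(candidates)  # "abc" and "cab" share a hash code in this toy model
print(matches)     # only the row that really equals "abc" remains
```

The index alone cannot tell "abc" from "cab" here, which is why the final comparison against the table row is needed.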

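Mapping the data structures to pages

Meta page – page number 0, which holds information about the index internals

Bucket pages – the main pages of the index, which store the "hash code – TID" pairs

Overflow pages – structured the same way as bucket pages and used when one page is not enough for a bucket

Bitmap pages – keep track of overflow pages that are currently free and can be reused for other buckets

Note that a hash index cannot shrink. If we delete some indexed rows, the pages already allocated are not returned to the operating system; they are only reused for new data after VACUUM. The only way to reduce the index size is to rebuild the index from scratch with REINDEX or VACUUM FULL.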
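The reuse behaviour can be illustrated with a small Python toy model. It is a sketch under strong simplifying assumptions (a single bucket, two entries per page, a Python set playing the role of the bitmap pages); the class and method names are hypothetical and do not mirror PostgreSQL's implementation.

```python
class ToyHashIndex:
    """Toy model: one bucket, fixed-capacity pages, an overflow chain, a free-page set."""

    PAGE_CAPACITY = 2

    def __init__(self):
        self.pages = [[]]   # page 0 acts as the primary bucket page
        self.chain = [0]    # pages currently linked to the bucket
        self.free = set()   # overflow pages marked reusable (the "bitmap")

    def insert(self, item):
        for p in self.chain:
            if len(self.pages[p]) < self.PAGE_CAPACITY:
                self.pages[p].append(item)
                return
        # Every linked page is full: reuse a free overflow page before allocating.
        if self.free:
            p = self.free.pop()
        else:
            self.pages.append([])
            p = len(self.pages) - 1
        self.chain.append(p)
        self.pages[p].append(item)

    def delete_and_vacuum(self, item):
        for p in self.chain:
            if item in self.pages[p]:
                self.pages[p].remove(item)
        # Empty overflow pages are marked free, but they stay allocated.
        for p in [p for p in self.chain[1:] if not self.pages[p]]:
            self.chain.remove(p)
            self.free.add(p)

    def size_in_pages(self):
        return len(self.pages)


idx = ToyHashIndex()
for item in (1, 2, 3, 4):
    idx.insert(item)
size_before = idx.size_in_pages()
idx.delete_and_vacuum(3)
idx.delete_and_vacuum(4)
size_after_delete = idx.size_in_pages()   # unchanged: the page is only marked free
idx.insert(5)                             # lands on the freed overflow page
print(size_before, size_after_delete, idx.size_in_pages())
```

In this model the page count never goes down; only rebuilding the structure from scratch would shrink it, which corresponds to REINDEX in the text.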

Next, let's see how a hash index is created.

demo=# create index on flights using hash(flight_no);
demo=# explain (costs off) select * from flights where flight_no = 'PG0001';
                     QUERY PLAN                     
----------------------------------------------------
 Bitmap Heap Scan on flights
   Recheck Cond: (flight_no = 'PG0001'::bpchar)
   ->  Bitmap Index Scan on flights_flight_no_idx
         Index Cond: (flight_no = 'PG0001'::bpchar)
(4 rows)

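Note: before version 10, hash indexes were not WAL-logged, so they could not be recovered after a crash and could not be replicated. Starting with version 10, hash indexes were improved: they are WAL-logged, and creating one no longer raises a warning.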

Listing the support functions of the hash access method

demo=# select  opf.opfname as opfamily_name,
     amproc.amproc::regproc AS opfamily_procedure
from   pg_am am,
     pg_opfamily opf,
     pg_amproc amproc
where  opf.opfmethod = am.oid
and   amproc.amprocfamily = opf.oid
and   am.amname = 'hash'
order by opfamily_name,
     opfamily_procedure;
  
   opfamily_name    |   opfamily_procedure    
--------------------+-------------------------
 abstime_ops        | hashint4extended
 abstime_ops        | hashint4
 aclitem_ops        | hash_aclitem
 aclitem_ops        | hash_aclitem_extended
 array_ops          | hash_array
 array_ops          | hash_array_extended
 bool_ops           | hashcharextended
 bool_ops           | hashchar
 bpchar_ops         | hashbpcharextended
 bpchar_ops         | hashbpchar
 bpchar_pattern_ops | hashbpcharextended
 bpchar_pattern_ops | hashbpchar
 bytea_ops          | hashvarlena
 bytea_ops          | hashvarlenaextended
 char_ops           | hashcharextended
 char_ops           | hashchar
 cid_ops            | hashint4extended
 cid_ops            | hashint4
 date_ops           | hashint4extended
 date_ops           | hashint4
 enum_ops           | hashenumextended
 enum_ops           | hashenum
 float_ops          | hashfloat4extended
 float_ops          | hashfloat8extended
 float_ops          | hashfloat4
 float_ops          | hashfloat8
 ...

These functions can be used to compute the hash code of a value of the corresponding type.

hank=# select hashtext('zhang');
 hashtext  
-------------
 -1172392837
(1 row)
hank=# select hashint4(10);
 hashint4  
-------------
 -1547814713
(1 row)

Properties of the hash index

hank=# select a.amname, p.name, pg_indexam_has_property(a.oid,p.name)
hank-# from pg_am a,
hank-#   unnest(array['can_order','can_unique','can_multi_col','can_exclude']) p(name)
hank-# where a.amname = 'hash'
hank-# order by a.amname;
 amname |     name      | pg_indexam_has_property 
--------+---------------+-------------------------
 hash   | can_order     | f
 hash   | can_unique    | f
 hash   | can_multi_col | f
 hash   | can_exclude   | t
(4 rows)
hank=# select p.name, pg_index_has_property('hank.idx_test_name'::regclass,p.name)
hank-# from unnest(array[
hank(#    'clusterable','index_scan','bitmap_scan','backward_scan'
hank(#   ]) p(name);
     name      | pg_index_has_property 
---------------+-----------------------
 clusterable   | f
 index_scan    | t
 bitmap_scan   | t
 backward_scan | t
(4 rows)
hank=# select p.name,
hank-#   pg_index_column_has_property('hank.idx_test_name'::regclass,1,p.name)
hank-# from unnest(array[
hank(#    'asc','desc','nulls_first','nulls_last','orderable','distance_orderable',
hank(#    'returnable','search_array','search_nulls'
hank(#   ]) p(name);
        name        | pg_index_column_has_property 
--------------------+------------------------------
 asc                | f
 desc               | f
 nulls_first        | f
 nulls_last         | f
 orderable          | f
 distance_orderable | f
 returnable         | f
 search_array       | f
 search_nulls       | f
(9 rows)

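Because a hash function does not preserve any ordering of the keys, a hash index in general supports only equality lookups. The data dictionary query below confirms this: every operator is "=". A hash index also does not handle NULL values (search_nulls is f), so NULLs are not indexed. Because the index stores only hash codes and not the keys themselves, index-only scans are not possible. Multicolumn hash indexes are not supported either.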

hank=# select  opf.opfname AS opfamily_name,
hank-#     amop.amopopr::regoperator AS opfamily_operator
hank-# from   pg_am am,
hank-#     pg_opfamily opf,
hank-#     pg_amop amop
hank-# where  opf.opfmethod = am.oid
hank-# and   amop.amopfamily = opf.oid
hank-# and   am.amname = 'hash'
hank-# order by opfamily_name,
hank-#     opfamily_operator;
   opfamily_name    |                     opfamily_operator                      
--------------------+------------------------------------------------------------
 abstime_ops        | =(abstime,abstime)
 aclitem_ops        | =(aclitem,aclitem)
 array_ops          | =(anyarray,anyarray)
 bool_ops           | =(boolean,boolean)
 bpchar_ops         | =(character,character)
 bpchar_pattern_ops | =(character,character)
 bytea_ops          | =(bytea,bytea)
 char_ops           | =("char","char")
 cid_ops            | =(cid,cid)
 date_ops           | =(date,date)
 enum_ops           | =(anyenum,anyenum)
 float_ops          | =(real,real)
 float_ops          | =(double precision,double precision)
 float_ops          | =(real,double precision)
 float_ops          | =(double precision,real)
 hash_hstore_ops    | =(hstore,hstore)
 integer_ops        | =(integer,bigint)
 integer_ops        | =(smallint,smallint)
 integer_ops        | =(integer,integer)
 integer_ops        | =(bigint,bigint)
 integer_ops        | =(bigint,integer)
 integer_ops        | =(smallint,integer)
 integer_ops        | =(integer,smallint)
 integer_ops        | =(smallint,bigint)
 integer_ops        | =(bigint,smallint)
 interval_ops       | =(interval,interval)
 jsonb_ops          | =(jsonb,jsonb)
 macaddr8_ops       | =(macaddr8,macaddr8)
 macaddr_ops        | =(macaddr,macaddr)
 name_ops           | =(name,name)
 network_ops        | =(inet,inet)
 numeric_ops        | =(numeric,numeric)
 oid_ops            | =(oid,oid)
 oidvector_ops      | =(oidvector,oidvector)
 pg_lsn_ops         | =(pg_lsn,pg_lsn)
 range_ops          | =(anyrange,anyrange)
 reltime_ops        | =(reltime,reltime)
 text_ops           | =(text,text)
 text_pattern_ops   | =(text,text)
 time_ops           | =(time without time zone,time without time zone)
 timestamp_ops      | =(timestamp without time zone,timestamp without time zone)
 timestamptz_ops    | =(timestamp with time zone,timestamp with time zone)
 timetz_ops         | =(time with time zone,time with time zone)
 uuid_ops           | =(uuid,uuid)
 xid_ops            | =(xid,xid)
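Why only "=" appears in this list can be seen with a short Python sketch. It uses CRC-32 from the standard library as a hypothetical stand-in hash function (an assumption for illustration, not the function PostgreSQL uses) and compares the order of a few keys with the order of their hash codes.

```python
import zlib


def toy_hash(key: str) -> int:
    """Hypothetical stand-in hash function (CRC-32), not PostgreSQL's own."""
    return zlib.crc32(key.encode())


keys = ["a", "b", "c"]
by_key = sorted(keys)                  # the order a range scan would need
by_hash = sorted(keys, key=toy_hash)   # the order available from hash codes

print(by_key)
print(by_hash)
# Equality survives hashing: equal keys always produce equal codes.
print(toy_hash("b") == toy_hash("b"))
# Key order does not survive, so a condition like key < 'c' cannot be
# answered from hash codes alone.
print(by_key == by_hash)
```

Since the hash codes carry no information about key order, an index built on them can answer equality conditions but not range or sort requests.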

Starting with version 10, the pageinspect extension can be used to look inside a hash index.

Install the extension

create extension pageinspect;

Inspect page 0

hank=# select hash_page_type(get_raw_page('hank.idx_test_name',0));
 hash_page_type 
----------------
 metapage
(1 row)

Show the number of rows in the index and the highest bucket number in use

hank=# select ntuples, maxbucket
hank-# from hash_metapage_info(get_raw_page('hank.idx_test_name',0));  
 ntuples | maxbucket 
---------+-----------
    1000 |         3
(1 row)

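Page 1 is a bucket page. We can look at the number of live and dead tuples in this bucket page, that is, how bloated it is, to decide when the index needs maintenance.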

hank=# select hash_page_type(get_raw_page('hank.idx_test_name',1));
 hash_page_type 
----------------
 bucket
(1 row)
hank=# select live_items, dead_items
hank-# from hash_page_stats(get_raw_page('hank.idx_test_name',1));  
 live_items | dead_items 
------------+------------
        407 |          0
(1 row)
