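Working with hierarchical data in PostgreSQL using ltree

In this article we will learn how to use PostgreSQL's ltree module, which lets you store data in a hierarchical, tree-like structure.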

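What is ltree?

ltree is a PostgreSQL module. It implements a data type, ltree, for representing the labels of data stored in a hierarchical tree-like structure, and it provides extensive facilities for searching through label trees.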

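Why ltree?

  • ltree implements a materialized path, which is very fast for INSERT / UPDATE / DELETE and fairly fast for SELECT operations
  • It is generally faster than a recursive CTE or a recursive function, which has to recompute the branching on every query
  • It has built-in query syntax and operators designed specifically for querying and navigating trees
  • Index support (B-tree and GiST)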
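
For comparison, here is a rough sketch of the recursive CTE approach mentioned in the list above. The comments_adj table, its columns, and the id value 3 are hypothetical and are not part of this article's schema; they only illustrate an adjacency-list layout.

```sql
-- Hypothetical adjacency-list variant of a comments table (an assumption
-- for illustration; the article itself uses a path column instead).
CREATE TABLE comments_adj (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES comments_adj (id),
    user_id   integer
);

-- Fetching a whole subtree requires walking the tree level by level.
WITH RECURSIVE subtree AS (
    SELECT id, parent_id, user_id
    FROM comments_adj
    WHERE id = 3                          -- hypothetical id of the subtree root
    UNION ALL
    SELECT c.id, c.parent_id, c.user_id
    FROM comments_adj AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT * FROM subtree;
```

With a materialized path the same subtree lookup becomes a single predicate on the path column, which is what the rest of this article demonstrates.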

Initial data

First, enable the extension in your database with the following command:

CREATE EXTENSION ltree;
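
As a quick sanity check after enabling the extension, a sketch like the following can help. The IF NOT EXISTS form is my addition so that the statement can be re-run safely; nlevel and subpath are functions provided by the ltree module.

```sql
-- IF NOT EXISTS makes the statement safe to run more than once.
CREATE EXTENSION IF NOT EXISTS ltree;

-- A literal path, its depth, and its first label.
SELECT '0001.0003.0002'::ltree                 AS path,
       nlevel('0001.0003.0002'::ltree)         AS depth,        -- 3 labels
       subpath('0001.0003.0002'::ltree, 0, 1)  AS first_label;  -- 0001
```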

Let's create a table and add some data to it:

CREATE TABLE comments (user_id integer, description text, path ltree);
INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0001.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0001.0001.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001.0001.0001.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 5, md5(random()::text), '0001.0001.0001.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 8, md5(random()::text), '0001.0003.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 11, md5(random()::text), '0001.0003.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0003.0002.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 5, md5(random()::text), '0001.0003.0002.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 7, md5(random()::text), '0001.0003.0002.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 20, md5(random()::text), '0001.0003.0002.0002.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 31, md5(random()::text), '0001.0003.0002.0002.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 22, md5(random()::text), '0001.0003.0002.0002.0004');
INSERT INTO comments (user_id, description, path) VALUES ( 34, md5(random()::text), '0001.0003.0002.0002.0005');
INSERT INTO comments (user_id, description, path) VALUES ( 22, md5(random()::text), '0001.0003.0002.0002.0006');

We should also add some indexes:

CREATE INDEX path_gist_comments_idx ON comments USING GIST(path);
CREATE INDEX path_comments_idx ON comments USING btree(path);

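As you can see, I created the comments table with a path field that holds the full tree path of each row. Each label in the path is four digits, and labels are separated by dots.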
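
To make the labelling scheme concrete, here is a minimal sketch of how the path for a new reply under '0001.0003' could be built. It assumes that sibling labels are dense, zero-padded counters, which is my assumption rather than something the article states, and it ignores concurrent inserts.

```sql
-- Count the direct children of '0001.0003' and append the next
-- zero-padded label. Assumes no gaps in sibling labels and no
-- concurrent writers (a real system would need locking or a sequence).
SELECT '0001.0003'::ltree
       || text2ltree(lpad((count(*) + 1)::text, 4, '0')) AS next_path
FROM comments
WHERE path ~ '0001.0003.*{1}'::lquery;
```

With the rows inserted above, '0001.0003' has two direct children, so this returns 0001.0003.0003.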

Let's find the records in the comments table whose path starts with '0001.0003', that is, the node 0001.0003 and all of its descendants:

$ SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
 user_id |   path
---------+--------------------------
  6 | 0001.0003
  8 | 0001.0003.0001
  9 | 0001.0003.0002
  11 | 0001.0003.0002.0001
  2 | 0001.0003.0002.0002
  5 | 0001.0003.0002.0003
  7 | 0001.0003.0002.0002.0001
  20 | 0001.0003.0002.0002.0002
  31 | 0001.0003.0002.0002.0003
  22 | 0001.0003.0002.0002.0004
  34 | 0001.0003.0002.0002.0005
  22 | 0001.0003.0002.0002.0006
(12 rows)
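
The <@ operator has a mirror image, @>, which tests whether the left side is an ancestor of (or equal to) the right side. As a small sketch, this walks upward from a node instead of downward:

```sql
-- Returns the node '0001.0003.0002.0002' itself plus every ancestor
-- up to the root: 0001, 0001.0003, 0001.0003.0002, 0001.0003.0002.0002.
SELECT user_id, path
FROM comments
WHERE path @> '0001.0003.0002.0002'::ltree
ORDER BY path;
```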

Let's check this query with the EXPLAIN command:

$ EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on comments (cost=0.00..1.24 rows=2 width=38) (actual time=0.013..0.017 rows=12 loops=1)
 Filter: (path <@ '0001.0003'::ltree)
 Rows Removed by Filter: 7
 Total runtime: 0.038 ms
(4 rows)

Let's disable sequential scans for testing:

$ SET enable_seqscan=false;
SET
$ EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Index Scan using path_gist_comments_idx on comments (cost=0.00..8.29 rows=2 width=38) (actual time=0.023..0.034 rows=12 loops=1)
 Index Cond: (path <@ '0001.0003'::ltree)
 Total runtime: 0.076 ms
(3 rows)

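The query is now slower (0.076 ms versus 0.038 ms), but we can see how it uses the index.
The first query used a sequential scan because the table does not contain much data.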
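
To see what the planner does without forcing it, one option is to restore the setting and try a larger table. The comments_big table, its index name, and the generated paths below are hypothetical, made up only for this sketch; which plan is actually chosen depends on your PostgreSQL version and statistics.

```sql
-- Undo the session-level override from the test above.
RESET enable_seqscan;

-- Hypothetical larger table: 100 top-level nodes with 1000 children each.
CREATE TABLE comments_big (user_id integer, description text, path ltree);

INSERT INTO comments_big (user_id, description, path)
SELECT i,
       md5(random()::text),
       text2ltree(lpad((i % 100 + 1)::text, 4, '0') || '.' || lpad(i::text, 6, '0'))
FROM generate_series(1, 100000) AS i;

CREATE INDEX path_gist_comments_big_idx ON comments_big USING GIST (path);
ANALYZE comments_big;

-- Inspect the plan the optimizer picks on its own.
EXPLAIN SELECT user_id, path FROM comments_big WHERE path <@ '0042'::ltree;
```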

We can express the condition path <@ '0001.0003' in another way, using an lquery pattern:

$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*';
user_id |   path
---------+--------------------------
  6 | 0001.0003
  8 | 0001.0003.0001
  9 | 0001.0003.0002
  11 | 0001.0003.0002.0001
  2 | 0001.0003.0002.0002
  5 | 0001.0003.0002.0003
  7 | 0001.0003.0002.0002.0001
  20 | 0001.0003.0002.0002.0002
  31 | 0001.0003.0002.0002.0003
  22 | 0001.0003.0002.0002.0004
  34 | 0001.0003.0002.0002.0005
  22 | 0001.0003.0002.0002.0006
(12 rows)
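
A simple way to convince yourself that the two forms select the same rows is to compare their counts. This is only a sketch; the column aliases are arbitrary.

```sql
-- Both counts should be equal: '<@' (descendant-or-self) and the
-- lquery pattern '0001.0003.*' describe the same set of paths.
SELECT
  (SELECT count(*) FROM comments WHERE path <@ '0001.0003'::ltree)    AS via_descendant_op,
  (SELECT count(*) FROM comments WHERE path ~ '0001.0003.*'::lquery)  AS via_lquery;
```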

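Do not forget about the order of the data. Without ORDER BY, rows are not guaranteed to come back in tree order, as the following example shows: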

$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0001');
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0002');
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0003');
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*';
user_id |   path
---------+--------------------------
  6 | 0001.0003
  8 | 0001.0003.0001
  9 | 0001.0003.0002
  11 | 0001.0003.0002.0001
  2 | 0001.0003.0002.0002
  5 | 0001.0003.0002.0003
  7 | 0001.0003.0002.0002.0001
  20 | 0001.0003.0002.0002.0002
  31 | 0001.0003.0002.0002.0003
  22 | 0001.0003.0002.0002.0004
  34 | 0001.0003.0002.0002.0005
  22 | 0001.0003.0002.0002.0006
  9 | 0001.0003.0001.0001
  9 | 0001.0003.0001.0002
  9 | 0001.0003.0001.0003
(15 rows)

Now with sorting:

$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path;
 user_id |   path
---------+--------------------------
  6 | 0001.0003
  8 | 0001.0003.0001
  9 | 0001.0003.0001.0001
  9 | 0001.0003.0001.0002
  9 | 0001.0003.0001.0003
  9 | 0001.0003.0002
  11 | 0001.0003.0002.0001
  2 | 0001.0003.0002.0002
  7 | 0001.0003.0002.0002.0001
  20 | 0001.0003.0002.0002.0002
  31 | 0001.0003.0002.0002.0003
  22 | 0001.0003.0002.0002.0004
  34 | 0001.0003.0002.0002.0005
  22 | 0001.0003.0002.0002.0006
  5 | 0001.0003.0002.0003
(15 rows)
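
Sorting by path works well here because the labels are zero-padded. ltree orders the children of a node by label text, not by numeric value, so unpadded numeric labels would not sort the way you might expect. A small sketch with made-up values:

```sql
-- Labels are compared as text, so '10' sorts before '2':
-- this should list 1.10 ahead of 1.2.
SELECT p
FROM (VALUES ('1.2'::ltree), ('1.10'::ltree)) AS v(p)
ORDER BY p;
```

Padding every label to a fixed width, as this article does with four digits, avoids the problem.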

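Several modifiers can be appended to a non-star label in an lquery so that it matches more than just an exact label:
@ - match case-insensitively, for example a@ matches A
* - match any label with this prefix, for example foo* matches foobar
% - match initial underscore-separated words, for example foo_bar% matches foo_bar_baz but not foo_barbaz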
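
The modifiers can be tried out on literal values without touching the table. The following sketch uses the examples from the list above; the column aliases are arbitrary.

```sql
SELECT 'A'::ltree           ~ 'a@'::lquery        AS case_insensitive,    -- true
       'foobar'::ltree      ~ 'foo*'::lquery      AS prefix_match,        -- true
       'foo_bar_baz'::ltree ~ 'foo_bar%'::lquery  AS word_prefix,         -- true
       'foo_barbaz'::ltree  ~ 'foo_bar%'::lquery  AS not_a_word_prefix;   -- false
```

Note that the query that follows uses none of these modifiers: *{1,2} is a quantifier that matches one or two labels of any value, and 0001|0002 matches either label at that position.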

$ SELECT user_id, path FROM comments WHERE path ~ '0001.*{1,2}.0001|0002.*' ORDER by path;
 user_id |   path
---------+--------------------------
  2 | 0001.0001.0001
  2 | 0001.0001.0001.0001
  1 | 0001.0001.0001.0002
  5 | 0001.0001.0001.0003
  6 | 0001.0002.0001
  8 | 0001.0003.0001
  9 | 0001.0003.0001.0001
  9 | 0001.0003.0001.0002
  9 | 0001.0003.0001.0003
  9 | 0001.0003.0002
  11 | 0001.0003.0002.0001
  2 | 0001.0003.0002.0002
  7 | 0001.0003.0002.0002.0001
  20 | 0001.0003.0002.0002.0002
  31 | 0001.0003.0002.0002.0003
  22 | 0001.0003.0002.0002.0004
  34 | 0001.0003.0002.0002.0005
  22 | 0001.0003.0002.0002.0006
  5 | 0001.0003.0002.0003
(19 rows)

Let's find all direct children of the parent '0001.0003':

$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*{1}' ORDER by path;
 user_id |  path
---------+----------------
  8 | 0001.0003.0001
  9 | 0001.0003.0002
(2 rows)
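
The same result can also be expressed with the nlevel function instead of a quantifier. This is an alternative sketch, not the article's original method:

```sql
-- Direct children are descendants that sit exactly one level deeper.
SELECT user_id, path
FROM comments
WHERE path <@ '0001.0003'::ltree
  AND nlevel(path) = nlevel('0001.0003'::ltree) + 1
ORDER BY path;
```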

Find all descendants of the parent '0001.0003' (note that the pattern also matches the parent itself):

$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path;
 user_id |   path
---------+--------------------------
  6 | 0001.0003
  8 | 0001.0003.0001
  9 | 0001.0003.0001.0001
  9 | 0001.0003.0001.0002
  9 | 0001.0003.0001.0003
  9 | 0001.0003.0002
  11 | 0001.0003.0002.0001
  2 | 0001.0003.0002.0002
  7 | 0001.0003.0002.0002.0001
  20 | 0001.0003.0002.0002.0002
  31 | 0001.0003.0002.0002.0003
  22 | 0001.0003.0002.0002.0004
  34 | 0001.0003.0002.0002.0005
  22 | 0001.0003.0002.0002.0006
  5 | 0001.0003.0002.0003
(15 rows)
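
If the parent row itself should be left out, one possible variation is to require at least one more label after the prefix:

```sql
-- '*{1,}' matches one or more labels, so '0001.0003' itself is excluded
-- and only its descendants remain.
SELECT user_id, path
FROM comments
WHERE path ~ '0001.0003.*{1,}'::lquery
ORDER BY path;
```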

Find the parent of the child '0001.0003.0002.0002.0005':

$ SELECT user_id, path FROM comments WHERE path = subpath('0001.0003.0002.0002.0005', 0, -1) ORDER by path;
 user_id |  path
---------+---------------------
  2 | 0001.0003.0002.0002
(1 row)
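
The same subpath call can be used in a self-join to pair each row with its parent row. This is a sketch under the assumption that every parent path exists as a row in the table; the aliases child and parent are arbitrary.

```sql
-- subpath(path, 0, -1) drops the last label, giving the parent's path.
SELECT child.path      AS child_path,
       parent.user_id  AS parent_user_id,
       parent.path     AS parent_path
FROM comments AS child
JOIN comments AS parent
  ON parent.path = subpath(child.path, 0, -1)
WHERE child.path <@ '0001.0003.0002.0002'::ltree
ORDER BY child.path;
```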

If your paths are not unique, you will get multiple records.
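
If each path is meant to identify exactly one comment, a unique constraint is one way to enforce that. The constraint name below is hypothetical, and the statement fails if duplicate paths already exist.

```sql
-- Enforce one row per path. This builds a unique B-tree index,
-- which could stand in for the plain B-tree index created earlier.
ALTER TABLE comments
    ADD CONSTRAINT comments_path_key UNIQUE (path);
```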

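Summary

As you can see, working with ltree materialized paths is quite simple. This article does not list every possible use of ltree; in particular, it does not cover ltxtquery, the full-text-search-like query type. You can find these in the official PostgreSQL documentation (http://www.postgresql.org/docs/current/static/ltree.html).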

