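Handling hierarchical data in PostgreSQL with ltree
In this article we will learn how to use PostgreSQL's ltree module, which lets you store data in a hierarchical, tree-like structure.
What is ltree?
ltree is a PostgreSQL extension module. It implements a data type, ltree, for representing the labels of data stored in a hierarchical tree-like structure, and it provides extensive facilities for searching through label trees.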
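Why ltree?
- ltree implements a materialized path, which is very fast for INSERT / UPDATE / DELETE and reasonably fast for SELECT
- It is generally faster than a recursive CTE or a recursive function, which often have to recompute the branch
- It has a built-in query syntax and operators designed specifically for querying and navigating trees
- Indexes!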
Initial data
First, enable the extension in your database. You can do that with the following command:
CREATE EXTENSION ltree;
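As a quick sanity check after enabling the extension, a literal can be cast to ltree and inspected. The sketch below uses a made-up path, Top.Science.Astronomy, purely for illustration; IF NOT EXISTS only makes the statement safe to run again.

```sql
-- Does nothing if the extension is already installed in this database.
CREATE EXTENSION IF NOT EXISTS ltree;

-- A label path is a dot-separated sequence of labels (illustrative value).
SELECT 'Top.Science.Astronomy'::ltree         AS path,
       nlevel('Top.Science.Astronomy'::ltree) AS depth;  -- 3 labels deep
```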
Let's create a table and add some data to it:
CREATE TABLE comments (user_id integer, description text, path ltree);
INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0001.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0001.0001.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001.0001.0001.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 5, md5(random()::text), '0001.0001.0001.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 8, md5(random()::text), '0001.0003.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 11, md5(random()::text), '0001.0003.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0003.0002.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 5, md5(random()::text), '0001.0003.0002.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 7, md5(random()::text), '0001.0003.0002.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 20, md5(random()::text), '0001.0003.0002.0002.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 31, md5(random()::text), '0001.0003.0002.0002.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 22, md5(random()::text), '0001.0003.0002.0002.0004');
INSERT INTO comments (user_id, description, path) VALUES ( 34, md5(random()::text), '0001.0003.0002.0002.0005');
INSERT INTO comments (user_id, description, path) VALUES ( 22, md5(random()::text), '0001.0003.0002.0002.0006');
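We should also add some indexes:
CREATE INDEX path_gist_comments_idx ON comments USING GIST(path);
CREATE INDEX path_comments_idx ON comments USING btree(path);
As you can see, I created the comments table with a path column that holds the full tree path of each row. Each label in the path is four digits long, and labels are separated by dots.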
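Because every label is a zero-padded four-digit number, the path of a new child can be built by appending a padded counter to the parent's path. This is a minimal sketch: the child number 4 is a hypothetical value that the application would have to allocate, since the article does not prescribe how to pick it.

```sql
-- ltree || text appends one label to a path.
-- lpad() pads the (hypothetical) child number 4 to four digits.
SELECT '0001.0003'::ltree || lpad(4::text, 4, '0') AS new_child_path;
-- builds the path 0001.0003.0004
```

Labels are compared as strings rather than as numbers, so the zero padding is what keeps ORDER BY path in line with the numeric order of the siblings.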
Let's find the rows in the comments table whose path starts with '0001.0003', that is, that node and all of its descendants:
$ SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
 user_id |           path
---------+--------------------------
       6 | 0001.0003
       8 | 0001.0003.0001
       9 | 0001.0003.0002
      11 | 0001.0003.0002.0001
       2 | 0001.0003.0002.0002
       5 | 0001.0003.0002.0003
       7 | 0001.0003.0002.0002.0001
      20 | 0001.0003.0002.0002.0002
      31 | 0001.0003.0002.0002.0003
      22 | 0001.0003.0002.0002.0004
      34 | 0001.0003.0002.0002.0005
      22 | 0001.0003.0002.0002.0006
(12 rows)
Let's check this query with the EXPLAIN command:
$ EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on comments  (cost=0.00..1.24 rows=2 width=38) (actual time=0.013..0.017 rows=12 loops=1)
   Filter: (path <@ '0001.0003'::ltree)
   Rows Removed by Filter: 7
 Total runtime: 0.038 ms
(4 rows)
Let's disable sequential scans for testing:
$ SET enable_seqscan=false;
SET
$ EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Index Scan using path_gist_comments_idx on comments  (cost=0.00..8.29 rows=2 width=38) (actual time=0.023..0.034 rows=12 loops=1)
   Index Cond: (path <@ '0001.0003'::ltree)
 Total runtime: 0.076 ms
(3 rows)
The query is now slower, but we can see how it uses the index.
The first query used a sequential scan because the table holds very little data.
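Disabling sequential scans is only useful for this kind of experiment. A short sketch of how the setting might be restored afterwards, so that the planner is free to choose the cheaper plan again:

```sql
-- Put the planner setting back to its default for this session.
RESET enable_seqscan;

-- Confirm the current value.
SHOW enable_seqscan;
```

SET only affects the current session, so other connections were never forced onto the index in the first place.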
The condition path <@ '0001.0003' can also be written as an lquery match:
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*';
 user_id |           path
---------+--------------------------
       6 | 0001.0003
       8 | 0001.0003.0001
       9 | 0001.0003.0002
      11 | 0001.0003.0002.0001
       2 | 0001.0003.0002.0002
       5 | 0001.0003.0002.0003
       7 | 0001.0003.0002.0002.0001
      20 | 0001.0003.0002.0002.0002
      31 | 0001.0003.0002.0002.0003
      22 | 0001.0003.0002.0002.0004
      34 | 0001.0003.0002.0002.0005
      22 | 0001.0003.0002.0002.0006
(12 rows)
Don't forget about the order of the rows. Without an ORDER BY clause they are not returned in tree order, as the following example shows:
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0001');
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0002');
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0003');
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*';
 user_id |           path
---------+--------------------------
       6 | 0001.0003
       8 | 0001.0003.0001
       9 | 0001.0003.0002
      11 | 0001.0003.0002.0001
       2 | 0001.0003.0002.0002
       5 | 0001.0003.0002.0003
       7 | 0001.0003.0002.0002.0001
      20 | 0001.0003.0002.0002.0002
      31 | 0001.0003.0002.0002.0003
      22 | 0001.0003.0002.0002.0004
      34 | 0001.0003.0002.0002.0005
      22 | 0001.0003.0002.0002.0006
       9 | 0001.0003.0001.0001
       9 | 0001.0003.0001.0002
       9 | 0001.0003.0001.0003
(15 rows)
Now with sorting:
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path;
 user_id |           path
---------+--------------------------
       6 | 0001.0003
       8 | 0001.0003.0001
       9 | 0001.0003.0001.0001
       9 | 0001.0003.0001.0002
       9 | 0001.0003.0001.0003
       9 | 0001.0003.0002
      11 | 0001.0003.0002.0001
       2 | 0001.0003.0002.0002
       7 | 0001.0003.0002.0002.0001
      20 | 0001.0003.0002.0002.0002
      31 | 0001.0003.0002.0002.0003
      22 | 0001.0003.0002.0002.0004
      34 | 0001.0003.0002.0002.0005
      22 | 0001.0003.0002.0002.0006
       5 | 0001.0003.0002.0003
(15 rows)
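Once the rows come back in path order, the depth of each path is enough to render them as an indented thread. A minimal sketch, assuming an indent of four spaces per level; the thread column name is only illustrative:

```sql
-- nlevel() returns the number of labels in a path, so subtracting the
-- depth of the subtree root gives each row's depth relative to that root.
SELECT repeat('    ', nlevel(path) - nlevel('0001.0003'::ltree))
         || user_id::text AS thread,
       path
FROM comments
WHERE path <@ '0001.0003'
ORDER BY path;
```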
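Several modifiers can be placed at the end of a non-star label in an lquery to make it match more than just an exact match:
"@" - match case-insensitively, for example a@ matches A
"*" - match any label with this prefix, for example foo* matches foobar
"%" - match initial underscore-separated words, for example foo_bar% matches foo_bar_baz but not foo_barbaz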
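The comments table only contains numeric labels, so the modifiers are easier to try on literal values. The sketch below uses made-up paths under Top and the examples given in the list above; the explicit ::lquery casts are there only to make the operand types obvious.

```sql
SELECT 'Top.A'::ltree           ~ 'Top.a@'::lquery       AS case_insensitive,  -- true
       'Top.foobar'::ltree      ~ 'Top.foo*'::lquery     AS prefix,            -- true
       'Top.foo_bar_baz'::ltree ~ 'Top.foo_bar%'::lquery AS underscore_words,  -- true
       'Top.foo_barbaz'::ltree  ~ 'Top.foo_bar%'::lquery AS not_a_whole_word;  -- false
```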
The following query combines two other lquery features: the quantifier *{1,2}, which matches one or two labels of any value, and the alternation 0001|0002, which matches either label:
$ SELECT user_id, path FROM comments WHERE path ~ '0001.*{1,2}.0001|0002.*' ORDER by path;
 user_id |           path
---------+--------------------------
       2 | 0001.0001.0001
       2 | 0001.0001.0001.0001
       1 | 0001.0001.0001.0002
       5 | 0001.0001.0001.0003
       6 | 0001.0002.0001
       8 | 0001.0003.0001
       9 | 0001.0003.0001.0001
       9 | 0001.0003.0001.0002
       9 | 0001.0003.0001.0003
       9 | 0001.0003.0002
      11 | 0001.0003.0002.0001
       2 | 0001.0003.0002.0002
       7 | 0001.0003.0002.0002.0001
      20 | 0001.0003.0002.0002.0002
      31 | 0001.0003.0002.0002.0003
      22 | 0001.0003.0002.0002.0004
      34 | 0001.0003.0002.0002.0005
      22 | 0001.0003.0002.0002.0006
       5 | 0001.0003.0002.0003
(19 rows)
Let's find all direct children of the parent '0001.0003':
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*{1}' ORDER by path;
 user_id |      path
---------+----------------
       8 | 0001.0003.0001
       9 | 0001.0003.0002
(2 rows)
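The same set of direct children can also be expressed without lquery, by combining the descendant operator with a depth check. This is just an alternative sketch of the query above, not a recommendation of one form over the other:

```sql
-- Descendants of 0001.0003 that sit exactly one level below it.
SELECT user_id, path
FROM comments
WHERE path <@ '0001.0003'
  AND nlevel(path) = nlevel('0001.0003'::ltree) + 1
ORDER BY path;
```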
Find all descendants of the parent '0001.0003'. The .* part also matches zero labels, so the parent row itself is included:
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path;
 user_id |           path
---------+--------------------------
       6 | 0001.0003
       8 | 0001.0003.0001
       9 | 0001.0003.0001.0001
       9 | 0001.0003.0001.0002
       9 | 0001.0003.0001.0003
       9 | 0001.0003.0002
      11 | 0001.0003.0002.0001
       2 | 0001.0003.0002.0002
       7 | 0001.0003.0002.0002.0001
      20 | 0001.0003.0002.0002.0002
      31 | 0001.0003.0002.0002.0003
      22 | 0001.0003.0002.0002.0004
      34 | 0001.0003.0002.0002.0005
      22 | 0001.0003.0002.0002.0006
       5 | 0001.0003.0002.0003
(15 rows)
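If the row for '0001.0003' itself should be left out, the star can be given a lower bound. A small sketch using the *{1,} quantifier, which requires at least one label after the prefix:

```sql
-- Strict descendants only: at least one label must follow 0001.0003.
SELECT user_id, path
FROM comments
WHERE path ~ '0001.0003.*{1,}'
ORDER BY path;
```

With the data used in this article, this should return the same rows as the previous query minus the row for 0001.0003.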
Find the parent of the child '0001.0003.0002.0002.0005':
$ SELECT user_id, path FROM comments WHERE path = subpath('0001.0003.0002.0002.0005', 0, -1) ORDER by path;
 user_id |        path
---------+---------------------
       2 | 0001.0003.0002.0002
(1 row)
If your paths are not unique, you will get multiple rows.
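The subpath() call above goes up exactly one level. To walk the whole chain of ancestors up to the root, the ancestor operator @> can be used instead. A minimal sketch against the same table:

```sql
-- Every stored row whose path is an ancestor of the given node.
-- @> also matches the node itself, so it appears as the last row.
SELECT user_id, path
FROM comments
WHERE path @> '0001.0003.0002.0002.0005'::ltree
ORDER BY path;
```

Only ancestors that actually exist as rows are returned. In this data set '0001.0001.0001' has no stored parent row '0001.0001', so the same query for that branch would skip that level.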
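Summary
As you can see, working with materialized paths through ltree is quite simple. This article does not list every possible use of ltree; in particular, it does not cover full-text-style searching with ltxtquery. You can find all of that in the official PostgreSQL documentation (http://www.postgresql.org/docs/current/static/ltree.html).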