Learning Elasticsearch: Multi-Condition Compound Queries, Query Validation, and Worked Examples
Multi-condition compound queries
bool
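Elasticsearch uses the bool query to combine multiple conditions. A bool query supports the following clauses:
- must: a document has to match the clause, and the match contributes to the score.
- must_not: a document must not match the clause.
- should: a document should match the clause. Matching should clauses raise the document's score. Note that if the bool query has no must or filter clause, a document has to match at least one should clause. If a must or filter clause is present, should clauses are optional by default (minimum_should_match can change this) and only adjust the score.
- filter: a document has to match the clause, but the clause takes no part in scoring. It is used purely for filtering.
The following example shows how to use them: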
GET class_1/_search { "query": { "bool": { "must": [ {"match": { "name": "apple" }} ], "must_not": [ {"term": { "num": { "value": "5" } }} ], "should": [ {"match": { "name": "k" }} ],"filter": [ {"range": { "num": { "gte": 0, "lte": 10 } }} ] } } }
Response:
{ "took" : 9, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 0.752627, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "b8fcCoYB090miyjed7YE", "_score" : 0.752627, "_source" : { "name" : "I eat apple so haochi1~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "ccfcCoYB090miyjed7YE", "_score" : 0.752627, "_source" : { "name" : "I eat apple so haochi3~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "cMfcCoYB090miyjed7YE", "_score" : 0.7389809, "_source" : { "name" : "I eat apple so zhen haochi2~", "num" : 1 } } ] } }
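To make the clause rules concrete, here is a minimal Python sketch that models only which documents a bool query selects. It is a toy model that ignores scoring and analysis, not how Elasticsearch is implemented. The helper names (bool_matches, name_has, num_is, num_between) are invented for this illustration.

```python
from typing import Callable, Dict, Sequence

Doc = Dict[str, object]
Clause = Callable[[Doc], bool]


def bool_matches(doc: Doc,
                 must: Sequence[Clause] = (),
                 must_not: Sequence[Clause] = (),
                 should: Sequence[Clause] = (),
                 filter_: Sequence[Clause] = ()) -> bool:
    """Toy model of bool query selection (hypothetical helper, no scoring)."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # should is only required when there is neither a must nor a filter clause.
    if should and not must and not filter_:
        return any(clause(doc) for clause in should)
    return True


def name_has(word: str) -> Clause:
    # Crude stand-in for a match query: whitespace tokens, lowercased.
    return lambda doc: word in str(doc.get("name", "")).lower().split()


def num_is(value: int) -> Clause:
    return lambda doc: doc.get("num") == value


def num_between(low: int, high: int) -> Clause:
    def clause(doc: Doc) -> bool:
        num = doc.get("num")
        return isinstance(num, (int, float)) and low <= num <= high
    return clause


query = dict(
    must=[name_has("apple")],
    must_not=[num_is(5)],
    should=[name_has("k")],
    filter_=[num_between(0, 10)],
)
doc = {"name": "I eat apple so haochi1~", "num": 1}
print(bool_matches(doc, **query))                   # should is optional here
print(bool_matches(doc, should=[name_has("k")]))    # should is required here
```

The first call selects the document even though the should clause does not match, because a must clause is present. The second call rejects it, because a lone should clause has to match.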
constant_score
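A constant_score query wraps a filter and gives every matching document the same fixed score, which is set with boost. It is typically used in place of a bool query that contains only a filter clause.
Here is a concrete example: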
GET class_1/_search { "query": { "constant_score": { "filter": { "term": { "num": 6 } }, "boost": 1.2 } } }
Response:
{ "took" : 7, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 1.2, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "h2Fg-4UBECmbBdQA6VLg", "_score" : 1.2, "_source" : { "name" : "b", "num" : 6 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_score" : 1.2, "_source" : { "name" : "l", "num" : 6 } } ] } }
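As a small sketch of that relationship, the Python helpers below build the constant_score request body and the filter-only bool body it usually replaces. Only the standard library is used and nothing is sent to a cluster. The function names are made up for this example.

```python
import json


def constant_score_body(filter_clause: dict, boost: float = 1.0) -> dict:
    """Build a constant_score search body (hypothetical helper)."""
    return {"query": {"constant_score": {"filter": filter_clause, "boost": boost}}}


def bool_filter_body(filter_clause: dict) -> dict:
    """Build the filter-only bool body that constant_score usually stands in for."""
    return {"query": {"bool": {"filter": [filter_clause]}}}


term = {"term": {"num": 6}}
print(json.dumps(constant_score_body(term, boost=1.2)))
print(json.dumps(bool_filter_body(term)))
```

Both bodies select the same documents. The difference is only in the score attached to each hit: the constant_score form reports the boost value, as the response above shows.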
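Query validation and analysis
Validation
Elasticsearch provides the /_validate/query endpoint for checking whether a query is valid. Note that it validates the query definition itself and does not execute the search.
Example: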
GET class_1/_validate/query?explain { "query": { "bool": { "must": [ {"match": { "name": "apple" }} ] } } }
Normal response:
{ "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 }, "valid" : true, "explanations" : [ { "index" : "class_1", "valid" : true, "explanation" : "+name:apple" } ] }
Change the name field to name1 and validate again:
{ "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 }, "valid" : true, "explanations" : [ { "index" : "class_1", "valid" : true, "explanation" : """+MatchNoDocsQuery("unmapped fields [name1]")""" } ] }
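Note that valid is still true, because the query is syntactically valid. The explanation, however, shows that it was rewritten to MatchNoDocsQuery("unmapped fields [name1]"): name1 is not in the mapping, so the query cannot match any document.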
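Because valid stays true in this case, a client that only checks that flag would miss the problem. The following Python sketch reads a validate response that has already been parsed into a dict and flags explanations that mention MatchNoDocsQuery. The function name and the returned field names are assumptions made for this example, and the sample data is the response shown above.

```python
def summarize_validation(response: dict) -> list:
    """Summarize a _validate/query?explain response (hypothetical helper)."""
    rows = []
    for item in response.get("explanations", []):
        explanation = item.get("explanation", "")
        rows.append({
            "index": item.get("index"),
            "valid": item.get("valid", False),
            # True when the query was rewritten to a query that matches nothing.
            "matches_nothing": "MatchNoDocsQuery" in explanation,
            "explanation": explanation,
        })
    return rows


sample = {
    "_shards": {"total": 1, "successful": 1, "failed": 0},
    "valid": True,
    "explanations": [
        {
            "index": "class_1",
            "valid": True,
            "explanation": '+MatchNoDocsQuery("unmapped fields [name1]")',
        }
    ],
}

for row in summarize_validation(sample):
    print(row["index"], row["valid"], row["matches_nothing"])
```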
Analysis
Adding the explain parameter (/_validate/query?explain) makes Elasticsearch return how it interprets the query.
Example:
GET class_1/_validate/query?explain { "query": { "bool": { "must": [ {"match": { "name": "apple so" }} ] } } }
Response:
{ "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 }, "valid" : true, "explanations" : [ { "index" : "class_1", "valid" : true, "explanation" : "+(name:apple name:so)" } ] }
The explanation "+(name:apple name:so)" shows that the query text apple so was analyzed into two terms, name:apple and name:so.
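The shape of that explanation string can be imitated with a few lines of Python. This is only a rough approximation for plain ASCII text in a single match clause under must, as in the example above. It lowercases and splits on non-alphanumeric characters, which is an assumption standing in for the real analyzer, and the function name is invented.

```python
import re


def approximate_match_explanation(field: str, text: str) -> str:
    """Imitate the explanation string for one match clause under must (sketch)."""
    # Assumption: lowercase and split on anything that is not a-z or 0-9.
    tokens = [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]
    terms = [f"{field}:{token}" for token in tokens]
    if not terms:
        return ""  # nothing to show in this sketch
    if len(terms) == 1:
        return f"+{terms[0]}"
    return "+(" + " ".join(terms) + ")"


print(approximate_match_explanation("name", "apple"))
print(approximate_match_explanation("name", "apple so"))
```

For anything beyond simple cases, the output of /_validate/query?explain is the source of truth, since the result depends on the analyzer configured for the field.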
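Sorting
Default sorting
In the earlier examples, results are sorted by _score in descending order by default, so the best matches come first. Computing _score costs query performance, though, which is easy to understand.
When the _score is not needed, build the query with filter or constant_score instead.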
filter
Example:
GET class_1/_search { "query": { "bool": { "filter": [ {"term": { "num": 1 }} ] } } }
Response:
{ "took" : 5, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 0.0, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "b8fcCoYB090miyjed7YE", "_score" : 0.0, "_source" : { "name" : "I eat apple so haochi1~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "ccfcCoYB090miyjed7YE", "_score" : 0.0, "_source" : { "name" : "I eat apple so haochi3~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "cMfcCoYB090miyjed7YE", "_score" : 0.0, "_source" : { "name" : "I eat apple so zhen haochi2~", "num" : 1 } } ] } }
The results show that every _score is 0.0, which means no relevance score was computed.
constant_score
Example:
GET class_1/_search { "query": { "constant_score": { "filter": { "term": { "num": 1 } }, "boost": 1.2 } } }
Response:
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 1.2, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "b8fcCoYB090miyjed7YE", "_score" : 1.2, "_source" : { "name" : "I eat apple so haochi1~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "ccfcCoYB090miyjed7YE", "_score" : 1.2, "_source" : { "name" : "I eat apple so haochi3~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "cMfcCoYB090miyjed7YE", "_score" : 1.2, "_source" : { "name" : "I eat apple so zhen haochi2~", "num" : 1 } } ] } }
As you can see, every returned score is the value specified with boost.
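Custom sorting
A custom sort order covers most use cases. So how do you define one in Elasticsearch? Use the sort parameter. Sorting on a field is ascending by default, and descending order has to be requested explicitly:
- Ascending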
{"sort":["num"]}
- Descending (desc stands for descending order)
{"sort":[{"num":{"order":"desc"}}]}
tips
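- Elasticsearch relies on doc values, a column-oriented storage structure, to sort on a field.
- text fields do not have doc values, so they cannot be sorted on directly.
- Setting fielddata=true on a text field enables sorting on it, but this is not recommended: sorting on a text field is extremely expensive and rarely gives the order you actually want.
Single-field sorting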
GET class_1/_search { "sort": [ "num" ] }
Response:
{ "took" : 6, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 11, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "b8fcCoYB090miyjed7YE", "_score" : null, "_source" : { "name" : "I eat apple so haochi1~", "num" : 1 }, "sort" : [ 1 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "ccfcCoYB090miyjed7YE", "_score" : null, "_source" : { "name" : "I eat apple so haochi3~", "num" : 1 }, "sort" : [ 1 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "cMfcCoYB090miyjed7YE", "_score" : null, "_source" : { "name" : "I eat apple so zhen haochi2~", "num" : 1 }, "sort" : [ 1 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "h2Fg-4UBECmbBdQA6VLg", "_score" : null, "_source" : { "name" : "b", "num" : 6 }, "sort" : [ 6 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_score" : null, "_source" : { "name" : "l", "num" : 6 }, "sort" : [ 6 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "3", "_score" : null, "_source" : { "num" : 9, "name" : "e", "age" : 9, "desc" : [ "hhhh" ] }, "sort" : [ 9 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "4", "_score" : null, "_source" : { "name" : "f", "age" : 10, "num" : 10 }, "sort" : [ 10 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "RWlfBIUBDuA8yW5cu9wu", "_score" : null, "_source" : { "name" : "一年級", "num" : 20 }, "sort" : [ 20 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "iGFt-4UBECmbBdQAnVJe", "_score" : null, "_source" : { "name" : "g", "age" : 8 }, "sort" : [ 9223372036854775807 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "iWFt-4UBECmbBdQAnVJg", "_score" : null, "_source" : { "name" : "h", "age" : 9 }, "sort" : [ 9223372036854775807 ] } ] } }
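The results are sorted by num in ascending order, the default. Documents that have no num field come last, and their sort value is shown as 9223372036854775807.
Now the descending version: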
GET class_1/_search { "sort": [ {"num": {"order":"desc"}} ] }
Response:
{ "took" : 15, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 11, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "RWlfBIUBDuA8yW5cu9wu", "_score" : null, "_source" : { "name" : "一年級", "num" : 20 }, "sort" : [ 20 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "4", "_score" : null, "_source" : { "name" : "f", "age" : 10, "num" : 10 }, "sort" : [ 10 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "3", "_score" : null, "_source" : { "num" : 9, "name" : "e", "age" : 9, "desc" : [ "hhhh" ] }, "sort" : [ 9 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "h2Fg-4UBECmbBdQA6VLg", "_score" : null, "_source" : { "name" : "b", "num" : 6 }, "sort" : [ 6 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_score" : null, "_source" : { "name" : "l", "num" : 6 }, "sort" : [ 6 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "b8fcCoYB090miyjed7YE", "_score" : null, "_source" : { "name" : "I eat apple so haochi1~", "num" : 1 }, "sort" : [ 1 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "ccfcCoYB090miyjed7YE", "_score" : null, "_source" : { "name" : "I eat apple so haochi3~", "num" : 1 }, "sort" : [ 1 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "cMfcCoYB090miyjed7YE", "_score" : null, "_source" : { "name" : "I eat apple so zhen haochi2~", "num" : 1 }, "sort" : [ 1 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "iGFt-4UBECmbBdQAnVJe", "_score" : null, "_source" : { "name" : "g", "age" : 8 }, "sort" : [ -9223372036854775808 ] }, { "_index" : "class_1", "_type" : "_doc", "_id" : "iWFt-4UBECmbBdQAnVJg", "_score" : null, "_source" : { "name" : "h", "age" : 9 }, "sort" : [ -9223372036854775808 ] } ] } }
Now the results are sorted in descending order.
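The placeholder sort values in the two responses (9223372036854775807 when ascending, -9223372036854775808 when descending) keep documents without num at the end in both directions. The Python sketch below imitates that observed behaviour for an integer field on an in-memory list. It is a local illustration only, and the function names are invented.

```python
LONG_MAX = 2**63 - 1   # 9223372036854775807
LONG_MIN = -2**63      # -9223372036854775808


def sort_value(doc: dict, field: str, order: str = "asc") -> int:
    """Sort value for a doc; a missing field gets the extreme that sorts last."""
    if field in doc:
        return doc[field]
    return LONG_MAX if order == "asc" else LONG_MIN


def sort_docs(docs: list, field: str, order: str = "asc") -> list:
    """Sort docs by field, keeping docs without the field at the end."""
    return sorted(docs,
                  key=lambda doc: sort_value(doc, field, order),
                  reverse=(order == "desc"))


docs = [
    {"name": "g", "age": 8},
    {"name": "b", "num": 6},
    {"name": "f", "age": 10, "num": 10},
    {"name": "I eat apple so haochi1~", "num": 1},
]

print([d.get("num") for d in sort_docs(docs, "num")])
print([d.get("num") for d in sort_docs(docs, "num", "desc")])
```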
Multi-field sorting
GET class_1/_search { "sort": [ "num", "age" ] }
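With several sort fields, the first field decides the order and each later field only breaks ties among documents with equal earlier values. The helper below is a small Python sketch for building such a sort list with a mix of default and explicit orders. The build_sort name and its argument convention are assumptions made for this example.

```python
import json


def build_sort(*fields) -> list:
    """Build a sort list from 'field' or ('field', 'asc' | 'desc') entries."""
    clauses = []
    for entry in fields:
        if isinstance(entry, str):
            # Bare field name: the server applies the default order.
            clauses.append(entry)
            continue
        name, order = entry
        if order not in ("asc", "desc"):
            raise ValueError(f"unknown sort order: {order!r}")
        clauses.append({name: {"order": order}})
    return clauses


body = {"sort": build_sort(("num", "desc"), "age")}
print(json.dumps(body))
```

The resulting body sorts by num in descending order first, then by age in ascending order among documents that share the same num.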
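Scroll pagination
Remember the from + size pagination covered earlier? By default, Elasticsearch limits from + size paging to 10000 results (the index.max_result_window setting). When you need to fetch larger volumes of data in bulk, from + size becomes very expensive.
In many real applications the data volume is huge, for example when querying system logs. For these cases Elasticsearch offers the scroll API (the scroll parameter on _search, followed by requests to /_search/scroll) for scrolling through results. It works like taking a snapshot of the data at the moment the query starts, covering every shard, and keeping it for a while. Once the configured keep-alive time has passed, the snapshot is removed.
Keep in mind that because the query runs against a snapshot, changes made to the data after the snapshot was created are not visible in the current scroll.
Initialize the snapshot and keep it for 10 minutes
Query example: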
GET class_1/_search?scroll=10m { "query": { "match_phrase": { "name": "apple" } }, "size": 2 }
Response:
{ "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==", "took" : 6, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 0.752627, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "b8fcCoYB090miyjed7YE", "_score" : 0.752627, "_source" : { "name" : "I eat apple so haochi1~", "num" : 1 } }, { "_index" : "class_1", "_type" : "_doc", "_id" : "ccfcCoYB090miyjed7YE", "_score" : 0.752627, "_source" : { "name" : "I eat apple so haochi3~", "num" : 1 } } ] } }
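As shown above, this request returned 2 documents along with a snapshot ID (_scroll_id), which later requests use to continue scrolling:
Scroll using the snapshot ID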
GET /_search/scroll { "scroll": "10m", "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==" }
Response:
{ "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==", "took" : 6, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 0.752627, "hits" : [ { "_index" : "class_1", "_type" : "_doc", "_id" : "cMfcCoYB090miyjed7YE", "_score" : 0.7389809, "_source" : { "name" : "I eat apple so zhen haochi2~", "num" : 1 } } ] } }
Scroll once more:
{ "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==", "took" : 1, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 0.752627, "hits" : [ ] } }
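It may not be obvious how the scrolling advances, since every follow-up request uses the same scroll_id. The results make it clear:
- The first request created a snapshot kept for 10 minutes, set the page size to 2, and returned the first 2 documents.
- The next request, made with the scroll_id, returned 1 document, because the snapshot holds only 3 documents and 2 had already been returned.
- The following request returned an empty list, because all the data had been read.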
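The three steps above amount to a simple loop: fetch the first page, then keep requesting the next page until the hits list comes back empty. The Python sketch below shows that loop with the HTTP calls abstracted into two callables, so it runs without a cluster. The names scroll_all, first_page, next_page, and make_fake_cluster are invented for this illustration, and the fake pages only mimic the response shape shown above.

```python
from typing import Callable, Dict, Iterator


def scroll_all(first_page: Callable[[], Dict],
               next_page: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every hit, page by page, until a page comes back empty."""
    page = first_page()
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Pass along the _scroll_id from the most recent response.
        page = next_page(page["_scroll_id"])


def make_fake_cluster(doc_ids: list, size: int):
    """In-memory stand-in for the two requests (not a real client)."""
    state = {"pos": 0}

    def page() -> Dict:
        start = state["pos"]
        state["pos"] = start + size
        hits = [{"_id": doc_id} for doc_id in doc_ids[start:start + size]]
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    return page, (lambda scroll_id: page())


first, following = make_fake_cluster(["doc-1", "doc-2", "doc-3"], size=2)
ids = [hit["_id"] for hit in scroll_all(first, following)]
print(ids)
```

In real use, first_page would send the initial _search?scroll=10m request and next_page would post the scroll_id to /_search/scroll. It is also good practice to release the snapshot with the clear scroll API (DELETE /_search/scroll) once the loop finishes instead of waiting for the keep-alive to expire.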