Learning Elasticsearch: Multi-Condition Compound Queries, Query Validation, and Worked Examples

2023-02-03 09:46:07
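Contents
Multi-condition compound queries
  bool
  constant_score
Query validation & analysis
  Validation
  Analysis
Sorting
  Default sorting
  Custom sorting
  tips
  Single-field sorting
  Multi-field sorting
scroll pagination
  Initialize the snapshot & keep it for 10 minutes
  Scroll using the snapshot ID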

Multi-condition compound queries

bool

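es uses bool to combine multiple query conditions. A bool query supports the following parameters:

    must: matching documents must satisfy this condition.
    must_not: matching documents must not satisfy this condition.
    should: matching documents should satisfy this condition. should clauses are used to adjust the relevance score of the results. Note that if the compound query contains no must (or filter) clause, a document has to match at least one should clause. If a must clause is present, the should clauses do not have to match and serve purely to adjust the score.
    filter: matching documents must satisfy this condition, but filter does not take part in scoring. It is used only to filter documents.

    The following example shows how to use them: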

    GET class_1/_search
    {
      "query": {
        "bool": {
          "must": [
            {"match": {
              "name": "apple"
            }}
          ],
          "must_not": [
            {"term": {
              "num": {
                "value": "5"
              }
            }}
          ],
          "should": [
            {"match": {
              "name": "k"
            }}
          ],
          "filter": [
            {"range": {
              "num": {
                "gte": 0,
                "lte": 10
              }
            }}
          ]
        }
      }
    }
    

    Response:

    {
      "took" : 9,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 3,
          "relation" : "eq"
        },
        "max_score" : 0.752627,
        "hits" : [
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "b8fcCoYB090miyjed7YE",
            "_score" : 0.752627,
            "_source" : {
              "name" : "I eat apple so haochi1~",
              "num" : 1
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "ccfcCoYB090miyjed7YE",
            "_score" : 0.752627,
            "_source" : {
              "name" : "I eat apple so haochi3~",
              "num" : 1
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "cMfcCoYB090miyjed7YE",
            "_score" : 0.7389809,
            "_source" : {
              "name" : "I eat apple so zhen haochi2~",
              "num" : 1
            }
          }
        ]
      }
    }
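The should rule described above can also be stated explicitly with the bool query's minimum_should_match parameter. Below is a minimal Python sketch that only builds the request body as a dict. The helper name build_bool_query is made up for illustration, and sending the body to es is left out.

```python
def build_bool_query(must=None, must_not=None, should=None, filter_=None,
                     minimum_should_match=None):
    """Assemble a bool query body, skipping clause lists that are empty."""
    clauses = {
        "must": must,
        "must_not": must_not,
        "should": should,
        "filter": filter_,
    }
    bool_body = {name: value for name, value in clauses.items() if value}
    if minimum_should_match is not None:
        # Require this many should clauses to match, instead of relying
        # on the default that depends on whether must/filter is present.
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


# Same clauses as the request above, but now at least one should clause
# has to match even though a must clause is present.
body = build_bool_query(
    must=[{"match": {"name": "apple"}}],
    must_not=[{"term": {"num": {"value": 5}}}],
    should=[{"match": {"name": "k"}}],
    filter_=[{"range": {"num": {"gte": 0, "lte": 10}}}],
    minimum_should_match=1,
)
```

With minimum_should_match set to 1, the three "apple" documents from the response above would presumably no longer match, since none of them contains the term k.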
    

    constant_score

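    A constant_score query assigns a fixed score, set through boost, to every matching document. In practice, constant_score is typically used in place of a bool query that contains only a filter.

    Here is how it is used: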

    GET class_1/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term": {
              "num": 6
            }
          },
          "boost": 1.2
        }
      }
    }
    

    Response:

    {
      "took" : 7,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 2,
          "relation" : "eq"
        },
        "max_score" : 1.2,
        "hits" : [
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "h2Fg-4UBECmbBdQA6VLg",
            "_score" : 1.2,
            "_source" : {
              "name" : "b",
              "num" : 6
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 1.2,
            "_source" : {
              "name" : "l",
              "num" : 6
            }
          }
        ]
      }
    }
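To make the relationship between the two forms concrete, here is a minimal Python sketch that rewrites a filter-only bool query as a constant_score query. The helper name to_constant_score is made up for illustration, and nothing is sent to es; the function only reshapes the request body.

```python
def to_constant_score(query_body, boost=1.0):
    """Rewrite a filter-only bool query as a constant_score query.

    Raises ValueError when the bool query has clauses other than filter,
    because those affect which documents match or how they are scored.
    """
    bool_body = query_body["query"]["bool"]
    if set(bool_body) != {"filter"}:
        raise ValueError("only filter-only bool queries can be converted")
    filters = bool_body["filter"]
    # constant_score wraps a single query: use the clause directly when
    # there is only one, otherwise keep the clauses inside a bool filter.
    if isinstance(filters, dict):
        inner = filters
    elif len(filters) == 1:
        inner = filters[0]
    else:
        inner = {"bool": {"filter": filters}}
    return {"query": {"constant_score": {"filter": inner, "boost": boost}}}


filter_only = {"query": {"bool": {"filter": [{"term": {"num": 6}}]}}}
converted = to_constant_score(filter_only, boost=1.2)
```

Here converted has the same shape as the constant_score request above. Both forms select the same documents; only the _score they report differs.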
    

    Query validation & analysis

    Validation

    es validates a query through the /_validate/query endpoint, without executing it.

    Example:

    GET class_1/_validate/query?explain
    {
      "query": {
        "bool": {
          "must": [
            {"match": {
              "name": "apple"
            }}
          ]
        }
      }
    }
    

    Normal response:

    {
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "valid" : true,
      "explanations" : [
        {
          "index" : "class_1",
          "valid" : true,
          "explanation" : "+name:apple"
        }
      ]
    }
    

    Change the name field to name1 and run the request again:

    {
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "valid" : true,
      "explanations" : [
        {
          "index" : "class_1",
          "valid" : true,
          "explanation" : """+MatchNoDocsQuery("unmapped fields [name1]")"""
        }
      ]
    }
    

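    Note that valid is still true, so the request itself does not fail. The explanation, however, reports name1 as an unmapped field: the query has been rewritten to MatchNoDocsQuery and will match no documents.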
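Because valid stays true in this case, a script that checks queries automatically has to look at the explanation text as well. The following is a minimal Python sketch that works on an already parsed response; the helper name find_match_no_docs is made up for illustration, and the sample dict is a trimmed copy of the response above.

```python
def find_match_no_docs(validate_response):
    """Return (index, explanation) pairs for queries that are valid
    but were rewritten to MatchNoDocsQuery, so they match nothing."""
    findings = []
    for item in validate_response.get("explanations", []):
        explanation = item.get("explanation", "")
        if "MatchNoDocsQuery" in explanation:
            findings.append((item.get("index"), explanation))
    return findings


# Trimmed copy of the response returned for the name1 query.
response = {
    "valid": True,
    "explanations": [
        {
            "index": "class_1",
            "valid": True,
            "explanation": '+MatchNoDocsQuery("unmapped fields [name1]")',
        }
    ],
}

findings = find_match_no_docs(response)
```

For the name query the list is empty, while for name1 it contains one entry for class_1.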

    Analysis

    es shows how a query will be interpreted through the /_validate/query?explain endpoint.

    Example:

    GET class_1/_validate/query?explain
    {
      "query": {
        "bool": {
          "must": [
            {"match": {
              "name": "apple so"
            }}
          ]
        }
      }
    }
    

    Response:

    {
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "valid" : true,
      "explanations" : [
        {
          "index" : "class_1",
          "valid" : true,
          "explanation" : "+(name:apple name:so)"
        }
      ]
    }
    

    The explanation "+(name:apple name:so)" shows that the query text apple so has been tokenized into two terms, name:apple and name:so.
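To picture what that explanation means, here is a rough Python sketch that mimics the rewrite. It assumes a simple lowercase-and-split-on-whitespace tokenizer as a stand-in for the field's real analyzer, which can behave differently, and both helper names are made up for illustration.

```python
def tokenize(text):
    """Stand-in tokenizer (an assumption): lowercase, split on whitespace."""
    return text.lower().split()


def match_as_should_terms(field, text):
    """Express a match query as a bool of optional per-token term clauses."""
    return {"bool": {"should": [{"term": {field: token}} for token in tokenize(text)]}}


def lucene_style(field, text):
    """Render the tokens the way the explanation field prints them."""
    return "(" + " ".join(f"{field}:{token}" for token in tokenize(text)) + ")"


rewritten = match_as_should_terms("name", "apple so")
printed = "+" + lucene_style("name", "apple so")
```

The leading + marks the must clause, and the two terms inside the parentheses are optional alternatives, so a document containing either apple or so can match.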

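    Sorting

    Default sorting

    In the earlier examples the results are ordered by _score in descending order by default, so better matches come first. Computing _score, however, costs query performance.

    When _score is not needed, build the query with filter or constant_score instead.

    filter example: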

    GET class_1/_search
    {
      "query": {
        "bool": {
          "filter": [
            {"term": {
              "num": 1
            }}
          ]
        }
      }
    }
    

    Response:

    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 3,
          "relation" : "eq"
        },
        "max_score" : 0.0,
        "hits" : [
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "b8fcCoYB090miyjed7YE",
            "_score" : 0.0,
            "_source" : {
              "name" : "I eat apple so haochi1~",
              "num" : 1
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "ccfcCoYB090miyjed7YE",
            "_score" : 0.0,
            "_source" : {
              "name" : "I eat apple so haochi3~",
              "num" : 1
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "cMfcCoYB090miyjed7YE",
            "_score" : 0.0,
            "_source" : {
              "name" : "I eat apple so zhen haochi2~",
              "num" : 1
            }
          }
        ]
      }
    }
    

    Every hit has a _score of 0.0, which shows that no score was computed.

    constant_score example:

    GET class_1/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "term": {
              "num": 1
            }
          },
          "boost": 1.2
        }
      }
    }
    

    Response:

    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 3,
          "relation" : "eq"
        },
        "max_score" : 1.2,
        "hits" : [
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "b8fcCoYB090miyjed7YE",
            "_score" : 1.2,
            "_source" : {
              "name" : "I eat apple so haochi1~",
              "num" : 1
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "ccfcCoYB090miyjed7YE",
            "_score" : 1.2,
            "_source" : {
              "name" : "I eat apple so haochi3~",
              "num" : 1
            }
          },
          {
            "_index" : "class_1",
            "_type" : "_doc",
            "_id" : "cMfcCoYB090miyjed7YE",
            "_score" : 1.2,
            "_source" : {
              "name" : "I eat apple so zhen haochi2~",
              "num" : 1
            }
          }
        ]
      }
    }
    

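    As the response shows, every hit carries the score set through the boost property.

    Custom sorting

    Custom sorting is useful in most scenarios. es uses the sort parameter to define the order of the results. When only a field name is given, the order is ascending; descending order has to be requested explicitly: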

      Ascending
      {"sort":["num"]}
      
        Descending (desc means descending order)
        {"sort":[{"num":{"order":"desc"}}]} 
        

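        tips

          es implements field sorting with doc values, a columnar storage format.
          text fields do not create doc values by default, so they cannot be used for sorting.
          Setting fielddata=true on a text field enables sorting on it, but this is not recommended: sorting on a text field is extremely expensive and is rarely what you actually need.

          Single-field sorting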

          GET class_1/_search
          {
              "sort": [
                  "num"
              ]
          }
          

          Response:

          {
            "took" : 6,
            "timed_out" : false,
            "_shards" : {
              "total" : 3,
              "successful" : 3,
              "skipped" : 0,
              "failed" : 0
            },
            "hits" : {
              "total" : {
                "value" : 11,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "b8fcCoYB090miyjed7YE",
                  "_score" : null,
                  "_source" : {
                    "name" : "I eat apple so haochi1~",
                    "num" : 1
                  },
                  "sort" : [
                    1
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "ccfcCoYB090miyjed7YE",
                  "_score" : null,
                  "_source" : {
                    "name" : "I eat apple so haochi3~",
                    "num" : 1
                  },
                  "sort" : [
                    1
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "cMfcCoYB090miyjed7YE",
                  "_score" : null,
                  "_source" : {
                    "name" : "I eat apple so zhen haochi2~",
                    "num" : 1
                  },
                  "sort" : [
                    1
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "h2Fg-4UBECmbBdQA6VLg",
                  "_score" : null,
                  "_source" : {
                    "name" : "b",
                    "num" : 6
                  },
                  "sort" : [
                    6
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "1",
                  "_score" : null,
                  "_source" : {
                    "name" : "l",
                    "num" : 6
                  },
                  "sort" : [
                    6
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "3",
                  "_score" : null,
                  "_source" : {
                    "num" : 9,
                    "name" : "e",
                    "age" : 9,
                    "desc" : [
                      "hhhh"
                    ]
                  },
                  "sort" : [
                    9
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "4",
                  "_score" : null,
                  "_source" : {
                    "name" : "f",
                    "age" : 10,
                    "num" : 10
                  },
                  "sort" : [
                    10
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "RWlfBIUBDuA8yW5cu9wu",
                  "_score" : null,
                  "_source" : {
                    "name" : "一年级",
                    "num" : 20
                  },
                  "sort" : [
                    20
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "iGFt-4UBECmbBdQAnVJe",
                  "_score" : null,
                  "_source" : {
                    "name" : "g",
                    "age" : 8
                  },
                  "sort" : [
                    9223372036854775807
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "iWFt-4UBECmbBdQAnVJg",
                  "_score" : null,
                  "_source" : {
                    "name" : "h",
                    "age" : 9
                  },
                  "sort" : [
                    9223372036854775807
                  ]
                }
              ]
            }
          }
          

          The results are sorted by num in ascending order, which is the default.

          Now in descending order:

          GET class_1/_search
          {
              "sort": [
                  {"num": {"order":"desc"}}
              ]
          }
          

          Response:

          {
            "took" : 15,
            "timed_out" : false,
            "_shards" : {
              "total" : 3,
              "successful" : 3,
              "skipped" : 0,
              "failed" : 0
            },
            "hits" : {
              "total" : {
                "value" : 11,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "RWlfBIUBDuA8yW5cu9wu",
                  "_score" : null,
                  "_source" : {
                    "name" : "一年级",
                    "num" : 20
                  },
                  "sort" : [
                    20
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "4",
                  "_score" : null,
                  "_source" : {
                    "name" : "f",
                    "age" : 10,
                    "num" : 10
                  },
                  "sort" : [
                    10
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "3",
                  "_score" : null,
                  "_source" : {
                    "num" : 9,
                    "name" : "e",
                    "age" : 9,
                    "desc" : [
                      "hhhh"
                    ]
                  },
                  "sort" : [
                    9
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "h2Fg-4UBECmbBdQA6VLg",
                  "_score" : null,
                  "_source" : {
                    "name" : "b",
                    "num" : 6
                  },
                  "sort" : [
                    6
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "1",
                  "_score" : null,
                  "_source" : {
                    "name" : "l",
                    "num" : 6
                  },
                  "sort" : [
                    6
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "b8fcCoYB090miyjed7YE",
                  "_score" : null,
                  "_source" : {
                    "name" : "I eat apple so haochi1~",
                    "num" : 1
                  },
                  "sort" : [
                    1
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "ccfcCoYB090miyjed7YE",
                  "_score" : null,
                  "_source" : {
                    "name" : "I eat apple so haochi3~",
                    "num" : 1
                  },
                  "sort" : [
                    1
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "cMfcCoYB090miyjed7YE",
                  "_score" : null,
                  "_source" : {
                    "name" : "I eat apple so zhen haochi2~",
                    "num" : 1
                  },
                  "sort" : [
                    1
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "iGFt-4UBECmbBdQAnVJe",
                  "_score" : null,
                  "_source" : {
                    "name" : "g",
                    "age" : 8
                  },
                  "sort" : [
                    -9223372036854775808
                  ]
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "iWFt-4UBECmbBdQAnVJg",
                  "_score" : null,
                  "_source" : {
                    "name" : "h",
                    "age" : 9
                  },
                  "sort" : [
                    -9223372036854775808
                  ]
                }
              ]
            }
          }
          

          The results are now sorted in descending order.

          Multi-field sorting
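One detail in the two responses above is easy to miss: the documents without a num field appear last in both directions, with a sort value of 9223372036854775807 when ascending and -9223372036854775808 when descending. The following Python sketch reproduces that behaviour on a few in-memory documents. It is only a model of what the responses show, not how es is implemented, and the function names are made up for illustration.

```python
LONG_MAX = 2**63 - 1   # 9223372036854775807, seen in the ascending response
LONG_MIN = -2**63      # -9223372036854775808, seen in the descending response


def sort_value(doc, field, order):
    """Sort value of a document; a missing field gets a placeholder
    that pushes the document to the end for the given order."""
    if field in doc:
        return doc[field]
    return LONG_MAX if order == "asc" else LONG_MIN


def sort_docs(docs, field, order="asc"):
    """Sort documents by field, keeping documents without it last."""
    return sorted(
        docs,
        key=lambda doc: sort_value(doc, field, order),
        reverse=(order == "desc"),
    )


docs = [
    {"name": "g", "age": 8},
    {"name": "f", "num": 10},
    {"name": "b", "num": 6},
]
ascending = [doc["name"] for doc in sort_docs(docs, "num", "asc")]
descending = [doc["name"] for doc in sort_docs(docs, "num", "desc")]
```

In es itself, the position of such documents can be changed with the missing option of a sort clause, for example "missing": "_first".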

          GET class_1/_search
          {
              "sort": [
                  "num", "age"
              ]
          }
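When the sort order comes from user input, it can help to build the sort clause in code. Here is a minimal Python sketch that accepts either a bare field name (ascending) or a (field, order) pair. The helper name build_sort is made up for illustration, and the function only produces the request body.

```python
def build_sort(*specs):
    """Build a sort clause from field names and (field, order) pairs."""
    clauses = []
    for spec in specs:
        if isinstance(spec, str):
            # A bare field name sorts in ascending order.
            clauses.append(spec)
            continue
        field, order = spec
        if order not in ("asc", "desc"):
            raise ValueError(f"unknown sort order: {order!r}")
        clauses.append({field: {"order": order}})
    return {"sort": clauses}


same_as_above = build_sort("num", "age")
mixed = build_sort(("num", "desc"), "age")
```

The fields are applied in the order given: documents are compared by num first, and age only breaks ties between documents with the same num.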
          

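          scroll pagination

          Recall the from + size pagination covered earlier. By default, es limits from + size to at most 10000 documents (the index.max_result_window setting). Fetching larger volumes of data in bulk with from + size becomes very expensive.

          Many real workloads, however, involve huge amounts of data, for example when querying system logs. For these cases es offers scrolling through the /_search/scroll endpoint. It works as if a snapshot of the cluster's data (covering every shard) were taken at the moment of the initial query and kept for a while. Once the configured expiry time has passed, the snapshot is deleted when es is idle.

          Note that because the query runs against a snapshot, changes made to the data after the snapshot was created are not visible to that scroll.

          Initialize the snapshot & keep it for 10 minutes

          Example query: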
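To get a feel for why deep from + size paging is costly, here is a rough Python model. It assumes that every shard has to collect from + size candidates and that the coordinating node then merges the candidates of all shards, which is a simplification. The function name paging_cost is made up for illustration, and 10000 is the default limit mentioned above.

```python
MAX_RESULT_WINDOW = 10000  # default limit on from + size


def paging_cost(from_, size, shards):
    """Estimate how many candidates are collected for one page."""
    window = from_ + size
    if window > MAX_RESULT_WINDOW:
        raise ValueError(
            f"from + size = {window} exceeds the limit of {MAX_RESULT_WINDOW}"
        )
    return {"per_shard": window, "coordinator": window * shards}


first_page = paging_cost(0, 10, shards=3)
deep_page = paging_cost(9990, 10, shards=3)
```

Under this model, the first page of 10 hits on a 3-shard index touches 30 candidates, while the page starting at offset 9990 touches 30000 candidates to return the same 10 hits.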

          GET class_1/_search?scroll=10m
          {
            "query": {
              "match_phrase": {
                "name": "apple"
              }
            },
            "size": 2
          }
          

          Response:

          {
            "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
            "took" : 6,
            "timed_out" : false,
            "_shards" : {
              "total" : 3,
              "successful" : 3,
              "skipped" : 0,
              "failed" : 0
            },
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : 0.752627,
              "hits" : [
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "b8fcCoYB090miyjed7YE",
                  "_score" : 0.752627,
                  "_source" : {
                    "name" : "I eat apple so haochi1~",
                    "num" : 1
                  }
                },
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "ccfcCoYB090miyjed7YE",
                  "_score" : 0.752627,
                  "_source" : {
                    "name" : "I eat apple so haochi3~",
                    "num" : 1
                  }
                }
              ]
            }
          }
          

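          As shown, the response contains 2 documents together with a snapshot ID (_scroll_id), which is then used to scroll through the remaining results:

          Scroll using the snapshot ID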

          GET /_search/scroll
          {
           "scroll": "10m", 
           "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw=="
          }
          

          Response:

          {
            "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
            "took" : 6,
            "timed_out" : false,
            "_shards" : {
              "total" : 3,
              "successful" : 3,
              "skipped" : 0,
              "failed" : 0
            },
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : 0.752627,
              "hits" : [
                {
                  "_index" : "class_1",
                  "_type" : "_doc",
                  "_id" : "cMfcCoYB090miyjed7YE",
                  "_score" : 0.7389809,
                  "_source" : {
                    "name" : "I eat apple so zhen haochi2~",
                    "num" : 1
                  }
                }
              ]
            }
          }
          

          Scroll once more:

          {
            "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
            "took" : 1,
            "timed_out" : false,
            "_shards" : {
              "total" : 3,
              "successful" : 3,
              "skipped" : 0,
              "failed" : 0
            },
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : 0.752627,
              "hits" : [ ]
            }
          }
          

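          Some readers may wonder how the scroll advances, since every follow-up request uses the same scroll_id. The responses make it clear:

            First, a snapshot kept for 10 minutes is created with a page size of 2, and the initial request returns 2 documents.
            Scrolling with the scroll_id then returns 1 document, because the snapshot holds only 3 documents and 2 were already returned.
            Scrolling again returns an empty list, because all the data has been read.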
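The walkthrough above boils down to a simple loop: keep requesting the next page until a page comes back empty. Here is a minimal Python sketch of that loop. The HTTP call is deliberately left out: fetch_next is a caller-supplied function (an assumption of this sketch) that would perform the GET /_search/scroll request and return the parsed response. The sample pages and their shortened IDs are made up for the demonstration.

```python
def scroll_all(first_page, fetch_next):
    """Yield every hit, starting from the initial search response.

    fetch_next(scroll_id) must return the next parsed scroll response.
    """
    page = first_page
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        # Always pass on the most recently returned _scroll_id. It stayed
        # the same in the walkthrough, but it is allowed to change.
        page = fetch_next(page["_scroll_id"])


# Simulated responses: 2 hits, then 1 hit, then an empty page.
first = {"_scroll_id": "abc", "hits": {"hits": [{"_id": "doc1"}, {"_id": "doc2"}]}}
later_pages = iter([
    {"_scroll_id": "abc", "hits": {"hits": [{"_id": "doc3"}]}},
    {"_scroll_id": "abc", "hits": {"hits": []}},
])

ids = [hit["_id"] for hit in scroll_all(first, lambda scroll_id: next(later_pages))]
```

The loop stops on the first empty page, which matches the third response in the walkthrough. In a real client it is also worth releasing the snapshot early with DELETE /_search/scroll once the loop is done, rather than waiting for the 10 minutes to expire.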
