How do I count documents by unique nested field values in Elasticsearch?

First, collapse can only be used in the _search context, not in _count.

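Second, what is your query even supposed to do? It carries a lot of redundant parameters, such as boost:1 and so on. You might as well have written: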

POST /package/_count?pretty
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "attachment",
            "query": {
              "match_all": {}
            }
          }
        }
      ]
    }
  }
}

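which doesn't really filter anything: it simply counts every document that has at least one nested attachment :)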

To answer your original question, "counting documents by unique nested field values":

Suppose there are 3 documents, 2 of which share the same attachment.uuid value:

[
  {
    "attachment": {
      "uuid": "04144e14-62c3-11ea-bc55-0242ac130003"
    }
  },
  {
    "attachment": {
      "uuid": "04144e14-62c3-11ea-bc55-0242ac130003"
    }
  },
  {
    "attachment": {
      "uuid": "100b9632-62c3-11ea-bc55-0242ac130003"
    }
  }
]

To get a terms breakdown of the uuids, run

GET package/_search
{
  "size": 0,
  "aggs": {
    "nested_uniques": {
      "nested": {
        "path": "attachment"
      },
      "aggs": {
        "subagg": {
          "terms": {
            "field": "attachment.uuid"
          }
        }
      }
    }
  }
}

which yields

...
{
  "aggregations": {
    "nested_uniques": {
      "doc_count": 3,
      "subagg": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "04144e14-62c3-11ea-bc55-0242ac130003",
            "doc_count": 2
          },
          {
            "key": "100b9632-62c3-11ea-bc55-0242ac130003",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
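The bucket counts above can be sanity-checked outside Elasticsearch. Here is a minimal Python sketch, assuming the three sample documents sit in a plain list; the names docs and terms_breakdown are made up for illustration and are not part of any Elasticsearch API:

```python
from collections import Counter

# The three sample documents from above, held in memory.
docs = [
    {"attachment": {"uuid": "04144e14-62c3-11ea-bc55-0242ac130003"}},
    {"attachment": {"uuid": "04144e14-62c3-11ea-bc55-0242ac130003"}},
    {"attachment": {"uuid": "100b9632-62c3-11ea-bc55-0242ac130003"}},
]


def terms_breakdown(documents):
    """Count how often each attachment.uuid occurs, one bucket per uuid."""
    return Counter(
        d["attachment"]["uuid"]
        for d in documents
        if "uuid" in d.get("attachment", {})
    )


buckets = terms_breakdown(docs)
for key, doc_count in buckets.most_common():
    print(key, doc_count)
```

The number of distinct keys in buckets is the figure the rest of this answer is after.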

To get the number of unique nested field values, rather than one bucket per value, we have to get a bit more clever:

GET package/_search
{
  "size": 0,
  "aggs": {
    "nested_uniques": {
      "nested": {
        "path": "attachment"
      },
      "aggs": {
        "scripted_uniques": {
          "scripted_metric": {
            "init_script": "state.my_map = [:];",
            "map_script": """
              if (doc.containsKey('attachment.uuid')) {
                state.my_map[doc['attachment.uuid'].value.toString()] = 1;
              }
            """,
            "combine_script": """
              def sum = 0;
              for (c in state.my_map.entrySet()) {
                sum += 1
              }
              return sum
            """,
            "reduce_script": """
              def sum = 0;
              for (agg in states) {
                sum += agg;
              }
              return sum;
            """
          }
        }
      }
    }
  }
}

which returns

...
{
  "aggregations": {
    "nested_uniques": {
      "doc_count": 3,
      "scripted_uniques": {
        "value": 2
      }
    }
  }
}

and that scripted_uniques: 2 is exactly what you're after.
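To make the init / map / combine / reduce flow of the scripted metric easier to follow, here is a rough Python simulation. It is only a sketch: the function names and the shard layout are invented for illustration, it assumes one state per shard, and it is not how Elasticsearch actually executes Painless.

```python
A = "04144e14-62c3-11ea-bc55-0242ac130003"
B = "100b9632-62c3-11ea-bc55-0242ac130003"


def init_state():
    # Mirrors init_script: state.my_map = [:]
    return {"my_map": {}}


def map_doc(state, nested_doc):
    # Mirrors map_script: remember each uuid seen on this shard.
    uuid = nested_doc.get("uuid")
    if uuid is not None:
        state["my_map"][str(uuid)] = 1


def combine(state):
    # Mirrors combine_script: number of distinct uuids on this shard.
    return len(state["my_map"])


def reduce_states(states):
    # Mirrors reduce_script: add up the per-shard counts.
    return sum(states)


def scripted_uniques(shards):
    """Run the four phases over a list of shards (each a list of nested docs)."""
    per_shard = []
    for shard_docs in shards:
        state = init_state()
        for nested_doc in shard_docs:
            map_doc(state, nested_doc)
        per_shard.append(combine(state))
    return reduce_states(per_shard)


# All three nested docs on one shard: the distinct count is exact.
one_shard = [[{"uuid": A}, {"uuid": A}, {"uuid": B}]]
# Hypothetical layout where uuid A lives on two shards: A is counted twice.
two_shards = [[{"uuid": A}, {"uuid": B}], [{"uuid": A}]]

print(scripted_uniques(one_shard))
print(scripted_uniques(two_shards))
```

Because the reduce step sums per-shard counts, the result equals the true distinct count only when each uuid lives on a single shard, for example in a single-shard index. If that does not hold, the combine step would need to return the keys themselves so that the reduce step can merge them before counting.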

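Note: I solved this use case with a scripted_metric aggregation inside a nested aggregation, but if you know a cleaner approach, I'd be more than happy to learn!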

Source: http://outofmemory.cn/zaji/4903237.html