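Besides the query DSL, aggregations are an important part of ES. If the query DSL plays the role of SQL, then aggregations play the role of GROUP BY together with functions such as SUM and COUNT.
What aggregations can do
Aggregations fall into three main categories: bucket aggregations, metric aggregations, and pipeline aggregations. When writing a request, aggregations can be abbreviated to aggs.
Bucket aggregations (bucket): the equivalent of SQL's GROUP BY. They split the data into buckets (groups) according to one or more conditions. By default each bucket only reports its document count, but the point is that the data has been partitioned, which prepares it for the aggregation or statistics operations that follow. Concrete examples come later.
Metric aggregations (metric): calculations over bucketed or unbucketed data, for example avg (average), max (maximum), min (minimum), value_count (count), cardinality (distinct count), and stats (summary statistics).
Pipeline aggregations (pipeline): queries that work on the results of other aggregations. Buckets can be several levels deep and can be nested together with metrics; a pipeline aggregation uses a path (built from the names given to the bucket and metric aggregations) to pick out exactly the results it should operate on, for example to sort them.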
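The three categories can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how ES implements it; the rows and the variable names are made up for the example.

```python
from collections import defaultdict

# Illustrative (type, price) rows; hypothetical values, not tied to any index.
rows = [("phone", 3000), ("phone", 5000), ("headset", 400), ("headset", 1000)]

# 1. Bucket aggregation: group rows by a key (like GROUP BY or `terms`).
buckets = defaultdict(list)
for kind, price in rows:
    buckets[kind].append(price)

# 2. Metric aggregation: compute one number per bucket (like `avg`).
avg_per_bucket = {kind: sum(p) / len(p) for kind, p in buckets.items()}

# 3. Pipeline aggregation: operate on the metric results (like `min_bucket`).
smallest_avg = min(avg_per_bucket.values())

print(avg_per_bucket)  # {'phone': 4000.0, 'headset': 700.0}
print(smallest_avg)    # 700.0
```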
How to use aggregations
First insert some data into ES:
PUT /product/_doc/1
{
  "name": "小米手机",
  "desc": "手机中的战斗机",
  "price": 3999,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-10-01T08:00:00Z",
  "tags": ["性价比", "发烧", "不卡顿"]
}

PUT /product/_doc/2
{
  "name": "小米NFC手机",
  "desc": "支持全功能NFC,手机中的滑翔机",
  "price": 4999,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-05-21T08:00:00Z",
  "tags": ["性价比", "发烧", "公交卡"]
}

PUT /product/_doc/3
{
  "name": "NFC手机",
  "desc": "手机中的轰炸机",
  "price": 2999,
  "lv": "高端机",
  "type": "手机",
  "createtime": "2020-06-20",
  "tags": ["性价比", "快充", "门禁卡"]
}

PUT /product/_doc/4
{
  "name": "小米耳机",
  "desc": "耳机中的黄焖鸡",
  "price": 999,
  "lv": "百元机",
  "type": "耳机",
  "createtime": "2020-06-23",
  "tags": ["降噪", "防水", "蓝牙"]
}

PUT /product/_doc/5
{
  "name": "红米耳机",
  "desc": "耳机中的肯德基",
  "price": 399,
  "type": "耳机",
  "lv": "百元机",
  "createtime": "2020-07-20",
  "tags": ["防火", "低音炮", "听声辨位"]
}

PUT /product/_doc/6
{
  "name": "小米手机12",
  "desc": "充电贼快掉电更快,超级无敌望远镜,高刷电竞屏",
  "price": 5999,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-07-27",
  "tags": ["120HZ刷新率", "120W快充", "120倍变焦"]
}

PUT /product/_doc/7
{
  "name": "挨炮 SE2",
  "desc": "除了CPU,一无是处",
  "price": 3299,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-07-21",
  "tags": ["割韭菜", "割韭菜", "割新韭菜"]
}

PUT /product/_doc/8
{
  "name": "XS Max",
  "desc": "听说要出新款15手机了,终于可以换掉手中的4S了",
  "price": 4399,
  "lv": "旗舰机",
  "type": "手机",
  "createtime": "2020-08-19",
  "tags": ["5V1A", "4G全网通", "大"]
}

PUT /product/_doc/9
{
  "name": "小米电视",
  "desc": "70寸性价比只选,不要一万八,要不要八千八,只要两千九百九十八",
  "price": 2998,
  "lv": "高端机",
  "type": "电视",
  "createtime": "2020-08-16",
  "tags": ["巨馍", "家庭影院", "游戏"]
}

PUT /product/_doc/10
{
  "name": "红米电视",
  "desc": "我比上边那个更划算,我也2998,我也70寸,但是我更好看",
  "price": 2999,
  "type": "电视",
  "lv": "高端机",
  "createtime": "2020-08-28",
  "tags": ["大片", "蓝光8K", "超薄"]
}

PUT /product/_doc/11
{
  "name": "红米电视",
  "desc": "我比上边那个更划算,我也2998,我也70寸,但是我更好看",
  "price": "2998",
  "type": "电视",
  "lv": "高端机",
  "createtime": "2020-08-28",
  "tags": ["大片", "蓝光8K", "超薄"]
}
A simple bucket aggregation example
GET product/_search
{
  "size": 0,
  "aggs": {
    "tagtest": {                  // name given to this aggregation
      "terms": {                  // bucketing method
        "field": "tags.keyword",  // bucket by tags; .keyword means the value is not analyzed and is used as-is
        "size": 10
      }
    }
  }
}
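To make the meaning of the returned counts concrete, here is a small plain-Python sketch of what a terms bucket count represents. It assumes the usual doc_count semantics (each bucket counts documents, so a tag repeated inside one document counts once); `terms_doc_counts` is a hypothetical helper and the tags come from sample documents 1, 2 and 7.

```python
from collections import Counter

# Tags of three of the sample documents (ids 1, 2 and 7).
docs_tags = [
    ["性价比", "发烧", "不卡顿"],
    ["性价比", "发烧", "公交卡"],
    ["割韭菜", "割韭菜", "割新韭菜"],
]

def terms_doc_counts(docs):
    """Count how many documents contain each tag (doc_count semantics)."""
    counts = Counter()
    for tags in docs:
        counts.update(set(tags))  # set(): a tag repeated in one doc counts once
    return counts

counts = terms_doc_counts(docs_tags)
print(counts["性价比"])  # 2
print(counts["割韭菜"])  # 1
```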
A simple metric aggregation example
GET product/_search
{
  "size": 0,
  "aggs": {
    "max": {
      "max": { "field": "price" }
    },
    "min": {
      "min": { "field": "price" }
    },
    "avg": {
      "avg": { "field": "price" }
    }
  }
}

GET product/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "stats": { "field": "price" }
    }
  }
}
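As a cross-check on what stats should report for the sample data, the same numbers can be computed by hand. This is a sketch in plain Python, not an ES call; it assumes the string price "2998" of document 11 is coerced to a number, and `stats` is a hypothetical helper named after the aggregation.

```python
# Prices of the 11 sample documents. Document 11 stores its price as the
# string "2998"; converting it here is an assumption about numeric coercion.
raw_prices = [3999, 4999, 2999, 999, 399, 5999, 3299, 4399, 2998, 2999, "2998"]

def stats(values):
    """Return count, min, max, avg and sum, like the `stats` aggregation."""
    nums = [float(v) for v in values]
    total = sum(nums)
    return {
        "count": len(nums),
        "min": min(nums),
        "max": max(nums),
        "avg": total / len(nums),
        "sum": total,
    }

result = stats(raw_prices)
print(result["count"], result["min"], result["max"], result["sum"])
# 11 399.0 5999.0 36087.0
print(round(result["avg"], 2))  # 3280.64
```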
Pipeline aggregation example
Group by type, compute the average price of each group, then take the minimum of those averages.
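The computation just described (average per type, then the smallest average) can be sketched in plain Python on the sample prices. The helper names `avg_by_type` and `min_bucket` are hypothetical and only mirror the aggregation names; this is not an ES client call.

```python
from collections import defaultdict

# (type, price) for the 11 sample documents.
products = [
    ("手机", 3999), ("手机", 4999), ("手机", 2999), ("耳机", 999),
    ("耳机", 399), ("手机", 5999), ("手机", 3299), ("手机", 4399),
    ("电视", 2998), ("电视", 2999), ("电视", 2998),
]

def avg_by_type(rows):
    """Bucket by type, then compute the average price of each bucket."""
    grouped = defaultdict(list)
    for kind, price in rows:
        grouped[kind].append(price)
    return {kind: sum(v) / len(v) for kind, v in grouped.items()}

def min_bucket(averages):
    """Return (key, value) of the bucket with the smallest average."""
    key = min(averages, key=averages.get)
    return key, averages[key]

averages = avg_by_type(products)
print(min_bucket(averages))  # ('耳机', 699.0)
```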
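Nested aggregations
Bucket the products by type and, nested inside that, by lv (level); use avg to compute the average price of each inner bucket; then use a pipeline aggregation to find the minimum of those per-bucket averages.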
GET product/_search
{
  "size": 0,
  "aggs": {
    "type_lv": {
      "terms": { "field": "type.keyword" },
      "aggs": {
        "lv": {
          "terms": { "field": "lv.keyword" },
          "aggs": {
            "price_avg": {
              "avg": { "field": "price" }
            }
          }
        },
        "price_min": {
          "min_bucket": { "buckets_path": "lv>price_avg" }
        }
      }
    }
  }
}
Aggregations based on query results
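1. A query or filter can be added at the same level as aggs to restrict the data that gets bucketed.
For example, add a condition that limits the price range, then bucket by type: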
GET product/_search
{
  "query": {
    "range": {
      "price": { "gte": 2000, "lte": 6000 }
    }
  },
  "aggs": {
    "type_lv": {
      "terms": { "field": "type.keyword" }
    }
  }
}
Filter the data with a filter clause first, then bucket it:
GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "price": { "gte": 10, "lte": 2000 }
          }
        }
      ]
    }
  },
  "aggs": {
    "type_lv": {
      "terms": { "field": "type.keyword" }
    }
  }
}
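The effect of the two requests above (restrict by price first, then bucket by type) can be checked against the sample data with a short plain-Python sketch. `count_types_in_range` is a hypothetical helper, and the ranges are inclusive on both ends to match gte and lte.

```python
from collections import Counter

# (type, price) for the 11 sample documents.
products = [
    ("手机", 3999), ("手机", 4999), ("手机", 2999), ("耳机", 999),
    ("耳机", 399), ("手机", 5999), ("手机", 3299), ("手机", 4399),
    ("电视", 2998), ("电视", 2999), ("电视", 2998),
]

def count_types_in_range(rows, low, high):
    """Keep rows with low <= price <= high, then count documents per type."""
    kept = [kind for kind, price in rows if low <= price <= high]
    return dict(Counter(kept))

print(count_types_in_range(products, 2000, 6000))  # {'手机': 6, '电视': 3}
print(count_types_in_range(products, 10, 2000))    # {'耳机': 2}
```

Types with no matching document simply produce no bucket, which is why 耳机 is absent from the first result.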
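Queries based on aggregation results (filtering the bucketed data after bucketing)
For some of the aggregations, lift the restriction imposed by the query or filter.
The global aggregation ignores the query conditions above it.
This is useful for multi-dimensional statistics where some figures must be computed on the filtered data and others on all of it.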
GET product/_search
{
  "size": 0,
  "query": {
    "range": {
      "price": { "gte": 1000 }
    }
  },
  "aggs": {
    "max": {
      "max": { "field": "price" }
    },
    "min": {
      "min": { "field": "price" }
    },
    "avg": {
      "global": {},
      "aggs": {
        "price_avg": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
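What the request above computes can be sketched in plain Python: max and min see only the documents matched by the query, while the average under global sees every document. `search_with_global` is a hypothetical helper; the prices are those of the sample data, with the string price of document 11 written as a number.

```python
# (type, price) for the 11 sample documents.
products = [
    ("手机", 3999), ("手机", 4999), ("手机", 2999), ("耳机", 999),
    ("耳机", 399), ("手机", 5999), ("手机", 3299), ("手机", 4399),
    ("电视", 2998), ("电视", 2999), ("电视", 2998),
]

def search_with_global(rows, min_price):
    hits = [price for _, price in rows if price >= min_price]  # query scope
    everything = [price for _, price in rows]                  # global scope
    return {
        "max": max(hits),
        "min": min(hits),
        "global_avg": sum(everything) / len(everything),
    }

result = search_with_global(products, 1000)
print(result["max"], result["min"])    # 5999 2998
print(round(result["global_avg"], 2))  # 3280.64
```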
Different metric aggregations in one request: some computed on filtered data, some on the full data set.
GET product/_search
{
  "size": 0,
  "aggs": {
    "max": {
      "max": { "field": "price" }
    },
    "min": {
      "min": { "field": "price" }
    },
    "avg": {
      "filter": {
        "range": {
          "price": { "gte": 1000 }
        }
      },
      "aggs": {
        "price_avg": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
Sorting based on aggregations
Sort by count
GET product/_search
{
  "size": 0,
  "aggs": {
    "tag_bucket": {
      "terms": {
        "field": "tags.keyword",
        "size": 10,
        "order": { "_count": "asc" }
      }
    }
  }
}
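The ordering requested above can be illustrated with a plain-Python sketch that counts documents per tag and sorts the buckets by ascending count. `terms_ordered_by_count` is a hypothetical helper; breaking ties by key is an assumption made here only so that the output is deterministic.

```python
from collections import Counter

# Tags of sample documents 1, 2 and 3.
docs_tags = [
    ["性价比", "发烧", "不卡顿"],
    ["性价比", "发烧", "公交卡"],
    ["性价比", "快充", "门禁卡"],
]

def terms_ordered_by_count(docs, size=10):
    """Return up to `size` (tag, doc_count) pairs, least frequent first."""
    counts = Counter()
    for tags in docs:
        counts.update(set(tags))
    # Ascending by count; ties broken by key (assumption, for determinism).
    ordered = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return ordered[:size]

buckets = terms_ordered_by_count(docs_tags)
print(buckets[-2:])  # [('发烧', 2), ('性价比', 3)]
```

With ascending order and a size limit, the least frequent tags are the ones that are kept, so the most common tags appear last or not at all.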
Multi-level sorting
GET product/_search
{
  "size": 0,
  "aggs": {
    "type_first_order": {
      "terms": {
        "field": "type.keyword",
        "order": { "_key": "asc" }
      },
      "aggs": {
        "lv_second_order": {
          "terms": {
            "field": "lv.keyword",
            "order": { "_key": "asc" }
          }
        }
      }
    }
  }
}
Multi-level aggregation (sorting buckets by a metric from a nested aggregation)
GET product/_search
{
  "size": 0,
  "aggs": {
    "type_stats_price": {
      "terms": {
        "field": "type.keyword",
        "order": { "aggs_price>stats.sum": "asc" }
      },
      "aggs": {
        "aggs_price": {
          "filter": {
            "terms": {
              "type.keyword": ["耳机", "手机", "电视"]
            }
          },
          "aggs": {
            "stats": {
              "stats": { "field": "price" }
            }
          }
        }
      }
    }
  }
}
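The order path aggs_price>stats.sum sorts the type buckets by the sum of their prices. A plain-Python sketch of that ordering on the sample data follows; `types_ordered_by_price_sum` is a hypothetical helper, and the inner filter is omitted because it lists all three types and so excludes nothing.

```python
from collections import defaultdict

# (type, price) for the 11 sample documents.
products = [
    ("手机", 3999), ("手机", 4999), ("手机", 2999), ("耳机", 999),
    ("耳机", 399), ("手机", 5999), ("手机", 3299), ("手机", 4399),
    ("电视", 2998), ("电视", 2999), ("电视", 2998),
]

def types_ordered_by_price_sum(rows):
    """Bucket by type, sum the prices, and sort buckets by that sum ascending."""
    sums = defaultdict(int)
    for kind, price in rows:
        sums[kind] += price
    return sorted(sums.items(), key=lambda kv: kv[1])

print(types_ordered_by_price_sum(products))
# [('耳机', 1398), ('电视', 8995), ('手机', 25694)]
```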