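Introduction
A Search Template decouples the query definition from the code that calls it. An Index Alias hides the real index name behind a stable one, which gives both encapsulation and decoupling. The Suggest API breaks the input text into terms, looks up similar terms in an indexed field, and returns them.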
Search Template
An example follows. It runs a match_phrase query against the title field, with q as the parameter:
POST /_scripts/movies
{
  "script": {
    "lang": "mustache",
    "source": {
      "_source": [ "title" ],
      "size": 20,
      "query": {
        "bool": {
          "must": {
            "match_phrase": { "title": "{{q}}" }
          }
        }
      }
    }
  }
}
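To make the role of the {{q}} placeholder concrete, here is a minimal Python sketch of what rendering the template amounts to. It is only an illustration under simplifying assumptions: Elasticsearch renders the template with Mustache on the server, whereas the hypothetical render helper below does a plain string substitution, and the value "Safe Passage" is just a sample parameter.

```python
import json

# The query body stored by the template above, with {{q}} as the placeholder.
TEMPLATE = {
    "_source": ["title"],
    "size": 20,
    "query": {"bool": {"must": {"match_phrase": {"title": "{{q}}"}}}},
}


def render(template, params):
    """Hypothetical helper: substitute {{name}} placeholders with params.

    This is NOT the Mustache engine Elasticsearch uses; it only handles
    simple string placeholders.
    """
    text = json.dumps(template)
    for name, value in params.items():
        # json.dumps(...)[1:-1] escapes the value for use inside a JSON string.
        escaped = json.dumps(str(value))[1:-1]
        text = text.replace("{{" + name + "}}", escaped)
    return json.loads(text)


rendered = render(TEMPLATE, {"q": "Safe Passage"})
print(json.dumps(rendered, indent=2))
```

The caller only supplies the parameter values; the shape of the query stays in one place and can change without touching the calling code.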
To use it, pass the parameters to the stored template, referenced by its id:
POST movies/_search/template
{
  "id": "movies",
  "params": {
    "q": "Safe Passage"
  }
}
Index Alias
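The example below gives an index the alias movies-today and attaches a filter, so that only documents whose rating field is at least 10 are visible through the alias: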
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "movies-2020-07-13",
        "alias": "movies-today",
        "filter": {
          "range": {
            "rating": { "gte": 10 }
          }
        }
      }
    }
  ]
}
Index some test documents into movies-2020-07-13 beforehand (the index must exist before the alias can be added):
POST movies-2020-07-13/_doc/1
{
  "name": "n1",
  "rating": 11
}

POST movies-2020-07-13/_doc/2
{
  "name": "n2",
  "rating": 9
}
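With these two documents, the alias filter should hide document 2, whose rating is below 10. The following Python sketch models that filtering locally. It is an illustration only: the matches_range helper is hypothetical and handles just the range bounds, while the real filtering happens inside Elasticsearch.

```python
# The two test documents, keyed by _id.
docs = {
    "1": {"name": "n1", "rating": 11},
    "2": {"name": "n2", "rating": 9},
}

# The filter attached to the movies-today alias.
alias_filter = {"range": {"rating": {"gte": 10}}}


def matches_range(doc, range_filter):
    """Hypothetical helper: check a document against a range filter."""
    for field, bounds in range_filter["range"].items():
        value = doc.get(field)
        if value is None:
            return False
        if "gte" in bounds and not value >= bounds["gte"]:
            return False
        if "gt" in bounds and not value > bounds["gt"]:
            return False
        if "lte" in bounds and not value <= bounds["lte"]:
            return False
        if "lt" in bounds and not value < bounds["lt"]:
            return False
    return True


# Documents that a search through the alias would be able to see.
visible = {doc_id: doc for doc_id, doc in docs.items()
           if matches_range(doc, alias_filter)}
print(visible)
```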
Then simply query the alias:
POST movies-today/_search
{
  "query": { "match_all": {} }
}
Suggest API
ES 7 provides four suggesters: the Term and Phrase Suggesters, and the Completion and Context Suggesters.
Term Suggester
First, index some test data:
POST _bulk
{"index": {"_index": "article", "_id": 1}}
{"body": "lucene is very cool"}
{"index": {"_index": "article", "_id": 2}}
{"body": "ElasticSearch is built on top of lucene"}
{"index": {"_index": "article", "_id": 3}}
{"body": "ElasticSearch rocks"}
{"index": {"_index": "article", "_id": 4}}
{"body": "Elastic is the corporation of ELK stack"}
{"index": {"_index": "article", "_id": 5}}
{"body": "ELK stack rocks"}
{"index": {"_index": "article", "_id": 6}}
{"body": "Elastic is rock solid"}
Then write the request body and add a suggester to it. Here a term suggester in missing mode is applied to the text luece rock:
POST article/_search
{
  "size": 20,
  "query": { "match": { "body": "luece rock" } },
  "suggest": {
    "term-suggestion": {
      "text": "luece rock",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}
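There are three suggest modes: missing (suggest only for terms that are not in the index), popular (suggest only terms that occur in more documents than the original term), and always (suggest whether or not the term is in the index). Because rock is already in the index, missing mode returns no options for it, and the suggest part of the response looks like this:
"suggest" : {
  "term-suggestion" : [
    {
      "text" : "luece",
      "offset" : 0,
      "length" : 5,
      "options" : [
        { "text" : "lucene", "score" : 0.6, "freq" : 4 }
      ]
    },
    {
      "text" : "rock",
      "offset" : 6,
      "length" : 4,
      "options" : [ ]
    }
  ]
}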
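If rock is changed to hock, however, no suggestion is returned for it either: prefix_length defaults to 1, so a candidate must share its first character with the input term. Setting prefix_length to 0 lifts that restriction: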
POST article/_search
{
  "size": 20,
  "query": { "match": { "body": "luece builf hock" } },
  "suggest": {
    "term-suggestion": {
      "text": "luece builf hock",
      "term": {
        "suggest_mode": "missing",
        "field": "body",
        "prefix_length": 0
      }
    }
  }
}
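The interplay of suggest_mode and prefix_length can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Elasticsearch's implementation: terms are produced by lowercasing and splitting on whitespace, candidates are ranked by plain Levenshtein distance, and neither the real scoring nor the accuracy cutoff is reproduced. The function names are hypothetical.

```python
from collections import Counter

# The six test bodies from the bulk request.
DOCS = [
    "lucene is very cool",
    "ElasticSearch is built on top of lucene",
    "ElasticSearch rocks",
    "Elastic is the corporation of ELK stack",
    "ELK stack rocks",
    "Elastic is rock solid",
]

# Document frequency of each term (assumption: lowercase + whitespace split).
doc_freq = Counter()
for body in DOCS:
    doc_freq.update(set(body.lower().split()))


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def suggest(term, mode="missing", prefix_length=1, max_edits=2):
    """Hypothetical model of a term suggester for a single term."""
    term = term.lower()
    own_freq = doc_freq.get(term, 0)
    if mode == "missing" and own_freq > 0:
        return []  # the term exists in the index, so nothing is suggested
    found = []
    for cand, freq in doc_freq.items():
        if cand == term or cand[:prefix_length] != term[:prefix_length]:
            continue  # the candidate must share the required prefix
        dist = edit_distance(term, cand)
        if dist > max_edits:
            continue
        if mode == "popular" and freq <= own_freq:
            continue  # popular keeps only more frequent terms
        found.append((dist, -freq, cand))
    return [cand for _, _, cand in sorted(found)]


print(suggest("luece"))
print(suggest("rock"))
print(suggest("hock"))
print(suggest("hock", prefix_length=0))
```

In this model hock yields nothing while the first character has to match, and rock becomes the closest candidate once prefix_length is 0.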
For the request with prefix_length set to 0, the suggest part of the response is:
"suggest" : {
  "term-suggestion" : [
    {
      "text" : "luece",
      "offset" : 0,
      "length" : 5,
      "options" : [
        { "text" : "lucene", "score" : 0.6, "freq" : 2 }
      ]
    },
    {
      "text" : "builf",
      "offset" : 6,
      "length" : 5,
      "options" : [
        { "text" : "built", "score" : 0.8, "freq" : 1 }
      ]
    },
    {
      "text" : "hock",
      "offset" : 12,
      "length" : 4,
      "options" : [
        { "text" : "rock", "score" : 0.75, "freq" : 1 }
      ]
    }
  ]
}
Phrase Suggester
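The phrase suggester adds logic on top of the term suggester. For example, max_errors limits how many terms in the phrase may be treated as misspellings, and confidence sets the score threshold a suggestion must pass (the higher the value, the fewer suggestions are returned). Highlighting can also be enabled by specifying the highlight tags: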
POST article/_search
{
  "suggest": {
    "my_suggestion": {
      "text": "lucne and elasticsear rodk very well",
      "phrase": {
        "field": "body",
        "max_errors": 3,
        "confidence": 1,
        "direct_generator": [
          { "field": "body", "suggest_mode": "missing" }
        ],
        "highlight": {
          "pre_tag": "",
          "post_tag": ""
        }
      }
    }
  }
}
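As a rough illustration of how confidence acts as a threshold, the Python sketch below keeps only candidates that score higher than confidence times the score of the input phrase. The candidate scores are the ones Elasticsearch reports for this request; the input phrase score is not part of the response, so HYPOTHETICAL_INPUT_SCORE is an assumed value chosen purely for demonstration, and apply_confidence is a hypothetical helper.

```python
# Candidate phrases with the scores reported for the request above.
CANDIDATES = [
    ("lucene and elasticsearch rock very well", 1.6991e-4),
    ("lucene and elasticsearch rocks very well", 1.6991e-4),
    ("lucene and elasticsearch rodk very well", 1.393378e-4),
]

# ASSUMPTION: the score of the original input phrase is not shown in the
# response; this value is made up for the sake of the example.
HYPOTHETICAL_INPUT_SCORE = 1.0e-4


def apply_confidence(candidates, input_score, confidence=1.0):
    """Hypothetical helper: keep candidates scoring above the threshold."""
    threshold = confidence * input_score
    return [text for text, score in candidates if score > threshold]


for confidence in (1.0, 1.5, 2.0):
    kept = apply_confidence(CANDIDATES, HYPOTHETICAL_INPUT_SCORE, confidence)
    print(confidence, len(kept))
```

Under this assumed input score, raising confidence from 1.0 to 2.0 shrinks the list of suggestions step by step, which matches the description of the parameter.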
The suggest part of the response to the phrase suggester request is:
"suggest" : {
  "my_suggestion" : [
    {
      "text" : "lucne and elasticsear rodk very well",
      "offset" : 0,
      "length" : 36,
      "options" : [
        {
          "text" : "lucene and elasticsearch rock very well",
          "highlighted" : "lucene and elasticsearch rock very well",
          "score" : 1.6991E-4
        },
        {
          "text" : "lucene and elasticsearch rocks very well",
          "highlighted" : "lucene and elasticsearch rocks very well",
          "score" : 1.6991E-4
        },
        {
          "text" : "lucene and elasticsearch rodk very well",
          "highlighted" : "lucene and elasticsearch rodk very well",
          "score" : 1.393378E-4
        }
      ]
    }
  ]
}
Completion Suggester
The completion suggester provides autocomplete functionality.
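Conceptually, autocomplete is a prefix lookup over the indexed inputs. The Python sketch below models that with a sorted list and binary search. This is a swapped-in technique for illustration: Elasticsearch keeps completion fields in a specialised in-memory structure rather than a sorted list, and the sketch assumes the six sample bodies from the Term Suggester section are the indexed inputs, lowercased. The complete function is hypothetical.

```python
from bisect import bisect_left

# ASSUMPTION: the six sample bodies are indexed as completion inputs.
BODIES = [
    "lucene is very cool",
    "ElasticSearch is built on top of lucene",
    "ElasticSearch rocks",
    "Elastic is the corporation of ELK stack",
    "ELK stack rocks",
    "Elastic is rock solid",
]

# Sorted, lowercased inputs so that entries sharing a prefix are adjacent.
KEYS = sorted(body.lower() for body in BODIES)


def complete(prefix):
    """Hypothetical helper: return all inputs that start with the prefix."""
    prefix = prefix.lower()
    start = bisect_left(KEYS, prefix)  # first key that is >= prefix
    matches = []
    for key in KEYS[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches


print(complete("e"))
print(complete("luc"))
```

Note that the prefix is matched against the start of the whole input, not against individual words inside it.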
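Before using it, define a mapping that marks the field to complete as type completion. The type of an existing field cannot be changed, so if the article index from the previous sections still exists, delete it first: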
PUT article
{
  "mappings": {
    "properties": {
      "body": {
        "type": "completion"
      }
    }
  }
}
Then index the data and run a completion query, giving the prefix and the field to complete:
POST article/_search
{
  "suggest": {
    "YOUR_SUGGESTION": {
      "prefix": "e",
      "completion": {
        "field": "body"
      }
    }
  }
}
Context Suggester
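This is an extension of the completion suggester that lets a search carry additional context. Elasticsearch supports two kinds of context: category (an arbitrary string) and geo (a geographic location).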
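Setting up a context suggester takes three steps: customize the mapping, index the data together with its context, and run suggest queries that include the context.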
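The three steps can be modelled in plain Python as follows. This is a toy model, not how Elasticsearch stores or searches completion fields: the entries, the category names, and the suggest function are illustrative assumptions that stand in for the mapping, the indexed documents, and the query.

```python
# Steps 1 and 2: each indexed entry carries an input and a category context.
ENTRIES = [
    {"input": "star wars", "category": "movies"},
    {"input": "starbucks", "category": "coffee"},
]


def suggest(prefix, category):
    """Step 3 (hypothetical helper): filter by context, then by prefix."""
    prefix = prefix.lower()
    return [
        entry["input"]
        for entry in ENTRIES
        if entry["category"] == category
        and entry["input"].lower().startswith(prefix)
    ]


print(suggest("sta", "movies"))
print(suggest("sta", "coffee"))
```

The same prefix produces different completions depending on the context supplied with the query.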
A usage example follows. First define the mapping, giving one field the completion type and declaring its context:
PUT comments

PUT comments/_mapping
{
  "properties": {
    "comment_autocomplete": {
      "type": "completion",
      "contexts": [
        {
          "type": "category",
          "name": "comment_category"
        }
      ]
    }
  }
}
Then index the data, setting the completion field with sample inputs and their context:
POST comments/_doc
{
  "comment": "I love the star war movie",
  "comment_autocomplete": {
    "input": ["star wars"],
    "contexts": {
      "comment_category": "movies"
    }
  }
}

POST comments/_doc
{
  "comment": "Where can I find a Starbucks",
  "comment_autocomplete": {
    "input": ["starbucks"],
    "contexts": {
      "comment_category": "coffee"
    }
  }
}
Finally run the query, giving the prefix to complete, the completion field to use, and the context:
POST comments/_search
{
  "suggest": {
    "YOUR_SUGGESTION": {
      "prefix": "sta",
      "completion": {
        "field": "comment_autocomplete",
        "contexts": {
          "comment_category": "movies"
        }
      }
    }
  }
}
The suggest part of the response is shown below. Elasticsearch returns the entry that matches both the input prefix and the context:
"suggest" : {
  "YOUR_SUGGESTION" : [
    {
      "text" : "sta",
      "offset" : 0,
      "length" : 3,
      "options" : [
        {
          "text" : "star wars",
          "_index" : "comments",
          "_type" : "_doc",
          "_id" : "JHZvRnMBVFEAERRHgcsw",
          "_score" : 1.0,
          "_source" : {
            "comment" : "I love the star war movie",
            "comment_autocomplete" : {
              "input" : [ "star wars" ],
              "contexts" : {
                "comment_category" : "movies"
              }
            }
          },
          "contexts" : {
            "comment_category" : [ "movies" ]
          }
        }
      ]
    }
  ]
}
Compared with the phrase and term suggesters in terms of precision, recall, and performance:
Precision: Completion > Phrase > Term
Recall: Term > Phrase > Completion
Performance: Completion > Phrase > Term