Elasticsearch Basics


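Contents

Elasticsearch Basics
- Concepts
- Installation
- Basic syntax
  - Basic information queries
  - The four request types
    - POST vs. PUT
    - DELETE
    - GET
    - Bulk operations
- Search (Query DSL syntax)
  - Querying strings
  - Querying non-string values
  - bool queries
  - filter
- Analysis (aggregations)
  - Grouping with aggregations
- Other
  - Mapping
  - Reindex: data migration
  - Analyzers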

Elasticsearch has three main uses: storage, search (query), and analysis (analyze).

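Concepts

An index is comparable to a database, a type to a table, and a document to a row of data.

Types are deprecated in ES 7.x and removed in 8.0. The typed paths that appear in some examples here (such as customer/external/1) still work on 7.17 but emit a deprecation warning; the typeless form uses _doc instead (for example customer/_doc/1).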

Installation

docker pull elasticsearch:7.17.3

mkdir -p /usr/local/elasticsearch/config
mkdir -p /usr/local/elasticsearch/data
## Listen on all interfaces so that ES accepts connections from any IP
echo "http.host: 0.0.0.0" >> /usr/local/elasticsearch/config/elasticsearch.yml

## Make the mounted directories readable and writable by the container user
chmod -R 777 /usr/local/elasticsearch/ 
docker run --name elasticsearch \
-p 40003:9200 \
-p 40004:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /usr/local/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /usr/local/elasticsearch/data:/usr/share/elasticsearch/data \
-v /usr/local/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.17.3

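Basic syntax

Basic information queries

GET /_cat/nodes: list all nodes
GET /_cat/health: show cluster health
GET /_cat/master: show the master node
GET /_cat/indices: list all indices (comparable to show databases in MySQL)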

The four request types

POST vs. PUT

Creating: both can create a document. PUT requires an explicit id; POST can omit the id, in which case ES generates one.
```
PUT customer/external/1
{
	"name": "John Doe2"
}
```
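Updating:
- Without _update: POST and PUT behave the same. Both replace the whole document and increment its version, even when the content is identical.
- With _update: only POST accepts _update. The fields under doc are merged into the stored document and compared with its source; if nothing changes, the request is a no-op and the version is not incremented. PUT cannot be combined with _update.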
```
POST customer/external/1/_update
{
  "doc":{
  	"name": "John Doew"
  }
}
```
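
The difference between the two update styles can be modeled in a few lines of Python. This is a minimal sketch that imitates the behavior locally rather than calling Elasticsearch; the function names and the in-memory record layout (`_version`, `_source`) are assumptions made for illustration.

```python
import copy


def merge(source, partial):
    """Recursively merge `partial` into a copy of `source` (objects merge, other values replace)."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_partial_update(stored, partial):
    """Local model of a POST with _update and a "doc" body.

    `stored` is a hypothetical in-memory record: {"_version": int, "_source": dict}.
    Returns the result string and the (possibly new) record.
    """
    merged = merge(stored["_source"], partial)
    if merged == stored["_source"]:
        # Nothing changed: no write happens and the version is untouched.
        return "noop", stored
    return "updated", {"_version": stored["_version"] + 1, "_source": merged}


def apply_full_index(stored, document):
    """Local model of PUT/POST without _update: always replaces the source and bumps the version."""
    return "updated", {"_version": stored["_version"] + 1, "_source": copy.deepcopy(document)}


record = {"_version": 1, "_source": {"name": "John Doe2"}}
print(apply_partial_update(record, {"name": "John Doe2"})[0])          # identical value
print(apply_partial_update(record, {"name": "John Doew"})[0])          # changed value
print(apply_full_index(record, {"name": "John Doe2"})[1]["_version"])  # full replace
```

The three lines print `noop`, `updated`, and `2`: the partial update skips the write when the merged source is unchanged, while a full index request bumps the version even for identical content.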

DELETE

DELETE customer/external/1

GET

REST API (query parameters in the URI)

GET bank/_search?q=*&sort=account_number:asc

Query DSL

GET bank/_search
{
  "query": {
  	"match_all": {}
  },
  "sort": [
    {
      "account_number": {
      	"order": "desc"
      }
    }
  ]
}

Bulk operations

POST /_bulk
{ "delete": { "_index": "website", "_id": "123" }}
{ "create": { "_index": "website", "_id": "123" }}
{ "title": "My first blog post" }
{ "index": { "_index": "website" }}
{ "title": "My second blog post" }
{ "update": { "_index": "website", "_id": "123", "retry_on_conflict": 3 }}
{ "doc": { "title": "My updated blog post" }}
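
A bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and a newline after the last line. The following is a minimal Python sketch of assembling such a body; the helper name `build_bulk_body` is an assumption, and the metadata uses the typeless ES 7 form (no `_type`, and `retry_on_conflict` without a leading underscore).

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into an NDJSON _bulk body.

    `source` is None for actions that have no payload line (delete).
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


operations = [
    ("delete", {"_index": "website", "_id": "123"}, None),
    ("create", {"_index": "website", "_id": "123"}, {"title": "My first blog post"}),
    ("index", {"_index": "website"}, {"title": "My second blog post"}),
    ("update", {"_index": "website", "_id": "123", "retry_on_conflict": 3},
     {"doc": {"title": "My updated blog post"}}),
]

body = build_bulk_body(operations)
print(body, end="")
```

When sending the body over HTTP, set the Content-Type header to application/x-ndjson.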

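Search (Query DSL syntax)

Querying strings

match (full-text search): the query text is analyzed into terms, and a document matches if the field contains any of them.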

GET bank/_search
{
  "query": {
    "match": {
 	   "address": "mill road"
    }
  }
}

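match_phrase (phrase match): the query is not split into independent terms; all of its terms must appear in the field, adjacent to each other and in the same order.

multi_match: runs the same match query against several fields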

GET bank/_search
{
  "query": {
    "multi_match": {
      "query": "mill",
      "fields": ["state","address"]
    }
  }
}

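Note: match_phrase returns documents that contain the phrase. To match only documents whose whole field value equals the query, query the field's keyword sub-field.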

GET bank/_search
{
  "query": {
    "match": {
      "address.keyword": "Madison"
    }
  }
}
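
The three matching behaviors can be compared with a small local model. This is a rough Python sketch, not Elasticsearch code: `analyze` is a simplified stand-in for the standard analyzer (lowercase, split on non-alphanumerics), and the sample addresses are made up for illustration.

```python
import re


def analyze(text):
    """Very rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def match(field_value, query):
    """match: the query is analyzed; any shared token is a hit (default OR operator)."""
    return bool(set(analyze(field_value)) & set(analyze(query)))


def match_phrase(field_value, query):
    """match_phrase: all query tokens must appear adjacent and in order."""
    doc, phrase = analyze(field_value), analyze(query)
    if not phrase:
        return False
    return any(doc[i:i + len(phrase)] == phrase for i in range(len(doc) - len(phrase) + 1))


def keyword_equals(field_value, query):
    """Query on the .keyword sub-field: the whole, unanalyzed value must be equal."""
    return field_value == query


addresses = ["990 Mill Road", "198 Mill Lane", "Madison"]  # made-up sample values
for address in addresses:
    print(address,
          match(address, "mill road"),
          match_phrase(address, "mill road"),
          keyword_equals(address, "Madison"))
```

In this model, "198 Mill Lane" is a match hit for "mill road" (it shares the token mill) but not a match_phrase hit, and only the value "Madison" passes the keyword comparison.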

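Querying non-string values

term: exact match against a single unanalyzed value. Use it for non-text fields (numbers, dates, booleans, keyword fields); do not use it on analyzed text fields, where the indexed terms rarely equal the original string.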
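
As a minimal sketch of what a term request body looks like, the Python snippet below builds one for a numeric field and one for a keyword sub-field. The helper name `term_search` and the value 28 are assumptions chosen for illustration; `age` and `address.keyword` are fields of the bank index used elsewhere in this article.

```python
import json


def term_search(field, value):
    """Build a search body that matches documents whose `field` holds exactly `value`."""
    return {"query": {"term": {field: value}}}


# Exact lookup on a numeric field; 28 is an arbitrary example value.
age_query = term_search("age", 28)

# Exact string lookup: target the keyword sub-field, not the analyzed text field.
address_query = term_search("address.keyword", "Madison")

for body in (age_query, address_query):
    print("GET bank/_search")
    print(json.dumps(body, indent=2))
```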

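bool queries

must: every condition listed under must has to match.

should: conditions that ought to match. When the bool query also has a must or filter clause, a matching should clause only raises the document's relevance score and does not change which documents are returned.

must_not: the listed conditions must not match. It behaves like a filter and does not affect the score.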

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "gender": "M" } }
      ],
      "should": [
        { "match": { "address": "lane" } }
      ],
      "must_not": [
        { "match": { "email": "baluba.com" } }
      ]
    }
  }
}

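filter

filter keeps only the documents that satisfy its condition and does not contribute to the relevance score; scoring comes from the query clauses (must in the example here).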

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
      	{"match": { "address": "mill"}}
      ],
      "filter": {
        "range": {
          "balance": {
              "gte": 10000,
              "lte": 20000
          }
        }
      }
    }
  }
}

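Analysis (aggregations)

Grouping with aggregations

Example: bucket the accounts by age and, within each age bucket, compute the average balance for M, the average balance for F, and the overall average balance of the bucket.

This is done by nesting aggs inside aggs, similar in spirit to chaining stream operations.

GET bank/_search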
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_agg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "gender_agg": {
          "terms": {
            "field": "gender.keyword",
            "size": 100
          },
          "aggs": {
            "balance_avg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "balance_avg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 1000
}
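
The nesting of the request is mirrored in the response: each age bucket carries its own `balance_avg` and a nested `gender_agg` with one bucket per gender. The following Python sketch flattens that shape into one row per age. The function name is an assumption, and the sample response is hand-written and trimmed to the relevant keys, with invented numbers; it is not real output.

```python
def flatten_age_buckets(response):
    """Turn the nested aggregation response into one row per age bucket."""
    rows = []
    for age_bucket in response["aggregations"]["age_agg"]["buckets"]:
        row = {
            "age": age_bucket["key"],
            "count": age_bucket["doc_count"],
            "avg_balance": age_bucket["balance_avg"]["value"],
        }
        # One nested bucket per gender value, each with its own average.
        for gender_bucket in age_bucket["gender_agg"]["buckets"]:
            row["avg_balance_" + gender_bucket["key"]] = gender_bucket["balance_avg"]["value"]
        rows.append(row)
    return rows


# Hand-written sample in the shape of a response; the numbers are invented.
sample_response = {
    "aggregations": {
        "age_agg": {
            "buckets": [
                {
                    "key": 31,
                    "doc_count": 3,
                    "balance_avg": {"value": 20000.0},
                    "gender_agg": {
                        "buckets": [
                            {"key": "M", "doc_count": 2, "balance_avg": {"value": 25000.0}},
                            {"key": "F", "doc_count": 1, "balance_avg": {"value": 10000.0}},
                        ]
                    },
                }
            ]
        }
    }
}

for row in flatten_age_buckets(sample_response):
    print(row)
```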

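Other

Mapping

A mapping defines how a document and the fields it contains are stored and indexed, similar to column types in MySQL. It can be specified when the index is created.

Creating a mapping (index controls whether a field is searchable). The type here is the field's data type, which is unrelated to the _type document type supported in ES 6.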

PUT /my-index
{
  "mappings": {
    "properties": {
      "age": {
        "type": "integer"
      },
      "email": {
        "type": "keyword"
      },
      "name": {
        "type": "text",
        "index": true
      }
    }
  }
}

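Adding a field to an existing mapping (use PUT on the _mapping endpoint; a document POST or update is not the way to do it)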

PUT /my-index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false
    }
  }
}

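The mapping of an existing field cannot be changed! To change it, create a new index with the desired mapping and migrate the data into it with _reindex.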

Reindex: data migration

The request line is always POST _reindex; only the body changes.

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "new_twitter"
  }
}
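
Changing a mapping therefore takes two requests in order: create the destination index with the new mapping, then copy the documents. The Python sketch below only assembles those requests so the sequence is explicit; the helper name `migration_plan` and the `user` field in the target mapping are placeholders, not taken from a real index.

```python
import json


def migration_plan(source_index, dest_index, new_mappings):
    """Return the ordered (method, path, body) requests for a mapping change via reindex."""
    return [
        # 1. Create the destination index with the corrected mapping.
        ("PUT", "/" + dest_index, {"mappings": new_mappings}),
        # 2. Copy every document from the old index into the new one.
        ("POST", "/_reindex", {"source": {"index": source_index}, "dest": {"index": dest_index}}),
    ]


# Hypothetical target mapping: the field name and type are placeholders.
new_mappings = {"properties": {"user": {"type": "keyword"}}}

plan = migration_plan("twitter", "new_twitter", new_mappings)
for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```

Creating the destination index first matters: if _reindex runs against a missing index, ES creates it with dynamically inferred mappings instead of the intended ones.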

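Analyzers

ES ships with analyzers that work well for English, but none that segments Chinese text into words; a Chinese analyzer has to be installed separately.

Install the IK analyzer from https://github.com/medcl/elasticsearch-analysis-ik/releases
Deploy nginx to serve the custom dictionary; the IK configuration lives in /usr/local/elasticsearch/plugins/ik/config
Once IK can fetch the dictionary, the analyzer recognizes the custom words automatically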

Original article (内存溢出): http://outofmemory.cn/langs/893854.html
