Getting Started with Elasticsearch

Introduction

I. Three Core Concepts

II. Advanced Queries

III. How Indexing Works

IV. Analyzers

V. Integrating Spring Boot with ES

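Introduction

I. Three Core Concepts

Index

An index is a collection of documents that share broadly similar characteristics. You can think of it as the counterpart of a database in a relational database system.

An index is identified by a name (which must be all lowercase), and that name is used whenever we index, search, or delete documents in the index.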
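The naming rule can be checked on the client side before a create request is sent. The following Python sketch is illustrative only: the helper `is_valid_index_name` is hypothetical, and beyond the lowercase rule stated above it assumes the extra restrictions documented for recent Elasticsearch versions (none of `\ / * ? " < > |`, space, comma, `#` or `:`; no leading `-`, `_` or `+`; not `.` or `..`; at most 255 bytes).

```python
# Characters assumed to be rejected in index names (see the lead-in).
FORBIDDEN_CHARS = set('\\/*?"<>| ,#:')


def is_valid_index_name(name: str) -> bool:
    """Return True if `name` looks like a legal Elasticsearch index name."""
    if not name or name in (".", ".."):
        return False
    if name != name.lower():              # must be all lowercase
        return False
    if name[0] in "-_+":                  # must not start with -, _ or +
        return False
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    return len(name.encode("utf-8")) <= 255


print(is_valid_index_name("orders"))   # True
print(is_valid_index_name("Orders"))   # False
```

The server performs its own validation, so a check like this only serves to fail early with a clearer message.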

Three index operations (create, delete, list)

# List indices
GET /_cat/indices?v

# Create an index
PUT /orders
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

# Delete an index
DELETE /orders

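Mapping

A mapping defines how a document and the fields it contains are stored and indexed. With the default configuration, ES can create a mapping automatically from the data that is inserted, or you can create one by hand. A mapping mainly consists of field names and field types. A mapping has no meaning apart from its index, so it is normally created at the same time as the index.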
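To make the automatic case concrete, here is a rough Python sketch of how dynamic mapping picks a type from a JSON value. It is a simplification under stated assumptions: `infer_field_type` and `infer_mapping` are hypothetical helpers, date detection on strings is ignored, and only the default rules are modelled (whole numbers become `long`, fractional numbers `float`, strings `text`).

```python
def infer_field_type(value):
    """Guess the type that default dynamic mapping would assign to a JSON value."""
    if isinstance(value, bool):       # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"                 # ES also adds a keyword sub-field by default
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def infer_mapping(doc):
    """Build a mapping-like dict for a flat document."""
    return {"properties": {k: {"type": infer_field_type(v)} for k, v in doc.items()}}


print(infer_mapping({"title": "iphone", "price": 9999.0, "in_stock": True}))
```

Because the inferred types are often broader than needed (`long` rather than `integer`, `text` rather than `keyword`), an explicit mapping is usually the better choice for fields you plan to filter or sort on.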

Creating a mapping

String types: keyword (exact value), text (analyzed full text)

Integer types: integer, long

Floating-point types: float, double

Boolean type: boolean

Date type: date


# Create the product index with an explicit mapping {id, title, price, created_at, description}
PUT /product
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "title": {
        "type": "keyword"
      },
      "price": {
        "type": "double"
      },
      "created_at": {
        "type": "date",
        "format": "yyyy-M-d||yyyy.M.d"
      },
      "description": {
        "type": "text"
      }
    }
  }
}

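# View the mapping of an index
GET /product/_mapping

An existing field mapping cannot be deleted or modified. New fields can be added, but changing a field's type means creating a new index and reindexing the data.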

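Documents

A document is a single record stored in an index and is the smallest unit that can be indexed. ES represents documents as lightweight JSON.

# Add a document
POST /product/_doc/1
{
  "id":"1",
  "title":"iphone",
  "price":"9999",
  "created_at":"2018-6-6",
  "description":"华为P30"
}

# Get a document by id
GET /product/_doc/1

# Delete a document by id
DELETE /product/_doc/1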

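# Full update: the original document is deleted and the new body is indexed in its place
PUT /product/_doc/1
{
  "title": "desktop"
}

# Partial update: the original fields are kept and only the supplied fields change
POST /product/_update/1
{
  "doc":{
    "title":"desktop"
  }
}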
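The difference between the two update styles is easy to miss, so here is a small Python simulation using a plain dict as the document store. It is only a model of the semantics: `put_document` and `partial_update` are hypothetical names, and the merge is top-level only, whereas ES merges nested objects recursively.

```python
def put_document(store, doc_id, body):
    """Model of PUT /index/_doc/<id>: the stored document is replaced wholesale."""
    store[doc_id] = dict(body)


def partial_update(store, doc_id, doc):
    """Model of POST /index/_update/<id> with {"doc": ...}: fields are merged in."""
    store[doc_id] = {**store[doc_id], **doc}


store = {"1": {"title": "iphone", "price": 9999}}

partial_update(store, "1", {"title": "desktop"})
print(store["1"])   # {'title': 'desktop', 'price': 9999}

put_document(store, "1", {"title": "desktop"})
print(store["1"])   # {'title': 'desktop'}
```

After the full replacement the `price` field is gone, which is why the partial form is usually what you want when changing a single field.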

Bulk document operations

A bulk request is not atomic: one failed item does not cause the whole request to fail.

index adds a document, update modifies one, and delete removes one.

# Bulk indexing
POST /product/_bulk
{"index":{"_id":3}}
{"id":"3","title":"辣条","price": "0.5","created_at":"1999.9.9","description":"delicious"}
{"index":{"_id":4}}
{"id":"4","title":"鱼豆腐","price": "1","created_at":"2020.10.10","description":"delicious"}

# Update one document and delete another in the same request
POST /product/_bulk
{"update":{"_id":"1"}}
{"doc":{"title":"keyboard","price":"30"}}
{"delete":{"_id":2}}
II. Advanced Queries

Query DSL: interact with ES by sending a JSON request body through the REST API.

Syntax: GET /<index name>/_search {JSON request body}

GET /product/_search
{
  "query":{
    "match_all": {}
  }
}

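Keyword (term) query

Only text fields are analyzed; every other type is indexed without tokenization.

# term: exact match against the indexed terms
# text fields use the standard analyzer by default, so Chinese text matches only on a single character and English text only on a whole lowercase word
GET /product/_search
{
  "query": {
    "term": {
      "description": {
        "value": "delicious"
      }
    }
  }
}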

Range query

# Range query
GET /product/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 0.5,
        "lte": 3
      }
    }
  }
}

Prefix query

GET /product/_search
{
  "query": {
    "prefix": {
      "description": {
        "value": "del"
      }
    }
  }
}

Wildcard query

* matches zero or more characters; ? matches exactly one character

# Wildcard query
GET /product/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "鱼*"
      }
    }
  }
}
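The two wildcard characters behave like a restricted regular expression applied to the whole term. As a rough illustration, the Python sketch below translates a pattern to a regex and matches it against a term. The helper names are hypothetical, and this models only the matching rule, not how ES actually executes the query against its term dictionary.

```python
import re


def wildcard_to_regex(pattern):
    """Translate an ES-style wildcard pattern into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")        # zero or more characters
        elif ch == "?":
            parts.append(".")         # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern, term):
    """Return True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_match("鱼*", "鱼豆腐"))   # True
print(wildcard_match("鱼?", "鱼豆腐"))   # False: ? stands for one character only
```

The match is against the complete indexed term, which is why the query above works on the keyword field title: the whole value is stored as a single term.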

Multi-ID query (ids)

# Fetch the documents with the given ids
GET /product/_search
{
  "query": {
    "ids": {
      "values": [1,3,4]
    }
  }
}

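Fuzzy query (fuzzy)

A search term of 0-2 characters must match exactly (no fuzziness)

A search term of 3-5 characters allows one edit

A search term longer than 5 characters allows at most two edits

# Fuzzy query
GET /product/_search
{
  "query": {
    "fuzzy": {
      "title": "吃辣条"
    }
  }
}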
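The length rule for fuzzy matching (terms of up to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two) can be sketched in Python as follows. This assumes the AUTO fuzziness setting and uses plain Levenshtein distance; ES additionally counts a transposition of two adjacent characters as a single edit by default, which this sketch does not model. All function names are hypothetical.

```python
def auto_fuzziness(term):
    """Maximum number of edits allowed for a search term under AUTO."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2


def edit_distance(a, b):
    """Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def fuzzy_match(query, term):
    """Return True if term is within the allowed edit distance of query."""
    return edit_distance(query, term) <= auto_fuzziness(query)


print(fuzzy_match("吃辣条", "辣条"))   # True: 3 characters, one edit allowed
print(fuzzy_match("辣条", "吃辣条"))   # False: 2 characters, exact match required
```

This is why the query for 吃辣条 can still find a document whose title is 辣条, while a search for the shorter term gets no tolerance.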

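III. How Indexing Works

Inverted index: also called a reverse index. It looks up the key (the document id) from a value (a term) and then returns the document. This is what ES uses under the hood when it searches an index. An ES index can be thought of as two areas: a metadata area and an index area. The index area is modeled as an inverted index, while the metadata area stores the documents themselves.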
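A toy inverted index makes the idea concrete. The Python sketch below is a minimal model, not how Lucene stores its data: terms come from lowercasing and whitespace splitting, the `docs` dict plays the role of the metadata area, and the helper names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, docs, term):
    """Look the term up in the index area, then fetch the stored documents."""
    return [docs[doc_id] for doc_id in index.get(term.lower(), [])]


docs = {
    1: "delicious spicy snack",
    2: "Delicious fish tofu",
    3: "mechanical keyboard",
}
index = build_inverted_index(docs)
print(index["delicious"])                  # [1, 2]
print(search(index, docs, "keyboard"))     # ['mechanical keyboard']
```

A lookup costs one dictionary access regardless of how many documents exist, which is the reason for indexing by term instead of scanning every document.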

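IV. Analyzers

Analyzer and Analysis

Components of an analyzer

By default ES uses the standard analyzer (StandardAnalyzer). It splits Chinese text into single characters and English text into words.

Every analyzer is built from three components: character filters, tokenizers, and token filters

Character filters

Preprocess the text before it is tokenized, for example by stripping HTML tags

Tokenizer

English can be split into words on whitespace; Chinese word segmentation is more involved

Token filters

Post-process the tokens: change case, remove stop words, add synonyms

Note the order of the three: character filters (zero or more) -----> tokenizer ------> token filters (zero or more)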
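The three stages compose into a simple pipeline. The Python sketch below is a toy version built on stated assumptions: the stop-word list is made up, the tokenizer only splits on whitespace, and none of the function names correspond to a real ES API.

```python
import re

STOP_WORDS = {"the", "a", "is"}   # made-up list for illustration


def strip_html(text):
    """Character filter: replace HTML tags with spaces."""
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenizer(text):
    """Tokenizer: split on runs of whitespace."""
    return text.split()


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]


def remove_stop_words(tokens):
    """Token filter: drop stop words."""
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(text, char_filters, tokenizer, token_filters):
    """Run character filters, then the tokenizer, then token filters, in order."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("<p>The Quick fox</p>", [strip_html],
              whitespace_tokenizer, [lowercase, remove_stop_words]))
# ['quick', 'fox']
```

The order of the token filters matters here: the stop-word list is lowercase, so `lowercase` has to run before `remove_stop_words`.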

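Built-in analyzers

standard: lowercases all words and removes punctuation

simple: lowercases English and drops symbols; Chinese is split on whitespace

whitespace: splits on whitespace only

Chinese analyzer:

Install the IK analysis plugin

// Finest-grained segmentation
POST /_analyze
{
  "analyzer": "ik_max_word",
  "text": "青青草原"
}

// Coarsest-grained segmentation
POST /_analyze
{
  "analyzer": "ik_smart",
  "text":"我最爱读书"
}

IK supports custom extension dictionaries and stop-word dictionaries

Extension dictionary: words that are not treated as keywords by default but that you want recognized as terms

Stop words: words that you do not want to be indexed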

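Filter queries

ES distinguishes two kinds of search: queries and filters. A query is the query clause used in the sections above; it computes a relevance score for every matching document and sorts the results by that score. A filter only selects the documents that match, computes no score, and can have its results cached, so in terms of performance a filter is faster than a query. In other words, filters suit coarse screening over a large amount of data, while queries suit precise matching. In practice, filter the data first and then run the query against what is left.

ES caches frequently used filters

ids filter


# Filter query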
GET /product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term":{
            "description":{
              "value":"delicious"
            }
          }
        }
      ],
      "filter": [
        {
          "ids": {
            "values": [
              "3"
            ]
          }
        }
      ]
    }
  }
}

range filter

# range filter
GET /product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": [
        {
          "range": {
            "price": {
              "gte": 0,
              "lte": 3
            }
          }
        }
      ]
    }
  }
}

exists filter

GET /product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": [
        {
          "exists": {
            "field": "title"
          }
        }
      ]
    }
  }
}
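All three filter examples share the same bool skeleton: a scoring part under must and a non-scoring part under filter. A small Python helper can assemble that request body. This is only a sketch: `filtered_query` is a hypothetical name, and it returns a plain dict that still has to be sent with whatever HTTP or ES client you use.

```python
import json


def filtered_query(must=None, filters=None):
    """Build a bool query body: `must` clauses are scored, `filter` clauses are not."""
    return {
        "query": {
            "bool": {
                "must": must or [{"match_all": {}}],
                "filter": filters or [],
            }
        }
    }


body = filtered_query(
    must=[{"term": {"description": {"value": "delicious"}}}],
    filters=[
        {"range": {"price": {"gte": 0, "lte": 3}}},
        {"exists": {"field": "title"}},
    ],
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Conditions that only include or exclude documents (ids, ranges, field existence) belong in the filter list, leaving must for the clauses whose score should affect ranking.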

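V. Integrating Spring Data with ES

Environment setup

Add the dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

Configure the client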

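import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.RestClients;
import org.springframework.data.elasticsearch.config.AbstractElasticsearchConfiguration;

@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {

    @Override
    @Bean  // ES listens on two ports: 9200 (HTTP/REST) and 9300 (transport)
    public RestHighLevelClient elasticsearchClient(){
        final ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo("localhost:9200")
                .build();
        return RestClients.create(clientConfiguration).rest();
    }
}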

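Client objects

ElasticsearchOperations

Works with ES from Java's everything-is-an-object point of view; its limitation is that the coupling is too tight

Create the entity class


import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

/**
 * @Document maps objects of this class to documents stored in ES
 *     indexName: name of the index that holds the documents
 *     createIndex: whether to create the index
 */
@Document(indexName = "products",createIndex = true)
public class Product {
    @Id // use the object's id as the document _id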
    private Integer id;
    @Field(type = FieldType.Keyword)
    private String title;
    @Field(type = FieldType.Double)
    private Double price;
    @Field(type = FieldType.Text,analyzer = "ik_max_word")
    private String description;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public Double getPrice() {
        return price;
    }

    public void setPrice(Double price) {
        this.price = price;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }
}

Tests


import com.elasticsearch.entity.Product;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.Query;

public class ElasticSearchOptionTest extends SpringBootElasticsearchApplicationTests {

    private final ElasticsearchOperations elasticsearchOperations;

    // Constructor injection only; a second @Autowired on the field is redundant
    @Autowired
    public ElasticSearchOptionTest(ElasticsearchOperations elasticsearchOperations){
        this.elasticsearchOperations = elasticsearchOperations;
    }

    /**
     * save indexes a document or updates it:
     * it creates the document if it does not exist and updates it if it does
     */
    @Test
    public void indexTest(){
        Product product = new Product();
        product.setId(2);
        product.setTitle("卫龙辣条");
        product.setPrice(0.50);
        product.setDescription("delicious");
        elasticsearchOperations.save(product);
    }

    /**
     * Fetch a single document by id
     */
    @Test
    public void searchTest(){
        Product product = elasticsearchOperations.get("1", Product.class);
        System.out.println(product.getId()+product.getTitle()+product.getPrice()+product.getDescription());
    }

    /**
     * Delete a single document
     */
    @Test
    public void deleteTest(){
        Product product = new Product();
        product.setId(1);
        elasticsearchOperations.delete(product);
    }

    /**
     * Delete all documents in one call
     */
    @Test
    public void deleteAllTest(){
        System.out.println(elasticsearchOperations.delete(Query.findAll(), Product.class));
    }

    /**
     * Fetch all documents
     */
    @Test
    public void findAllTest() throws JsonProcessingException {
        SearchHits<Product> searchHits = elasticsearchOperations.search(Query.findAll(), Product.class);
        System.out.println("Max score: "+searchHits.getMaxScore());
        System.out.println("Total hits: "+searchHits.getTotalHits());
        ObjectMapper objectMapper = new ObjectMapper();
        for (SearchHit<Product> hit : searchHits){
            System.out.println(objectMapper.writeValueAsString(hit.getContent()));
        }
    }
}

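ElasticsearchOperations makes working with Elasticsearch simple, but it is quite limited and struggles with complex cases.

RestHighLevelClient (recommended) works with ES from the REST point of view, much like working in Kibana

Creating and deleting indices and mappings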

package com.elasticsearch;

import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;

import java.io.IOException;

public class RestHighLevelClientTest extends SpringBootElasticsearchApplicationTests{

    private final RestHighLevelClient restHighLevelClient;

    @Autowired
    public RestHighLevelClientTest(RestHighLevelClient restHighLevelClient) {
        this.restHighLevelClient = restHighLevelClient;
    }

    /**
     * Create an index together with its mapping
     */
    @Test
    public void indexMappingTest() throws IOException {
        CreateIndexRequest createIndexRequest = new CreateIndexRequest("goods");
        // Set the mapping. Argument 1: the mapping source; argument 2: its content type (JSON).
        // It is usually easiest to write the mapping in Kibana first and paste it here.
        createIndexRequest.mapping("{\n" +
                "    \"properties\": {\n" +
                "      \"id\":{\n" +
                "        \"type\": \"integer\"\n" +
                "      },\n" +
                "      \"title\":{\n" +
                "        \"type\": \"keyword\"\n" +
                "      },\n" +
                "      \"price\":{\n" +
                "        \"type\": \"double\"\n" +
                "      },\n" +
                "      \"created_at\":{\n" +
                "        \"type\": \"date\",\n" +
                "        \"format\": \"yyyy-M-d||yyyy.M.d\"\n" +
                "      },\n" +
                "      \"description\":{\n" +
                "        \"type\": \"text\"\n" +
                "      }\n" +
                "    }\n" +
                "  }", XContentType.JSON);

        // Argument 1: the create-index request; argument 2: request options
        CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
        System.out.println("Acknowledged: "+createIndexResponse.isAcknowledged());
        // The client is a Spring-managed bean, so it is not closed here; closing it would break later tests
    }

    /**
     * Delete an index
     */
    @Test
    public void deleteIndexTest() throws IOException {
        // Argument 1: the delete-index request; argument 2: request options
        AcknowledgedResponse acknowledgedResponse = restHighLevelClient.indices().delete(new DeleteIndexRequest("goods"), RequestOptions.DEFAULT);
        System.out.println(acknowledgedResponse.isAcknowledged());
    }

}

CRUD operations on documents

package com.elasticsearch;

import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import java.io.IOException;

public class RestHighLevelClientDocumentTest extends SpringBootElasticsearchApplicationTests{
    private final RestHighLevelClient restHighLevelClient;

    @Autowired
    public RestHighLevelClientDocumentTest(RestHighLevelClient restHighLevelClient) {
        this.restHighLevelClient = restHighLevelClient;
    }

    /**
     * Index a document
     */
    @Test
    public void createDocumentTest() throws IOException {
        IndexRequest indexRequest = new IndexRequest("product");
        indexRequest
                .id("2")
                .source("{\n" +
                        "  \"id\":\"2\",\n" +
                        "  \"title\":\"辣条\",\n" +
                        "  \"price\":\"0.5\",\n" +
                        "  \"created_at\":\"2022-3-30\",\n" +
                        "  \"description\":\"不错吃\"\n" +
                        "}", XContentType.JSON);
        // Argument 1: the index request; argument 2: request options
        IndexResponse index = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        System.out.println(index.status());
    }

    /**
     * Partially update a document
     * @throws IOException if the request fails
     */
    @Test
    public void updateTest() throws IOException {
        // Argument 1: index name; argument 2: id of the document to update
        UpdateRequest updateRequest = new UpdateRequest("product","1");
        updateRequest.doc("{\n" +
                "  \"title\":\"每日坚果\",\n" +
                "  \"description\":\"好吃\"\n" +
                "}", XContentType.JSON);
        UpdateResponse update = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
        System.out.println(update.status());
    }

    /**
     * Delete a document
     */
    @Test
    public void deleteTest() throws IOException {
        // Argument 1: index name; argument 2: id of the document to delete
        DeleteRequest deleteRequest = new DeleteRequest("product", "1");
        // Argument 1: the delete request; argument 2: request options
        DeleteResponse delete = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
        System.out.println(delete.status());
    }

    /**
     * Get a document by id
     */
    @Test
    public void getTest() throws IOException {
        // Argument 1: index to query; argument 2: id of the document
        GetRequest getRequest = new GetRequest("product","1");
        GetResponse getResponse = restHighLevelClient.get(getRequest,RequestOptions.DEFAULT);
        System.out.println(getResponse.getSourceAsString());
    }
}

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/langs/716541.html
