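- 1. Getting familiar with Spring Boot and Elasticsearch integration
- 1.1 Official documentation
- 1.2 Creating and configuring the integration project
- 1.3 Testing indices: create, get, delete
- 1.4 Testing documents: create, read, update, delete
- 2. Elasticsearch in practice: JD-style home page search with highlighting
- 2.1 Creating the project
- 2.2 Basic crawler for pulling data (jsoup)
- 2.3 Writing the service layer interface and implementation
- 2.4 Writing the Controller layer
- 2.5 Testing the endpoints
- 2.6 Separating front end and back end (a simple use of Vue)
- 2.7 Highlighting the keyword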
Official Elasticsearch documentation: https://www.elastic.co/guide/index.html
The REST-style client is the recommended way to operate ES; the official Java REST Client guide can be followed directly:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/index.html
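1. Add the Spring Boot Elasticsearch client dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>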
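2. Align the versions

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.2.5.RELEASE</version>
</parent>

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>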
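3. Add the other key dependencies used later

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <optional>true</optional>
</dependency>
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.70</version>
</dependency>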
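4. Create the configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchRestClientConfig {

    // Register the REST high-level client as a bean in the Spring container.
    // Keeping the method name in line with the return type makes
    // by-name autowiring easier later on.
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
    }
}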
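5. Create a test entity class

import java.io.Serializable;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data               // generates the getters and setters
@NoArgsConstructor  // generates the no-args constructor
@AllArgsConstructor // generates the all-args constructor
public class User implements Serializable {
    private String name;
    private Integer age;
}

1.3 Testing indices: create, get, delete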
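- First start the elasticsearch service and the es-head plugin.
- Then start the project's main class, SpringbootElasticsearchApiApplication. The RestHighLevelClient has to be registered in the Spring container before any test runs, otherwise the later tests fail. This was learned the hard way.
- The tests are best written in the SpringbootElasticsearchApplicationTests class under the test package.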
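6.1 Create an index

import java.io.IOException;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class SpringbootElasticsearchApplicationTests {

    @Autowired
    RestHighLevelClient restHighLevelClient;

    @Test
    public void testPUTCreateIndex() throws IOException {
        // Build the create-index request; the index name is set in the constructor
        CreateIndexRequest request = new CreateIndexRequest("yxj_index");
        // Send the request with the default options and get the response
        CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged()); // whether the index was created
        System.out.println(response);                  // the response object
        restHighLevelClient.close();                   // always close the client when done
    }
}

Console output:

true
org.elasticsearch.client.indices.CreateIndexResponse@5565235d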
6.2 Get an index and check whether it exists

@Test
public void testGETIndexAndIsExists() throws IOException {
    // Build the get-index request
    GetIndexRequest request = new GetIndexRequest("yxj_index");
    // Fetch the index
    GetIndexResponse response = restHighLevelClient.indices().get(request, RequestOptions.DEFAULT);
    // Check whether the index exists
    boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(response.getIndices()); // index information (no data in it yet)
    System.out.println(exists);                // whether the index exists
    restHighLevelClient.close();               // always close the client when done
}

Console output:

[Ljava.lang.String;@36790bec
true
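The first console line above, `[Ljava.lang.String;@36790bec`, is only the default `toString()` of a Java array, because `getIndices()` returns a `String[]`. A minimal JDK-only sketch of printing the names instead; the `indices` array here is a hypothetical stand-in for `response.getIndices()`, and `IndexNamePrinter` is a name made up for this illustration:

```java
import java.util.Arrays;

final class IndexNamePrinter {

    // Renders an array of index names in readable form, e.g. "[yxj_index]".
    static String format(String[] indices) {
        return Arrays.toString(indices);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for response.getIndices()
        String[] indices = {"yxj_index"};
        System.out.println(format(indices)); // prints [yxj_index]
    }
}
```

In the test itself, the same effect should be reachable by wrapping the call as `Arrays.toString(response.getIndices())`.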
6.3 Delete an index

@Test
public void testDeleteIndex() throws IOException {
    // Build the delete-index request
    DeleteIndexRequest request = new DeleteIndexRequest("yxj_index");
    // Send it and get the response
    AcknowledgedResponse response = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println(response.isAcknowledged()); // whether the deletion succeeded
    restHighLevelClient.close();
}

Console output:

true

1.4 Testing documents: create, read, update, delete
1. Add a document

@Test
void testAdddocument() throws IOException {
    // Create the object
    User user = new User("一宿君", 21);
    // Build the request against the target index
    IndexRequest request = new IndexRequest("yxj_index");
    // Equivalent to: PUT /yxj_index/_doc/1
    request.id("1");
    request.timeout("1s"); // timeout of 1 second
    request.timeout(TimeValue.timeValueMinutes(1)); // either form works; the later call overrides the earlier one
    // Put the data into the request as JSON
    request.source(JSON.toJSONString(user), XContentType.JSON);
    // Send the request and get the response
    IndexResponse response = restHighLevelClient.index(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the document operation
    System.out.println(response);          // full response of the document operation
    restHighLevelClient.close();
}

Console output:

CREATED
IndexResponse[index=yxj_index,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]
2. Get a document

@Test
void testGetDocumntAndIsExits() throws IOException {
    // Build the get request with the index name and document id
    GetRequest request = new GetRequest("yxj_index", "1");
    // To only check existence, skip fetching _source; this is more efficient
    //request.fetchSourceContext(new FetchSourceContext(false));
    // Do not fetch any stored fields
    //request.storedFields("_none_");
    // Check that the document exists before fetching it
    boolean exists = restHighLevelClient.exists(request, RequestOptions.DEFAULT);
    if (exists) {
        System.out.println("文档存在。。。");
        // Fetch the document. If the _source filter above is enabled on this request,
        // no content comes back, so keep those lines commented out here.
        GetResponse response = restHighLevelClient.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsString()); // the document source as a string
        System.out.println(response);                     // the full response (same as the Kibana command output)
    } else {
        System.out.println("文档不存在!!!");
    }
    restHighLevelClient.close(); // close the client
}

Console output:

文档存在。。。
{"age":21,"name":"一宿君"}
{"_index":"yxj_index","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":{"age":21,"name":"一宿君"}}
3. Update a document

@Test
void testUpdatedocument() throws IOException {
    // Build the update request
    UpdateRequest request = new UpdateRequest("yxj_index", "1");
    // Build the new data
    User user = new User("一宿君Java", 19);
    // Put the data into the request as JSON
    request.doc(JSON.toJSONString(user), XContentType.JSON);
    // Send the request
    UpdateResponse response = restHighLevelClient.update(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the update
    restHighLevelClient.close();           // close the client
}

Console output:

OK
4. Delete a document

@Test
void testDeletedocument() throws IOException {
    // Build the delete request
    DeleteRequest request = new DeleteRequest("yxj_index", "1");
    // Send the request
    DeleteResponse response = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the deletion
    restHighLevelClient.close();           // close the client
}

Console output:

OK
5. Bulk-insert documents

@Test
void testBulkInsertdocument() throws IOException {
    // Build the bulk request
    BulkRequest request = new BulkRequest();
    request.timeout("1s");
    // Build the list of documents
    List<User> userList = new ArrayList<>();
    userList.add(new User("一宿君1", 1));
    userList.add(new User("一宿君2", 2));
    userList.add(new User("一宿君3", 3));
    userList.add(new User("一宿君4", 4));
    userList.add(new User("一宿君5", 5));
    userList.add(new User("一宿君6", 6));
    // Add the documents to the bulk request
    for(int i=0;i
6. Search documents with a condition

@Test
void testHasConditionSearch() throws IOException {
    // Build the search request
    SearchRequest request = new SearchRequest();
    // Build the search source (the query conditions)
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("name", "一宿君");
    //TermQueryBuilder queryBuilder = QueryBuilders.termQuery("name","一宿君");
    // Put the query into the search source
    searchSourceBuilder.query(matchQueryBuilder);
    // Enable highlighting (no fields are configured here, so highlightFields stays empty in the output)
    searchSourceBuilder.highlighter(new HighlightBuilder());
    // Paging: start at offset 0 and return 3 hits per page
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(3);
    // Put the search source into the request
    request.source(searchSourceBuilder);
    // Restrict the search to one index; without this, all indices are searched
    request.indices("bulk_index");
    // Send the request
    SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the search
    System.out.println(response);          // the full response
    // Get the result set and iterate over it
    SearchHits hits = response.getHits();  // the whole "hits" element, with all its metadata
    System.out.println(JSON.toJSONString(hits)); // the result set as JSON
    System.out.println("============================================================");
    // The inner hits array is where the documents are
    for (SearchHit documentFields : hits.getHits()) {
        System.out.println(documentFields.getSourceAsString()); // source as a string
        //System.out.println(documentFields.getSourceAsMap());  // source as a map
    }
    restHighLevelClient.close(); // close the client
}

Console output:

OK
{"took":19,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":6,"relation":"eq"},"max_score":0.22232392,"hits":[{"_index":"bulk_index","_type":"_doc","_id":"1","_score":0.22232392,"_source":{"age":1,"name":"一宿君1"}},{"_index":"bulk_index","_type":"_doc","_id":"2","_score":0.22232392,"_source":{"age":2,"name":"一宿君2"}},{"_index":"bulk_index","_type":"_doc","_id":"3","_score":0.22232392,"_source":{"age":3,"name":"一宿君3"}}]}}
{"fragment":true,"hits":[{"fields":{},"fragment":false,"highlightFields":{},"id":"1","matchedQueries":[],"primaryTerm":0,"rawSortValues":[],"score":0.22232392,"seqNo":-2,"sortValues":[],"sourceAsMap":{"name":"一宿君1","age":1},"sourceAsString":"{\"age\":1,\"name\":\"一宿君1\"}","sourceRef":{"fragment":true},"type":"_doc","version":-1},{"fields":{},"fragment":false,"highlightFields":{},"id":"2","matchedQueries":[],"primaryTerm":0,"rawSortValues":[],"score":0.22232392,"seqNo":-2,"sortValues":[],"sourceAsMap":{"name":"一宿君2","age":2},"sourceAsString":"{\"age\":2,\"name\":\"一宿君2\"}","sourceRef":{"fragment":true},"type":"_doc","version":-1},{"fields":{},"fragment":false,"highlightFields":{},"id":"3","matchedQueries":[],"primaryTerm":0,"rawSortValues":[],"score":0.22232392,"seqNo":-2,"sortValues":[],"sourceAsMap":{"name":"一宿君3","age":3},"sourceAsString":"{\"age\":3,\"name\":\"一宿君3\"}","sourceRef":{"fragment":true},"type":"_doc","version":-1}],"maxScore":0.22232392,"totalHits":{"relation":"EQUAL_TO","value":6}}
============================================================
{"age":1,"name":"一宿君1"}
{"age":2,"name":"一宿君2"}
{"age":3,"name":"一宿君3"}

2. Elasticsearch in practice: JD-style home page search with highlighting
2.1 Creating the project
Static page resource pack:
Link: https://pan.baidu.com/s/1L8_NtjVLMmOooK2m-L0Tlw
Extraction code: 9gjc
Configure application.properties:

# change the port
server.port=9090
# disable the thymeleaf cache
spring.thymeleaf.cache=false

Add the dependencies (pay particular attention to the version numbers):
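<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.2.5.RELEASE</version>
</parent>

<groupId>com.wbs</groupId>
<artifactId>springboot-elasticsearch-jd</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>springboot-elasticsearch-jd</name>

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.70</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-configuration-processor</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

Write the IndexController: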
@Controller
public class IndexController {

    @RequestMapping({"/", "/index"})
    public String toIndex() {
        return "index";
    }
}

Start the project and open localhost:9090 to confirm that the project starts normally and the home page is reachable:
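2.2 Basic crawler for pulling data (jsoup)
There are many ways to obtain data:
- a database
- a message queue
- a cache
- a crawler
- and so on
1. First add the jsoup dependency

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>

2. Go to the JD home page and search for a product keyword
Look at the URL in the address bar: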
https://search.jd.com/Search?keyword=Java&enc=utf-8
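The keyword travels in the `keyword` query parameter of this URL. "Java" needs no escaping, but a keyword containing spaces or Chinese characters should be percent-encoded before it is concatenated into the URL. A minimal JDK-only sketch, assuming UTF-8 (matching `enc=utf-8`); the class and method names are made up for this illustration:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class JdSearchUrl {

    // Builds the search URL with the keyword percent-encoded as UTF-8.
    static String build(String keyword) throws UnsupportedEncodingException {
        return "https://search.jd.com/Search?keyword="
                + URLEncoder.encode(keyword, "UTF-8")
                + "&enc=utf-8";
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(build("Java"));        // the keyword is left unchanged
        System.out.println(build("spring boot")); // the space is encoded as '+'
    }
}
```

The two-argument `URLEncoder.encode(String, String)` is used because the project targets Java 1.8.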
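3. Inspect the page elements
4. Write a utility class that crawls the data (fetch the returned page and pick out the useful parts)

import java.io.IOException;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HtmlParseUtilTest {
    public static void main(String[] args) throws IOException {
        // 1. The request URL
        String url = "https://search.jd.com/Search?keyword=Java&enc=utf-8";
        // 2. Parse the page (jsoup returns the equivalent of the browser's document object,
        //    so every HTML element on the page can be operated on)
        Document document = Jsoup.parse(new URL(url), 30000);
        // 3. Get the product list element by the id found while inspecting the page
        Element element = document.getElementById("J_goodsList");
        // 4. Get every li element under the ul inside that element
        Elements elements = element.getElementsByTag("li");
        // 5. Read the product attributes under each li: img, price, name, ...
        for (Element el : elements) {
            System.out.println("img-src:" + el.getElementsByTag("img").eq(0).attr("src")); // first image under the li
            System.out.println("name:" + el.getElementsByClass("p-name").eq(0).text());    // product name
            System.out.println("price:" + el.getElementsByClass("p-price").eq(0).text());  // product price
            System.out.println("shopname:" + el.getElementsByClass("hd-shopname").eq(0).text()); // shop (publisher) name
            System.out.println("================================================================================================");
        }
    }
}

The result above is explained by how large sites handle images: with so many of them on a page, they are usually rendered with deferred loading (lazy loading), which improves response time.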
Change the attribute used to read the image to data-lazy-img.
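With lazy loading, the real image address sits in `data-lazy-img` and `src` may be empty, so it can help to read both and fall back. The sketch below is JDK-only and works on the two attribute strings (as returned by jsoup's `attr(...)`); the protocol-relative form `//host/path` and the example host are assumptions for illustration, not something confirmed for JD's markup:

```java
final class LazyImage {

    // Prefers the lazy-load attribute, falls back to src, and adds a scheme
    // to protocol-relative URLs (an assumed form: "//host/path").
    static String pick(String lazyImg, String src) {
        String chosen = (lazyImg != null && !lazyImg.isEmpty()) ? lazyImg : src;
        if (chosen == null || chosen.isEmpty()) {
            return "";
        }
        return chosen.startsWith("//") ? "https:" + chosen : chosen;
    }

    public static void main(String[] args) {
        // Hypothetical attribute values
        System.out.println(pick("//img.example.com/a.jpg", ""));  // lazy attribute wins
        System.out.println(pick("", "https://img.example.com/b.jpg")); // falls back to src
    }
}
```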
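5. Write an entity class to hold the product attributes

@Data
@NoArgsConstructor
@AllArgsConstructor
public class Product implements Serializable {
    private String name;
    private String img;
    private String price;
    private String shopname;
    // ... add more attributes as needed; only a few key ones are listed here
}

6. Rework the page-parsing utility class so that it returns the data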
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HtmlParseUtil {

    public static void main(String[] args) throws IOException {
        new HtmlParseUtil().parseJD("Java").forEach(System.out::println);
    }

    public List<Product> parseJD(String keyword) throws IOException {
        // 1. The request URL
        String url = "https://search.jd.com/Search?keyword=" + keyword + "&enc=utf-8";
        // 2. Parse the page (jsoup returns the equivalent of the browser's document object)
        Document document = Jsoup.parse(new URL(url), 30000);
        // 3. Get the product list element by the id found while inspecting the page
        Element element = document.getElementById("J_goodsList");
        // 4. Get every li element under the ul inside that element
        Elements elements = element.getElementsByTag("li");
        // 5. Create the collection that stores the data
        List<Product> productArrayList = new ArrayList<>();
        // 6. Read img, price, name and shopname under each li and add them to the collection
        for (Element el : elements) {
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img"); // first image under the li
            String name = el.getElementsByClass("p-name").eq(0).text();           // product name
            String price = el.getElementsByClass("p-price").eq(0).text();         // product price
            String shopname = el.getElementsByClass("hd-shopname").eq(0).text();  // shop (publisher) name
            // Argument order must follow the field order of Product: name, img, price, shopname
            Product product = new Product(name, img, price, shopname);
            productArrayList.add(product);
        }
        return productArrayList;
    }
}

Note:
Run it and check the result:
2.3 Writing the service layer interface and implementation
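// Interface
@Service
public interface ProductService {
    // Crawl the data and store it in ES
    public Boolean parseProductSafeEs(String keyword) throws IOException;
    // Paged search
    public List

2.4 Writing the Controller layer
Note: none of the methods here should close the RestHighLevelClient. Otherwise the other methods can no longer use it and an IO exception is thrown.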
@Controller
public class ProductController {

    @Autowired
    RestHighLevelClient restHighLevelClient;

    @Autowired
    ProductService productService;

    @RequestMapping("/createIndex")
    @ResponseBody
    public String creatIndex() throws IOException {
        CreateIndexRequest request = new CreateIndexRequest("jd_pro_index");
        CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
        if (response.isAcknowledged()) {
            return "创建成功!";
        } else {
            return "创建失败!";
        }
    }

    @RequestMapping("/deleteIndex")
    @ResponseBody
    public String deleteIndex() throws IOException {
        DeleteIndexRequest request = new DeleteIndexRequest("jd_pro_index");
        AcknowledgedResponse response = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
        if (response.isAcknowledged()) {
            return "删除成功!";
        } else {
            return "删除失败!";
        }
    }

    @RequestMapping("/toSafeEs/{keyword}")
    @ResponseBody
    public String parseProductSafeEs(@PathVariable("keyword") String keyword) throws IOException {
        if (productService.parseProductSafeEs(keyword)) {
            return "爬取数据成功存入es中!";
        }
        return "爬取数据失败";
    }

    @RequestMapping("/searchEsDoc/{keyword}/{pageNum}/{pageSize}")
    @ResponseBody
    public List<Map<String, Object>> searchProduct(
            @PathVariable("keyword") String keyword,
            @PathVariable("pageNum") int pageNum,
            @PathVariable("pageSize") int pageSize) throws IOException {
        // Returns null when the service returns null, as before
        return productService.searchProduct(keyword, pageNum, pageSize);
    }
}

2.5 Testing the endpoints
Create the index:
Crawl the data and store it in ES:
Query the data:
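The search endpoint takes `pageNum` and `pageSize`, while `SearchSourceBuilder.from(...)` expects the offset of the first hit rather than a page number. The service implementation is not shown here, so the following is only a sketch of one way to do the conversion, assuming `pageNum` is 1-based; `PageMath` is a hypothetical helper name:

```java
final class PageMath {

    // Converts a 1-based page number into the zero-based offset expected by from(...).
    static int offset(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(offset(1, 3)); // prints 0
        System.out.println(offset(2, 3)); // prints 3
    }
}
```

Under that assumption, a request such as `/searchEsDoc/Java/2/3` would map to `from(3)` and `size(3)`.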
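2.6 Separating front end and back end (a simple use of Vue)
- Download the vue dependency: used to render the front-end page
- Download the axios dependency: used for ajax requests to the back-end endpoints
Both vue and axios can be downloaded from their official sites. A small trick picked up from Kuangshen (狂神): create a local folder with an English name and open cmd in that folder (Node.js must already be installed):

# if the folder has not been initialized yet, initialize it first
npm init
# download vue
npm install vue
# download axios
npm install axios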
Modify the index.html home page (page title: 一宿君Java-ES仿京东实战):