es大量数据导入效率优化

es大量数据导入效率优化,第1张

es大量数据导入效率优化

项目需求中,简历信息涉及10张表,需要联查获取组装数据,实测,查询一条数据需要5s,大概算了一下总时间的话需要60个小时左右导入完成。
优化方法:数据分段+多线程
1、10万条数据为例,数据分成10份,每一份10000条;
2、每一份数据起一个线程,10个线程,
代码大致如下:

public Result putAllJobUserInfo() throws ExecutionException, InterruptedException {
        long l = System.currentTimeMillis();
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(1, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(2, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(3, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(4, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(5, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(6, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(7, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(8, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(9, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(10, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(11, 10000));
        threadExecutorService.runSubmit(() -> this.putEsUserInfo(12, 10000));
        long l1 = System.currentTimeMillis();
        System.out.println(l1 - l);
        return Result.ok();
    }


    private Result putEsUserInfo(Integer page, Integer limit) {
        int count = 0;
        PageHelper.startPage(page, limit);
        List ids = jobUserMapper.getUserIdList();
        for (Long id : ids) {
            saveEsJobUserInfo(id);
            count++;
            logger.info("线程#" + page + ":已导入:[" + count + "]条数据;用户id为{" + id + "}");
        }
        logger.info("{全部完成 ==>:}线程# " + page + " #全部完成,总数为" + count);
        return Result.ok("{全部完成 ==>:}线程# " + page + " #全部完成,总数为" + count);
    }

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/4686140.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-11-07
下一篇 2022-11-07

发表评论

登录后才能评论

评论列表(0条)

保存