需求:Hive分区根据时间,年先分区,然后分月,然后分日(三级分区)。需要针对性调整SQL,使得查询速度加快。
思路:
例如:2019-03-05 到 2021-01-10 1、按年分段 2019-03-05 2019-12-31 2020-01-01 2020-12-31(不用管)pyear=2020 2021-01-01 2021-01-10 2、处理每段 (1)按月分隔 (不是整月)2019-03-05 2019-03-31//按每个月31天,不考虑月份天数不同 pyear=2019 and pmonth=03 and pday>=05 and pday<=31 (整月)2019-04-01 2019-12-31 pyear=2019 and pmonth>=04 and pmonth<=12 (整年)2020-01-01 2021-12-31 pyear=2020 (不是整月)2021-01-01 2021-01-10 pyear=2021 and pmonth=01 and pday>=01 and pday<=10
示例输入:startime = 20210102,endtime =20211231
示例输出:
(pyear=2021 and ((pmonth=01 and pday>=02) or (pmonth>01 and pmonth<12) or (pmonth=12 and pday<=31)))
示例输入:startime = 20190102,endtime =20211231
示例输出:
(pyear=2019 and ((pmonth=01 and pday>=02) or (pmonth>01 and pmonth<12) or (pmonth=12 and pday<=31))) or (pyear=2020) or (pyear=2021)
使用:使用该工具类就可以在原SQL上精准定位分区,加快了查询速度。
package com.koma.gantryprofile.Utils; import java.util.ArrayList; import java.util.List; public class DateToSqlUtil { public static String getTimeSpanSqlStr(String begintime, String endtime) { //result_sql是最终结果添加分区的SQL子句 String result_sql = ""; ListtimeSqlList = new ArrayList<>(); //1、按年分段 List subtimeList = getTimeListByYear(begintime,endtime); //System.out.println("按年分段后的结果(subtimeList):"+subtimeList+"n"); //2、处理每段 for (int i = 0; i < subtimeList.size(); i++) { String timeStr = subtimeList.get(i); System.out.print("正在处理 "+timeStr+" 这一时间段"); timeSqlList.add(getTimeSql(timeStr)); //System.out.println("这一时间段拼接日期分区sql为:"+timeSqlList); } if(timeSqlList.size()>0) { result_sql = String.join(" or ", timeSqlList); } return result_sql; } public static List getTimeListByYear(String begintime, String endtime) { List subTimeList = new ArrayList<>(); String begin_year = begintime.substring(0,4); String end_year = endtime.substring(0,4); List yearList = new ArrayList<>(); int start = Integer.parseInt(begin_year); int end = Integer.parseInt(end_year); while(start<=end) { yearList.add(String.valueOf(start)); start++; } //判断输入是否同一年 if(yearList.size()==1) { subTimeList.add(begintime+"-"+endtime); } else { subTimeList.add(begintime+"-"+yearList.get(0)+"1231"); for (int i = 1; i < yearList.size()-1; i++) { subTimeList.add(yearList.get(i)+"0101-"+yearList.get(i)+"1231"); } subTimeList.add(yearList.get(yearList.size()-1)+"0101-"+endtime); } return subTimeList; } public static String getTimeSql(String timeStr) { StringBuffer sql = new StringBuffer(""); String[] timeArray = timeStr.split("-"); String start = timeArray[0]; String end = timeArray[1]; //判断分段的日期是否为整年 if(start.endsWith("0101") && end.endsWith("1231")) { sql.append("(pyear=").append(start.substring(0,4)).append(")"); } //非整年 else { sql.append("(pyear=").append(start.substring(0,4)) .append(" and (").append("(pmonth=").append(start.substring(4,6)) .append(" and ").append("pday>=").append(start.substring(6,8)); List monthList = getMonthList(start, end); //同一月 if(monthList.size()==1) { sql.append(" and ").append("pday<=").append(end.substring(6,8)).append(")"); } else { sql.append(")"); sql.append(" or (pmonth>").append(monthList.get(0)) .append(" and pmonth<").append(monthList.get(monthList.size()-1)).append(")"); sql.append(" or (pmonth=").append(monthList.get(monthList.size()-1)).append( " and ") .append("pday<=").append(end.substring(6,8)).append(")"); } sql.append("))"); } return sql.toString(); } public static List getMonthList(String start, String end) { List monthList = new ArrayList<>(); int startnum = Integer.valueOf(start.substring(4,6)); int endnum = Integer.valueOf(end.substring(4,6)); while(startnum<=endnum) { if(startnum<10) { monthList.add("0"+startnum); } else { monthList.add(String.valueOf(startnum)); } startnum++; } return monthList; } public static void main(String[] args) { String startTime="20210102"; String finalTime="20211231"; System.out.println("n"+getTimeSpanSqlStr(startTime,finalTime)); } }
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)