业务场景的问题,我们有一个刷CUBE的sql,是Oracle环境,平时跑70多分钟,
但是最近突然不动了,这个sql需要算累计值,比如年累计客户数量。
累计值是什么意思呢?我们使用下面的数据来说明问题。
select ‘201901‘ as c_month,100 as c_customers from dual union all select ‘201902‘ as c_month,102 as c_customers from dual union all select ‘201903‘ as c_month,120 as c_customers from dual union all select ‘201904‘ as c_month,111 as c_customers from dual union all select ‘201905‘ as c_month,155 as c_customers from dual union all select ‘201906‘ as c_month,199 as c_customers from dual;C_MONT C_CUSTOMERS------ -----------201901 100201902 102201903 120201904 111201905 155201906 199
2019年1月,客户数量是100, 2019年2月,客户数量是102 ,
那么2019年1月的客户累计值是100,
2019年2月的客户年累计值是202(2019年1月的客户数量 + 2019年2月的客户数量),
2019年3月的客户年累计值是322(2019年1月的客户数量 + 2019年2月的客户数量+ 2019年3月的客户数量),
...
我使用如下的测试sql来说明这个场景
create or replace vIEw tab_test1 as select ‘201901‘ as c_month,199 as c_customers from dual union all select ‘201907‘ as c_month,108 as c_customers from dual;create vIEw tab_test2 as select ‘20190131‘ as monthlastday from dual union allselect ‘20190228‘ as monthlastday from dual union allselect ‘20190331‘ as monthlastday from dual union allselect ‘20190430‘ as monthlastday from dual union allselect ‘20190531‘ as monthlastday from dual union allselect ‘20190630‘ as monthlastday from dual union allselect ‘20190731‘ as monthlastday from dual;select * from tab_test1 a join tab_test2 b on to_date(c_month,‘yyyymm‘) <= to_date(b.monthlastday,‘yyyymmdd‘) and to_date(c_month,‘yyyymm‘) >= trunc(to_date(b.monthlastday,‘yyyymmdd‘),‘yyyy‘) order by b.monthlastday,a.c_month;C_MONT C_CUSTOMERS MONTHLAS------ ----------- --------201901 100 20190131201901 100 20190228201902 102 20190228201901 100 20190331201902 102 20190331201903 120 20190331201901 100 20190430201902 102 20190430201903 120 20190430201904 111 20190430201901 100 20190531C_MONT C_CUSTOMERS MONTHLAS------ ----------- --------201902 102 20190531201903 120 20190531201904 111 20190531201905 155 20190531201901 100 20190630201902 102 20190630201903 120 20190630201904 111 20190630201905 155 20190630201906 199 20190630201901 100 20190731C_MONT C_CUSTOMERS MONTHLAS------ ----------- --------201902 102 20190731201903 120 20190731201904 111 20190731201905 155 20190731201906 199 20190731201907 108 20190731已选择 28 行。
从上面sql返回的数据能看出来,monthlastday 字段分组,汇总c_customers,就能很轻松算出年累计值。但是如果a表数据太大,无法走hash 关联。
所以需要通过某种方法改成等值关联。
1. 先创建一个时间维表,可以通过树形查询生成一个时间维度表,由于我的测试数据自小粒度是到月的,所以我的日期维度表也是到月的。
create or replace vIEw tab_test3 as select extract(year from c_date) as c_year,extract(month from c_date) as c_month,to_char(c_date,‘yyyymm‘) as c_month2 from (select add_months(date‘2019-01-01‘,level -1 ) as c_date from dual connect by level <= 8);
2. 通过时间维度表自关联出累计月份对应的日期。
下面sql,同构过滤t1 表的c_month2 字段,就可以拿到任意月份的累计月份了。比如2019-07月的累计月份是2019年 1-7月份
select t1.c_month2 as groupcolumn,t2.c_month2 joincolumn from tab_test3 t1 join tab_test3 t2 on t1.c_year = t2.c_year and t1.c_month2 >= t2.c_month2 order by 1,2GROUPC JOINCO------ ------201901 201901201902 201901201902 201902201903 201901201903 201902201903 201903201904 201901201904 201902201904 201903201904 201904201905 201901GROUPC JOINCO------ ------201905 201902201905 201903201905 201904201905 201905201906 201901201906 201902201906 201903201906 201904201906 201905201906 201906201907 201901GROUPC JOINCO------ ------201907 201902201907 201903201907 201904201907 201905201907 201906201907 201907201908 201901201908 201902201908 201903201908 201904201908 201905GROUPC JOINCO------ ------201908 201906201908 201907201908 201908已选择 36 行。
3. 修改原来sql中关于日期的不等值关联,可以起到走hash的作用。
总结以上是内存溢出为你收集整理的关于Oracle中Sort Merge Join的改写全部内容,希望文章能够帮你解决关于Oracle中Sort Merge Join的改写所遇到的程序开发问题。
如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)