求每个品牌方活动实际持续时间,可能出现的情况有:
1)一个品牌方有多个重叠活动
2)一个品牌方有多个不重叠的活动,活动之间的断层不纳入总时间
数据格式:tm_name sdt edt xiaomi 2022-01-06 2022-01-09 xiaomi 2022-01-11 2022-01-15 xiaomi 2022-01-17 2022-01-20 huawei 2022-01-06 2022-01-09 huawei 2022-01-08 2022-01-15 meizu 2022-01-06 2022-01-20 meizu 2022-01-09 2022-01-15 meizu 2022-01-12 2022-01-25 TCL 2022-01-06 2022-01-20 TCL 2022-01-09 2022-01-12 TCL 2022-01-15 2022-01-25建表语句 :
-- 建表 DROp TABLE IF EXISTS `trademark`; CREATE TABLE `trademark` ( `tm_name` varchar(255) CHARACTER SET gbk COLLATE gbk_chinese_ci NULL DEFAULT NULL, `sdt` date NULL DEFAULT NULL, `edt` date NULL DEFAULT NULL ) ENGINE = InnoDB CHARACTER SET = gbk COLLATE = gbk_chinese_ci ROW_FORMAT = Dynamic; -- 数据装填 INSERT INTO `trademark` VALUES ('xiaomi', '2022-01-06', '2022-01-09'); INSERT INTO `trademark` VALUES ('xiaomi', '2022-01-11', '2022-01-15'); INSERT INTO `trademark` VALUES ('xiaomi', '2022-01-17', '2022-01-20'); INSERT INTO `trademark` VALUES ('huawei', '2022-01-06', '2022-01-09'); INSERT INTO `trademark` VALUES ('huawei', '2022-01-08', '2022-01-15'); INSERT INTO `trademark` VALUES ('meizu', '2022-01-06', '2022-01-20'); INSERT INTO `trademark` VALUES ('meizu', '2022-01-09', '2022-01-15'); INSERT INTO `trademark` VALUES ('meizu', '2022-01-12', '2022-01-25'); INSERT INTO `trademark` VALUES ('TCL', '2022-01-06', '2022-01-20'); INSERT INTO `trademark` VALUES ('TCL', '2022-01-09', '2022-01-12'); INSERT INTO `trademark` VALUES ('TCL', '2022-01-15', '2022-01-25');具体步骤:
1、在写有时间重叠情况时的sql时,基本都是采用将开始时间和结束时间打散union在一起,如同求最大在线人数的思路一样,此处flag标记在下面的 *** 作会体现它的作用,注意union all和union的区别:
select tm_name,sdt dt,'1' flag from trademark union all select tm_name,edt dt,'-1' flag from trademark
效果:
2、将union在一起的数据进行开窗处理,条件就是按品牌分组,累积flag求日期是否重叠,若是重叠的日期sum(1)会一直变大,而不重叠的话相当于开始之后紧跟着结束sum(1)与sum(-1)相当于0,用此标志来判断是否断层,而用lead是来获取下一个时间,用来标记断层的时间范围:
with t as( select tm_name,sdt dt,'1' flag from trademark union all select tm_name,edt dt,'-1' flag from trademark) select tm_name,dt, sum(flag) over(partition by tm_name order by dt) f1, lead(dt,1,dt) over(partition by tm_name order by dt) last_dt from t
效果:
如上图所示,xiaomi出现了2个断层,分别是【01-09,01-11】和【01-15,01-17】。而标记这一行的f1列数据正好都是0。
3、用每个品牌最晚结束的时间 - 最开始的时间 - 断层的时间 = 实际活动时间
with t as( select tm_name,sdt dt,'1' flag from trademark union all select tm_name,edt dt,'-1' flag from trademark) select tm_name, datediff(max(dt),min(dt))-sum(if(f1=0,datediff(last_dt,dt)-1,0)) real_time from ( select tm_name,dt, sum(flag) over(partition by tm_name order by dt) f1, lead(dt,1,dt) over(partition by tm_name order by dt) last_dt from t ) t1 group by tm_name扩展法二:
with t as( select tm_name, #若开始时间小于当前最大的结束时间就用结束时间+1作为开始时间,避免时间叠加 if(max_edt is not null and sdt<=max_edt,date_add(max_edt,interval 1 day),sdt) sdt, edt from ( select *, #取出当前数据前面最大的结束时间 max(edt) over(partition by tm_name order by sdt rows between unbounded preceding and 1 preceding) max_edt from trademark) t1) select tm_name, #因为上面改了sdt,所以会导致某些重叠数据sdt>edt,这里要过滤掉 sum(if(datediff(edt,sdt)<0,0,datediff(edt,sdt)+1)) real_time from t group by tm_name结果:
mysql
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)