- SUN(A)+SUM(B)!=SUM(A+B)的情况
- 1. 验证
- 1.1. 对单独的一列进行SUM
- 1.2. SUM(A)+SUM(B)
- 1.3. SUM(A+B)
- 2. 解决
结论: 在hive、presto、spark中,当字段A或字段B列中存在null值时,SUN(A)+SUM(B)!=SUM(A+B)
1. 验证tips
1.sum(col) 当col的值为null时,结果为null
2.null+数值=null
示例数据如下
SELECt customer_no ,SUM(stock_fee) FROM table1 GROUP BY customer_no; +--------------+--------------------+ | customer_no | _c1 | +--------------+--------------------+ | C0001 | 0.1 | | C0002 | NULL | | C0003 | 6083.259999999999 | +--------------+--------------------+
- 当列值中存在NULL时(C0001),sum结果不会报错,且计算结果正确
- 当列值中全部为NULL时(C0002),sum结果为null
- 当列值中不存在NULL值时(C0003),对数值进行sum,无异议
SELECt customer_no ,SUM(stock_fee)+SUM(advertisement_fee) FROM table1 GROUP BY customer_no; +--------------+-----------+ | customer_no | _c1 | +--------------+-----------+ | C0001 | 9042.78 | | C0002 | NULL | | C0003 | 15395.84 | +--------------+-----------+
- 对于c0001,SUM(stock_fee)=0.1,SUM(advertisement_fee)=9042.68,因此SUM(stock_fee)+SUM(advertisement_fee)=9042.78
- 对于c0002,SUM(stock_fee)=null,SUM(advertisement_fee)=9312.58,因此SUM(stock_fee)+SUM(advertisement_fee)=null
- 对于c0003,SUM(stock_fee)=6083.259999999999,SUM(advertisement_fee)=9312.58,因此SUM(stock_fee)+SUM(advertisement_fee)=15395.84
SELECt customer_no ,SUM(stock_fee + advertisement_fee) FROM table1 GROUP BY customer_no; +--------------+---------------------+ | customer_no | _c1 | +--------------+---------------------+ | C0001 | 1084.82 | | C0002 | NULL | | C0003 | 15395.839999999998 | +--------------+---------------------+
- 对于c0001,只有0.1 + 1084.72 不为null,因此SUM(stock_fee + advertisement_fee)结果为1084.82
- 对于c0002,只有stock_fee + advertisement_fee都为null,因此SUM(stock_fee + advertisement_fee)结果为null
- 对于c0003,只有stock_fee + advertisement_fee都不为为null,因此SUM(stock_fee + advertisement_fee)结果为15395.839999999998
想要sum时得到正确结果,在sum时对每个字段值都判断是否为null,当为null时置为0
以下二者结果相同
SELECt customer_no ,SUM(if(stock_fee is null,0,stock_fee) + if(advertisement_fee is null,0,advertisement_fee)) FROM table1 GROUP BY customer_no; SELECt customer_no ,SUM(if(stock_fee is null,0,stock_fee)) + SUM(if(advertisement_fee is null,0,advertisement_fee)) FROM table1 GROUP BY customer_no; +--------------+-----------+ | customer_no | _c1 | +--------------+-----------+ | C0001 | 9042.78 | | C0002 | 9312.58 | | C0003 | 15395.84 | +--------------+-----------+
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)