【踩坑】SUM(A)+SUM(B)和SUM(A+B)不相等的情况

【踩坑】SUM(A)+SUM(B)和SUM(A+B)不相等的情况,第1张

【踩坑】SUM(A)+SUM(B)和SUM(A+B)不相等的情况

文章目录
  • SUN(A)+SUM(B)!=SUM(A+B)的情况
    • 1. 验证
      • 1.1. 对单独的一列进行SUM
      • 1.2. SUM(A)+SUM(B)
      • 1.3. SUM(A+B)
    • 2. 解决

SUN(A)+SUM(B)!=SUM(A+B)的情况

结论: 在hive、presto、spark中,当字段A或字段B列中存在null值时,SUN(A)+SUM(B)!=SUM(A+B)

tips
1.sum(col) 当col的值为null时,结果为null
2.null+数值=null

1. 验证

示例数据如下

customer_nostock_feeadvertisement_feeC0001NULL1130.52C00010.11084.72C0001NULL1738.32C0001NULL1817.67C0001NULL3271.45C0002NULL2587.76C0002NULL1397.60C0002NULL2871.75C0002NULL812.08C0002NULL1643.39C00032635.232587.76C00031253.691397.60C0003253.692871.75C00031528.12812.08C0003412.531643.39 1.1. 对单独的一列进行SUM
SELECt  customer_no
       ,SUM(stock_fee)
FROM table1
GROUP BY  customer_no;

+--------------+--------------------+
| customer_no  |        _c1         |
+--------------+--------------------+
| C0001        | 0.1                |
| C0002        | NULL               |
| C0003        | 6083.259999999999  |
+--------------+--------------------+
  • 当列值中存在NULL时(C0001),sum结果不会报错,且计算结果正确
  • 当列值中全部为NULL时(C0002),sum结果为null
  • 当列值中不存在NULL值时(C0003),对数值进行sum,无异议
1.2. SUM(A)+SUM(B)
SELECt  customer_no
       ,SUM(stock_fee)+SUM(advertisement_fee)
FROM table1
GROUP BY  customer_no;

+--------------+-----------+
| customer_no  |    _c1    |
+--------------+-----------+
| C0001        | 9042.78   |
| C0002        | NULL      |
| C0003        | 15395.84  |
+--------------+-----------+
  • 对于c0001,SUM(stock_fee)=0.1,SUM(advertisement_fee)=9042.68,因此SUM(stock_fee)+SUM(advertisement_fee)=9042.78
  • 对于c0002,SUM(stock_fee)=null,SUM(advertisement_fee)=9312.58,因此SUM(stock_fee)+SUM(advertisement_fee)=null
  • 对于c0003,SUM(stock_fee)=6083.259999999999,SUM(advertisement_fee)=9312.58,因此SUM(stock_fee)+SUM(advertisement_fee)=15395.84
1.3. SUM(A+B)
SELECt  customer_no
       ,SUM(stock_fee + advertisement_fee)
FROM table1
GROUP BY  customer_no;

+--------------+---------------------+
| customer_no  |         _c1         |
+--------------+---------------------+
| C0001        | 1084.82             |
| C0002        | NULL                |
| C0003        | 15395.839999999998  |
+--------------+---------------------+
  • 对于c0001,只有0.1 + 1084.72 不为null,因此SUM(stock_fee + advertisement_fee)结果为1084.82
  • 对于c0002,只有stock_fee + advertisement_fee都为null,因此SUM(stock_fee + advertisement_fee)结果为null
  • 对于c0003,只有stock_fee + advertisement_fee都不为为null,因此SUM(stock_fee + advertisement_fee)结果为15395.839999999998
2. 解决

想要sum时得到正确结果,在sum时对每个字段值都判断是否为null,当为null时置为0

以下二者结果相同

SELECt  customer_no
       ,SUM(if(stock_fee is null,0,stock_fee) + if(advertisement_fee is null,0,advertisement_fee))
FROM table1
GROUP BY  customer_no;

SELECt  customer_no
       ,SUM(if(stock_fee is null,0,stock_fee)) + SUM(if(advertisement_fee is null,0,advertisement_fee))
FROM table1
GROUP BY  customer_no;

+--------------+-----------+
| customer_no  |    _c1    |
+--------------+-----------+
| C0001        | 9042.78   |
| C0002        | 9312.58   |
| C0003        | 15395.84  |
+--------------+-----------+

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/5665068.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-12-16
下一篇 2022-12-16

发表评论

登录后才能评论

评论列表(0条)

保存