Hive--窗口函数

Hive--窗口函数,第1张

Hive--窗口函数

 首先创建数据库表数据:

mysql> CREATE TABLE sales(
    ->       sales_employee VARCHAr(50) NOT NULL,
    ->       fiscal_year INT NOT NULL,
    ->       sale DECIMAL(14,2) NOT NULL,
    ->       PRIMARY KEY(sales_employee,fiscal_year)
    ->   );

mysql> INSERT INTO sales(sales_employee,fiscal_year,sale) VALUES('Bob',2016,100),
    ->       ('Bob',2017,150),
    ->       ('Bob',2018,200),
    ->       ('Alice',2016,150),
    ->       ('Alice',2017,100),
    ->       ('Alice',2018,200),
    ->       ('John',2016,200),
    ->       ('John',2017,150),
    ->       ('John',2018,250);

mysql> select * from sales;
+----------------+-------------+--------+
| sales_employee | fiscal_year | sale   |
+----------------+-------------+--------+
| Alice          |        2016 | 150.00 |
| Alice          |        2017 | 100.00 |
| Alice          |        2018 | 200.00 |
| Bob            |        2016 | 100.00 |
| Bob            |        2017 | 150.00 |
| Bob            |        2018 | 200.00 |
| John           |        2016 | 200.00 |
| John           |        2017 | 150.00 |
| John           |        2018 | 250.00 |
+----------------+-------------+--------+

window子句: 
- PRECEDING:往前 
- FOLLOWING:往后 
- CURRENT ROW:当前行 
- UNBOUNDED:起点

UNBOUNDED PRECEDING 表示从前面的起点

UNBOUNDED FOLLOWING:表示到后面的终点

mysql> select fiscal_year, sales_employee, sale,
    -> sum(sale) over(partition by fiscal_year rows between 1 preceding and current row)
    -> as per_total_sales, sum(sale) over(partition by fiscal_year) as total_sales
    -> from sales ;
+-------------+----------------+--------+-----------------+-------------+
| fiscal_year | sales_employee | sale   | pre_total_sales | total_sales |
+-------------+----------------+--------+-----------------+-------------+
|        2016 | Alice          | 150.00 |          150.00 |      450.00 |
|        2016 | Bob            | 100.00 |          250.00 |      450.00 |
|        2016 | John           | 200.00 |          300.00 |      450.00 |
|        2017 | Alice          | 100.00 |          100.00 |      400.00 |
|        2017 | Bob            | 150.00 |          250.00 |      400.00 |
|        2017 | John           | 150.00 |          300.00 |      400.00 |
|        2018 | Alice          | 200.00 |          200.00 |      650.00 |
|        2018 | Bob            | 200.00 |          400.00 |      650.00 |
|        2018 | John           | 250.00 |          450.00 |      650.00 |
+-------------+----------------+--------+-----------------+-------------+


mysql> select fiscal_year, sales_employee, sale,
    -> sum(sale) over(partition by fiscal_year rows between current row and unbounded following)
    -> as cur_un_fol, sum(sale) over(partition by fiscal_year) as total_sales
    -> from sales ;
+-------------+----------------+--------+------------+-------------+
| fiscal_year | sales_employee | sale   | cur_un_fol | total_sales |
+-------------+----------------+--------+------------+-------------+
|        2016 | Alice          | 150.00 |     450.00 |      450.00 |
|        2016 | Bob            | 100.00 |     300.00 |      450.00 |
|        2016 | John           | 200.00 |     200.00 |      450.00 |
|        2017 | Alice          | 100.00 |     400.00 |      400.00 |
|        2017 | Bob            | 150.00 |     300.00 |      400.00 |
|        2017 | John           | 150.00 |     150.00 |      400.00 |
|        2018 | Alice          | 200.00 |     650.00 |      650.00 |
|        2018 | Bob            | 200.00 |     450.00 |      650.00 |
|        2018 | John           | 250.00 |     250.00 |      650.00 |
+-------------+----------------+--------+------------+-------------+

row_number() 、rank()、dense_rank()       排序

mysql> select fiscal_year,sales_employee,sale,row_number() over(partition by fiscal_year) from sales;
+-------------+----------------+--------+---------------------------------------------+
| fiscal_year | sales_employee | sale   | row_number() over(partition by fiscal_year) |
+-------------+----------------+--------+---------------------------------------------+
|        2016 | Alice          | 150.00 |                                           1 |
|        2016 | Bob            | 100.00 |                                           2 |
|        2016 | John           | 200.00 |                                           3 |
|        2017 | Alice          | 100.00 |                                           1 |
|        2017 | Bob            | 150.00 |                                           2 |
|        2017 | John           | 150.00 |                                           3 |
|        2018 | Alice          | 200.00 |                                           1 |
|        2018 | Bob            | 200.00 |                                           2 |
|        2018 | John           | 250.00 |                                           3 |
+-------------+----------------+--------+---------------------------------------------+


mysql> select fiscal_year,sales_employee,sale,rank() over(partition by fiscal_year order by sale) from sales;
+-------------+----------------+--------+-----------------------------------------------------+
| fiscal_year | sales_employee | sale   | rank() over(partition by fiscal_year order by sale) |
+-------------+----------------+--------+-----------------------------------------------------+
|        2016 | Bob            | 100.00 |                                                   1 |
|        2016 | Alice          | 150.00 |                                                   2 |
|        2016 | John           | 200.00 |                                                   3 |
|        2017 | Alice          | 100.00 |                                                   1 |
|        2017 | Bob            | 150.00 |                                                   2 |
|        2017 | John           | 150.00 |                                                   2 |
|        2018 | Alice          | 200.00 |                                                   1 |
|        2018 | Bob            | 200.00 |                                                   1 |
|        2018 | John           | 250.00 |                                                   3 |
+-------------+----------------+--------+-----------------------------------------------------+


mysql> select fiscal_year,sales_employee,sale,dense_rank() over(partition by fiscal_year order by sale) from sales;
+-------------+----------------+--------+-----------------------------------------------------------+
| fiscal_year | sales_employee | sale   | dense_rank() over(partition by fiscal_year order by sale) |
+-------------+----------------+--------+-----------------------------------------------------------+
|        2016 | Bob            | 100.00 |                                                         1 |
|        2016 | Alice          | 150.00 |                                                         2 |
|        2016 | John           | 200.00 |                                                         3 |
|        2017 | Alice          | 100.00 |                                                         1 |
|        2017 | Bob            | 150.00 |                                                         2 |
|        2017 | John           | 150.00 |                                                         2 |
|        2018 | Alice          | 200.00 |                                                         1 |
|        2018 | Bob            | 200.00 |                                                         1 |
|        2018 | John           | 250.00 |                                                         2 |
+-------------+----------------+--------+-----------------------------------------------------------+



ntile(3) 分组: 

mysql> select sales.*, ntile(3) over(order by fiscal_year) GroupId from sales;
+----------------+-------------+--------+---------+
| sales_employee | fiscal_year | sale   | GroupId |
+----------------+-------------+--------+---------+
| Alice          |        2016 | 150.00 |       1 |
| Bob            |        2016 | 100.00 |       1 |
| John           |        2016 | 200.00 |       1 |
| Alice          |        2017 | 100.00 |       2 |
| Bob            |        2017 | 150.00 |       2 |
| John           |        2017 | 150.00 |       2 |
| Alice          |        2018 | 200.00 |       3 |
| Bob            |        2018 | 200.00 |       3 |
| John           |        2018 | 250.00 |       3 |
+----------------+-------------+--------+---------+

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/5619127.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-12-15
下一篇 2022-12-16

发表评论

登录后才能评论

评论列表(0条)

保存