2021.11.11侧视图、行转列、列转行、排序

2021.11.11侧视图、行转列、列转行、排序,第1张

2021.11.11侧视图、行转列、列转行、排序
侧视图
将employee表中的work_place(array)列 炸裂开
0: jdbc:hive2://192.168.153.139:10000> select explode(work_place) from employee;
0: jdbc:hive2://192.168.153.139:10000> select name, wp from employee lateral view explode(work_place) work_place as wp;
将employee表中的skills_score(map)列 炸裂开
0: jdbc:hive2://192.168.153.139:10000> select explode(skills_score) from employee;
0: jdbc:hive2://192.168.153.139:10000> select name,skill,score from employee lateral view explode(skills_score) sk_sc as skill,score;

0: jdbc:hive2://192.168.153.139:10000> select name,wp,skill,score from employee 
0: jdbc:hive2://192.168.153.139:10000> lateral view explode(work_place) wo_pl as wp 
0: jdbc:hive2://192.168.153.139:10000> lateral view explode(skills_score) sk_sc as skill, score;

WordCount案例 explode始用
0: jdbc:hive2://192.168.153.139:10000> create table wordcount(line string);

[root@gree139 tmp]# vi ./wc.txt 

hello world
hello java
hello hadoop
hello hive
are you ok?
no i'm very bad

0: jdbc:hive2://192.168.153.139:10000> load data local inpath '/opt/tmp/wc.txt' overwrite into table wordcount;

0: jdbc:hive2://192.168.153.139:10000> select * from wordcount;

0: jdbc:hive2://192.168.153.139:10000> select split(line,"\s") from wordcount;

0: jdbc:hive2://192.168.153.139:10000> select explode(split(line,"\s")) from wordcount;
+---------+--+
|   col   |
+---------+--+
| hello   |
| world   |
| hello   |
| java    |


0: jdbc:hive2://192.168.153.139:10000> select w.word,count(1) num from 
. . . . . . . . . . . . . . . . . . .> (select explode(split(line,"\s")) word(列名) from wordcount) w 
. . . . . . . . . . . . . . . . . . .> group by w.word order by num desc;

0: jdbc:hive2://192.168.153.139:10000> with 
. . . . . . . . . . . . . . . . . . .> t1 as (select explode(split(line,"\s")) word from wordcount)
. . . . . . . . . . . . . . . . . . .> select t1.word,count(1) num from t1 group by word order by num desc;


CASE WHEN THEN ELSE END
将年龄分段   int 转成 字符串
0: jdbc:hive2://192.168.153.139:10000> select name,age,case when age>=20 then 'old' when age<=18 then 'young' else 'middle' end as tag from studentp;
+------------+------+---------+--+
|    name    | age  |   tag   |
+------------+------+---------+--+
| zhansan1   | 20   | old     |
| zhansan2   | 20   | old     |
| zhansan3   | 18   | young   |
| zhansan4   | 18   | young   |
| zhansan5   | 30   | old     |
| zhansan6   | 19   | middle  |
| zhansan7   | 19   | middle  |
| zhansan8   | 19   | middle  |
| zhansan9   | 12   | young   |
| zhansan10  | 12   | young   |
| zhansan11  | 12   | young   |
+------------+------+---------+--+

将性别转数字 boy->1  girl->0
0: jdbc:hive2://192.168.153.139:10000> select name,gender, case when gender='boy' then 1 when gender='girl' then 0 else -1 end as gendertag from studentp;
+------------+---------+------------+--+
|    name    | gender  | gendertag  |
+------------+---------+------------+--+
| zhansan1   | boy     | 1          |
| zhansan2   | girl    | 0          |
| zhansan3   | boy     | 1          |
| zhansan4   | girl    | 0          |
| zhansan5   | boy     | 1          |
| zhansan6   | girl    | 0          |
| zhansan7   | girl    | 0          |
| zhansan8   | boy     | 1          |
| zhansan9   | girl    | 0          |
| zhansan10  | girl    | 0          |
| zhansan11  | girl    | 0          |
+------------+---------+------------+--+


select name,gender, 
case when gender='boy' then 1 else 0 end as boytag, 
case when gender='girl' then 1 else 0 end as girltag,
case when gender is null then 1 else 0 end as taijiantag
from studentp;

0: jdbc:hive2://192.168.153.139:10000> create temporary table tmp_gender_tag as select name,gender, 
. . . . . . . . . . . . . . . . . . .> case when gender='boy' then 1 else 0 end as boytag, 
. . . . . . . . . . . . . . . . . . .> case when gender='girl' then 1 else 0 end as girltag,
. . . . . . . . . . . . . . . . . . .> case when gender is null then 1 else 0 end as taijiantag
. . . . . . . . . . . . . . . . . . .> from studentp;
+----------------------+------------------------+------------------------+-------------------------+----------------------------+--+
| tmp_gender_tag.name  | tmp_gender_tag.gender  | tmp_gender_tag.boytag  | tmp_gender_tag.girltag  | tmp_gender_tag.taijiantag  |
+----------------------+------------------------+------------------------+-------------------------+----------------------------+--+
| zhansan1             | boy                    | 1                      | 0                       | 0                          |
| zhansan2             | girl                   | 0                      | 1                       | 0                          |
| zhansan3             | boy                    | 1                      | 0                       | 0                          |
| zhansan4             | girl                   | 0                      | 1                       | 0                          |
| zhansan5             | boy                    | 1                      | 0                       | 0                          |
| zhansan6             | girl                   | 0                      | 1                       | 0                          |
| zhansan7             | girl                   | 0                      | 1                       | 0                          |
| zhansan8             | boy                    | 1                      | 0                       | 0                          |
| zhansan9             | girl                   | 0                      | 1                       | 0                          |
| zhansan10            | girl                   | 0                      | 1                       | 0                          |
| zhansan11            | girl                   | 0                      | 1                       | 0                          |
+----------------------+------------------------+------------------------+-------------------------+----------------------------+--+

0: jdbc:hive2://192.168.153.139:10000> select sum(boytag) as boysum,sum(girltag) as girlnum,sum(taijiantag) as tjnum  from tmp_gender_tag;
+---------+----------+--------+--+
| boysum  | girlnum  | tjnum  |
+---------+----------+--------+--+
| 4       | 7        | 0      |
+---------+----------+--------+--+

行转列  concat_ws  collect_set collect_list
select gender,concat_ws(",",collect_set(name)) as stu_col from studentp group by gender;  去重
select gender,concat_ws(",",collect_list(name)) as stu_col from studentp group by gender; 不去重


导出表数据到指定文件夹,必须要求导出文件夹为空文件夹,否则会报错
0: jdbc:hive2://192.168.153.139:10000> export table studentp to '/tmp/out1';

导出表数据到本地文件夹,要确保目标文件夹存在,否导出失败
[root@gree139 tmp]# hive -e "select * from greedemo.studentp1" >> /opt/tmp/out/student.txt


order by    全局排序
sort by     分区内部有序,整体无序
distribute by  mapTask分区  一般与sort by合作使用
cluster by   如果分区和排序字段相同并且升序,等同与distrubute by age sort by age asc
0: jdbc:hive2://192.168.153.139:10000> select * from studentp order by age;

0: jdbc:hive2://192.168.153.139:10000> select * from studentp sort by  age;


0: jdbc:hive2://192.168.153.139:10000> create table dissorstudenttp as select * from studentp distribute by id sort by age ;
0: jdbc:hive2://192.168.153.139:10000> create table dissorstudenttp2 as select * from studentp distribute by age sort by age ;
0: jdbc:hive2://192.168.153.139:10000> insert overwrite table dissorstudenttp2 select * from studentp distribute by age sort by age desc ;

0: jdbc:hive2://192.168.153.139:10000> create table dissorstudenttp3 as select * from studentp cluster by age ;

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/5435927.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-12-11
下一篇 2022-12-11

发表评论

登录后才能评论

评论列表(0条)

保存