Hive: HQL Basic Syntax (with Examples)

Preface

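HQL syntax is very close to SQL, so newcomers who know a little SQL will pick it up easily. If you don't know SQL, that's fine too: at least you won't confuse the places where SQL and HQL differ, haha.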

The previous article, on creating Hive metadata, already covered Hive's basic types. In short:
Primitive types: tinyint, smallint, int, bigint, boolean, float, double, string, timestamp, binary;
Collection types: struct, array, map.
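As a rough illustration of how these types show up in a table definition, here is a minimal sketch. The table employee_demo and all of its columns are hypothetical and exist only for this example; the delimiters are likewise an assumption.

```sql
-- Hypothetical table, for illustration only.
create table if not exists employee_demo (
  name     string,
  age      tinyint,
  salary   double,
  hired    timestamp,
  skills   array<string>,                      -- collection: array
  scores   map<string, int>,                   -- collection: map
  address  struct<city:string, zip:string>     -- collection: struct
)
row format delimited
  fields terminated by '\t'
  collection items terminated by ','
  map keys terminated by ':';

-- Reading individual elements of each collection type.
select skills[0], scores['math'], address.city
from employee_demo;
```

Arrays are indexed from 0, map values are looked up by key, and struct fields are reached with dot notation.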

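The foundation of learning any data structure is being fluent in its create, read, update, and delete (CRUD) operations, and HQL is no exception.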

Contents

Preface
1. CRUD for databases
- “Create”
- “Query”
- “Update”
- “Delete”
2. CRUD for tables
- “Create”
- “Query”
- “Update”
- “Delete”

1. CRUD for databases

“Create”

Create a database (clauses in [ ] are optional):

CREATE DATABASE [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];

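eg: create a database named student

create database if not exists student location '/hive/database/student.db';
-- It is best to keep the database name (student) consistent with the last component of the path (/student.db), which makes management easier.
(They do not have to match. The ".db" suffix marks the directory as a database; it is usually added but is not required.)

If location is omitted, the student database is created under the default directory /user/hive/warehouse/ (as student.db).
Once created, the student database directory is visible on HDFS.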
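The optional clauses from the syntax above can be combined in one statement. The following is a minimal sketch; the database name student_demo, its path, and the property values are made up for illustration.

```sql
-- Hypothetical database using every optional clause of CREATE DATABASE.
create database if not exists student_demo
comment 'demo database for HQL practice'
location '/hive/database/student_demo.db'
with dbproperties ('creator'='demo_user', 'createtime'='20150830');
```

The clauses must appear in this order (comment, location, dbproperties); any of them can be left out.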

“Query”

List databases: show databases;

Show details of a database: desc database student;

Switch the current database: use student;
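A short session tying these commands together might look like the sketch below. It assumes the student database from the earlier example exists; the pattern 'stu*' is just an illustration.

```sql
show databases;                 -- list every database
show databases like 'stu*';     -- filter by a wildcard pattern
desc database student;          -- name, comment, location, owner
use student;                    -- switch to it
select current_database();      -- confirm which database is active
```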

“Update”

(This command is rarely used; it is enough to know it exists.)
Alter a database: alter database student set dbproperties('createtime'='20150830');
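To check that the property was actually stored, the extended form of desc can be used. A small sketch, assuming the student database exists:

```sql
alter database student set dbproperties ('createtime'='20150830');

-- The "extended" keyword adds the dbproperties to the output.
desc database extended student;
```

Note that this statement only changes key-value properties; it does not rename the database.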

“Delete”

Drop an empty database: drop database if exists student;
Drop a non-empty database: drop database student cascade;
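The difference between the two forms comes down to the default RESTRICT behaviour. The sketch below assumes a student database that still contains tables:

```sql
-- Default behaviour (RESTRICT): refuses to drop a database that still has tables.
drop database if exists student restrict;

-- CASCADE drops the contained tables first, then the database itself.
drop database if exists student cascade;
```

Since cascade removes every table in the database, it is worth running show tables in that database first.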

2. CRUD for tables

The article on creating Hive metadata described how to create a table (metadata) and associate it with its data.

“Create”

Create a table:

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...)
[SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)]
[AS select_statement]

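Notes:
create table: like the database statement, creates a table with the given name; if not exists should normally be included.
external: the keyword for an external table; without it, a managed table (i.e. an internal table) is created. For an external table you specify, at creation time, a path (location) that points to the actual data. When a table is dropped: for a managed table, the metadata and the underlying data are deleted together; for an external table, only the metadata in Hive is deleted, while the actual data it points to is kept and can still be seen on HDFS.
[(col_name data_type [COMMENT col_comment], …)]: declares the table's columns and the primitive (or collection) type of each.
comment: adds a comment to the table or to a column.
row format delimited [fields terminated by]: explained in the previous article on creating Hive metadata; it specifies the character on which each row is split into fields.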
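To show several of these clauses working together, here is a minimal sketch. The table dept_partitioned and the partition column dt are hypothetical names chosen for this example.

```sql
-- Hypothetical managed table with column comments, a table comment,
-- one partition column, a tab delimiter, and an explicit file format.
create table if not exists dept_partitioned (
  deptno int    comment 'department number',
  dname  string comment 'department name',
  loc    int    comment 'location code'
)
comment 'departments, partitioned by load date'
partitioned by (dt string)
row format delimited fields terminated by '\t'
stored as textfile;
```

The partition column is declared only in partitioned by, not in the main column list, yet it can be queried like any other column.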

——————————————————————————————
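In general, raw data such as raw log tables is stored in external tables and must not be deleted. The intermediate and result tables produced by the heavy statistical analysis on top of those external (raw log) tables are stored as internal (managed) tables, and data enters them through SELECT + INSERT.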
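This pattern can be sketched as follows, treating the external table dept (created in the example further down) as the raw source. The result table dept_by_loc is a hypothetical name.

```sql
-- Managed result table built from the external source table (CTAS).
create table if not exists dept_by_loc as
select loc, count(*) as dept_cnt
from dept
group by loc;

-- Later refreshes go through INSERT + SELECT, replacing the old result.
insert overwrite table dept_by_loc
select loc, count(*) as dept_cnt
from dept
group by loc;
```

Dropping dept_by_loc removes both its metadata and its data, which is acceptable here because it can always be rebuilt from the external table.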

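eg:
Create an external table dept (metadata) to hold the following data:
Table 1:

deptno  dname       loc
10      accounting  1700
20      research    1800
30      sales       1900
40      operations  1700

Create a new file named dept.txt.

The content of dept.txt is shown below. Note that the fields are separated by tab characters: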

10      ACCOUNTING      1700
20      RESEARCH        1800
30      SALES   1900
40      OPERATIONS      1700

The creation statement is shown below. It declares the table's columns (deptno, dname, loc) with their types, and states the field delimiter: '\t', i.e. the tab character.

create external table if not exists dept(
deptno int, 
dname string,
loc int
)
row format delimited fields terminated by '\t';
-- location '/hive/dept';  optionally specifies a path; it is omitted here

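Upload the data of Table 1 to HDFS
Method 1 (put the file into the table's own directory; the path below assumes dept was created in the default database):

hadoop fs -put dept.txt /user/hive/warehouse/dept/

Method 2 (recommended): run this after starting bin/hive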

load data local inpath '/opt/module/hive/datas/dept.txt' into table dept;
-- local means the path is on the local file system; without it, the path is on HDFS
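For comparison, a load without local might look like the sketch below. It assumes dept.txt has already been copied to the hypothetical HDFS path /tmp/dept.txt.

```sql
-- Assumes the file is already on HDFS, e.g. after: hadoop fs -put dept.txt /tmp/
-- overwrite replaces the table's current contents instead of appending.
load data inpath '/tmp/dept.txt' overwrite into table dept;
```

When loading from HDFS, Hive moves the file into the table's directory rather than copying it, so it no longer exists at the source path afterwards.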

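Because no path was specified when the table was created, the table lives under the default warehouse directory /user/hive/warehouse/ (configurable through hive.metastore.warehouse.dir in hive-site.xml), and the uploaded data can be found there on HDFS.

Explanation:
We created the table dept under /user/hive/warehouse/;
and loaded the data file dept.txt into it: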

load data local inpath '/opt/module/hive/datas/dept.txt' into table dept;

So at this point the table dept contains one data file, dept.txt. Querying the table returns the rows stored in dept.txt.
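One way to confirm where the table and its file actually live is sketched below. The HDFS path is an assumption that holds only if dept was created in the default database with the default warehouse directory.

```sql
-- Shows, among other things, the table's Location and Table Type
-- (MANAGED_TABLE or EXTERNAL_TABLE).
desc formatted dept;

-- The Hive CLI can run HDFS commands directly; the path is an assumption.
dfs -ls /user/hive/warehouse/dept;
```

If the Location reported by desc formatted differs, use that path in the dfs command instead.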

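“Query”

List tables: show tables;

Show the details of a table: desc dept;
Query the contents of a table: select * from dept; See, doesn't it match Table 1?
Query a single column of a table: select dept.dname from dept;
Query the rows where loc=1700, using a where clause: select * from dept where loc=1700;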
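A few more query shapes against the same dept table, as a sketch that assumes the four sample rows from Table 1 have been loaded:

```sql
-- Combine conditions; with the sample data only department 40 qualifies.
select * from dept where loc = 1700 and deptno > 10;

-- Pick columns, filter, and sort.
select dname, loc from dept where loc >= 1800 order by loc desc;

-- Count departments per location.
select loc, count(*) as dept_cnt from dept group by loc;

-- Peek at a couple of rows without scanning for a condition.
select * from dept limit 2;
```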

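“Update”

Add a row to a table:
insert into table dept values(50,'client',2000);

After the insert, the new row shows up in the table.
insert into table dept select * from student where loc=1700; appends to dept the rows with loc=1700 from a table named student (whose columns must match those of dept).
insert overwrite table dept select * from student where loc=1700; overwrites the contents of dept with the rows with loc=1700.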
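The difference between into and overwrite is easier to see with a separate target table. In this sketch dept_1700 is a hypothetical managed table, and dept holds the sample rows from Table 1.

```sql
-- Hypothetical target table with the same columns as dept.
create table if not exists dept_1700 (
  deptno int,
  dname  string,
  loc    int
);

-- insert into appends: running this twice leaves duplicate rows.
insert into table dept_1700
select deptno, dname, loc from dept where loc = 1700;

-- insert overwrite replaces whatever dept_1700 held before.
insert overwrite table dept_1700
select deptno, dname, loc from dept where loc = 1700;
```

Listing the columns explicitly instead of using select * keeps the statement valid even if the source table later gains columns.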

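“Delete”

Clear the data in a table (this only works for internal tables; the data of an external table cannot be removed this way):
truncate table dept;
(After this, show tables; still lists dept; only the data in dept is gone.)
Drop a table (works for both internal and external tables):
drop table dept;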
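Because dept was created as an external table in this article, truncate would be rejected for it as-is. One possible workaround, sketched here, is to switch the table to managed first through its EXTERNAL table property:

```sql
-- Convert the external table to a managed one (property name and value in upper case).
alter table dept set tblproperties ('EXTERNAL'='FALSE');

-- Now the data can be cleared while the table definition stays.
truncate table dept;
```

Keep in mind that once the table is managed, a later drop table dept also deletes the underlying data on HDFS.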

Source: 内存溢出

Original article: https://outofmemory.cn/zaji/5688939.html
