Creating tables in Hive, including tables based on other tables

1. The CREATE TABLE statement: CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name [(col_name data_type [COMMENT col_comment], ...)] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)] [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS] [ROW FORMAT row_format] [STORED AS file_format] [LOCATION hdfs_path]

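A brief explanation:

CREATE TABLE creates a table with the given name. If a table with the same name already exists, an exception is thrown; the IF NOT EXISTS option suppresses it.

The EXTERNAL keyword creates an external table and lets you specify, at creation time, a path (LOCATION) pointing to the actual data. For an internal table, Hive moves the data into the path used by the data warehouse; for an external table, it only records where the data is and does not change its location. When a table is dropped, an internal table's metadata and data are deleted together, whereas for an external table only the metadata is deleted and the data is kept.

If the data files are plain text, use STORED AS TEXTFILE. If the data needs to be compressed, use STORED AS SEQUENCEFILE.

A partitioned table is declared with PARTITIONED BY at creation time. A table can have one or more partitions, and each partition is stored in its own directory. In addition, tables and partitions can be bucketed with CLUSTERED BY on one or more columns, which distributes rows into buckets according to the hash of those columns, and SORTED BY can sort the data within each bucket. This can improve performance for certain workloads.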
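As a rough illustration of how these clauses fit together, the sketch below creates a partitioned, bucketed text table. The table name page_views, its columns, and the bucket count are hypothetical and chosen only for the example.

```sql
-- Hypothetical table; names and bucket count are illustrative only.
CREATE TABLE IF NOT EXISTS page_views (
  user_id  INT    COMMENT 'visitor id',
  url      STRING COMMENT 'page visited',
  duration INT    COMMENT 'seconds spent on the page'
)
COMMENT 'example of a partitioned, bucketed table'
-- Each distinct dt value gets its own directory under the table directory.
PARTITIONED BY (dt STRING COMMENT 'date of the visit')
-- Rows are assigned to one of 4 buckets by hashing user_id.
CLUSTERED BY (user_id) SORTED BY (duration DESC) INTO 4 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
```

Note that the partition column dt is declared only in PARTITIONED BY, not in the main column list.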

Creating a plain table: create table test_table (id int,name string,no int) row format delimited fields terminated by ',' stored as textfile;

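// The field delimiter is set to a comma, so the text file being loaded must also be comma-delimited; otherwise the columns come back as NULL after loading. With row format delimited, Hive supports only single-character delimiters; the default delimiter is \001.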
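To make the default concrete, here is a small sketch of two table definitions that should end up with the same field delimiter. Both table names are hypothetical.

```sql
-- Hypothetical tables for illustration.
-- No ROW FORMAT clause: fields are separated by the default \001 (Ctrl-A).
CREATE TABLE test_default_delim (id INT, name STRING)
STORED AS TEXTFILE;

-- The same layout with the default delimiter spelled out explicitly.
CREATE TABLE test_explicit_delim (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
STORED AS TEXTFILE;
```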

Creating a partitioned table: create table test_part (id int,name string,no int) partitioned by (dt string) row format delimited fields terminated by '\t' stored as textfile ;

This creates a table that uses \t as the delimiter, with dt as the partition column.

Load data as follows:

load data local inpath '/home/zhangxin/hive/test_hivetxt' overwrite into table test_part partition (dt='2012-03-05');

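// local means a local file. Note that this is not a file on your own computer but a file on the local file system of the machine where Hadoop runs.

// If the file is in HDFS, local is not needed. overwrite into overwrites the table partition, and only the data in that partition; to append instead, omit overwrite.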
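The HDFS and append variants described in the two comments above might look like the following sketch. The HDFS path /tmp/test_hive_more.txt and the partition value are assumptions made for the example.

```sql
-- Assumed HDFS path; replace it with a file that exists in your cluster.
-- No LOCAL keyword: the source file is in HDFS and is moved into the partition directory.
-- No OVERWRITE keyword: the file is added to whatever the partition already holds.
LOAD DATA INPATH '/tmp/test_hive_more.txt'
INTO TABLE test_part PARTITION (dt='2012-03-06');

-- List the partitions, then query the one just loaded.
SHOW PARTITIONS test_part;
SELECT * FROM test_part WHERE dt = '2012-03-06';
```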

Creating an external table: create external table test_external (id int,name string,no int) row format delimited fields terminated by ',' location '/home/zhangxin/hive/test_hivetxt';

// A comma-delimited table with no partitions; location is followed by the path where the external table's data is stored.

Creating a table with the same structure as an existing table, using LIKE: this copies only the table structure, not its contents. create table test_like_table like test_bucket;
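For comparison, the sketch below puts LIKE next to create-table-as-select (CTAS), the other common way to build a table from an existing one. CTAS is not covered in the text above and is named here as a separate technique; the target table names test_like_copy and test_ctas_copy are hypothetical, and test_table is the plain table created earlier.

```sql
-- LIKE: copies only the schema of test_table; the new table starts empty.
CREATE TABLE test_like_copy LIKE test_table;

-- CTAS: the new table's columns and rows both come from the query result.
CREATE TABLE test_ctas_copy AS
SELECT id, name
FROM test_table
WHERE id > 0;
```

LIKE suits cases where only the layout is needed; CTAS suits cases where a filtered or reshaped copy of the data is wanted.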

hive

CREATE TABLE IF NOT EXISTS `test_01`(

`id` int,`name` String,`age` INT,`score` FLOAT)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

CREATE external TABLE IF NOT EXISTS `test_02`(

`id` int, `name` String,`age` INT,`score` FLOAT)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

vi /home/hadoop/share/mydata/hive/scoretxt

Contents:

1,'zhang',20,120

2,'zhao',19,119

3,'qian',18,118

4,'li',21,121

vi /home/hadoop/share/mydata/hive/score02txt

Contents:

5,'wang',20,120

6,'zhou',19,119

7,'wu',18,118

8,'hu',21,121

load data local inpath '/home/hadoop/share/mydata/hive/scoretxt' overwrite into table test_01;

load data local inpath '/home/hadoop/share/mydata/hive/scoretxt' overwrite into table test_02;

select * from test_01;

select * from test_02;

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_01

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02

hadoop fs -cat /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_01/scoretxt

hadoop fs -cat /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02/scoretxt

drop table test_01;

drop table test_02;

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb

CREATE TABLE IF NOT EXISTS `test_01`(

`id` int,`name` String,`age` INT,`score` FLOAT)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

CREATE external TABLE IF NOT EXISTS `test_02`(

`id` int, `name` String,`age` INT,`score` FLOAT)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

select * from test_01;

select * from test_02;

load data local inpath '/home/hadoop/share/mydata/hive/score02txt' overwrite into table test_01;

load data local inpath '/home/hadoop/share/mydata/hive/score02txt' overwrite into table test_02;

select * from test_01;

select * from test_02;

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_01

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02

hadoop fs -cat /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02/*

Note: this time without overwrite

load data local inpath '/home/hadoop/share/mydata/hive/score02txt' into table test_02;

hadoop fs -cat /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02/*

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02

Note: this time with overwrite

load data local inpath '/home/hadoop/share/mydata/hive/score02txt' overwrite into table test_02;

select * from test_02;

hadoop fs -ls /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02

hadoop fs -cat /mylab/soft/apache-hive-312-bin/working/metastorewarehouse/testdbdb/test_02/*

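Unless specified otherwise, Hive creates new tables as internal tables; external tables require the external keyword.

Dropping an external table deletes only the metadata; the stored data is kept. Dropping an internal table deletes both the metadata and the stored data.

With load data, for both external and internal tables, if the source data is already in HDFS the data is moved: the source files are moved from their HDFS path into the table's directory, which by default lies under the Hive warehouse path.

With load data, if overwrite is used, the existing files are cleared and replaced by the files being loaded. Without overwrite, the newly loaded files are added alongside the existing ones, and if a file name collides, Hive automatically renames the new file to make it unique.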

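Hive has two kinds of tables: internal (managed) and external. The desc formatted table_name command shows a table's details, including whether it is external or internal. Tables are internal by default; creating an external table requires the EXTERNAL keyword, as in CREATE EXTERNAL TABLE table_name.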
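Applied to the two tables from the experiment above, the check is a short sketch like this:

```sql
-- In the output, look for the "Table Type" row:
-- MANAGED_TABLE for an internal table, EXTERNAL_TABLE for an external one.
DESC FORMATTED test_01;
DESC FORMATTED test_02;
```

The same output also includes a Location row, which shows where the table's files actually live.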

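For an internal table, the files, metadata, and statistics are managed by Hive and are normally stored under the directory given by hive.metastore.warehouse.dir. When the table or a partition is dropped, the corresponding data and metadata are deleted. Internal tables are commonly used as temporary tables.

An external table is the opposite: you can specify a location, and its files can be manipulated without going through Hive. When the table or a partition is dropped, the data remains; Hive only removes the metadata, and the data files stay in the file system. If the table is dropped, you can recreate it with location pointing at the data files; for a partitioned table, then run msck repair table table_name to refresh the partition metadata in Hive, which restores access to the data.

The detailed usage of msck repair table is not covered here; see the article "HIVE常用命令之MSCK REPAIR TABLE命令简述".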
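A minimal sketch of the recovery flow for a dropped external table, assuming a partitioned table whose files are still in HDFS. The table name logs_ext, its columns, and the directory /data/logs_ext are hypothetical.

```sql
-- Recreate the dropped table, pointing LOCATION at the directory that still holds the data.
CREATE EXTERNAL TABLE IF NOT EXISTS logs_ext (id INT, msg STRING)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/logs_ext';

-- Register partition directories found under LOCATION,
-- for example /data/logs_ext/dt=2012-03-05.
MSCK REPAIR TABLE logs_ext;

-- Confirm that the partitions are visible again.
SHOW PARTITIONS logs_ext;
```

For a table without partitions, recreating it with the right location should be enough on its own, since there is no partition metadata to rebuild.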

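Which of the following statements about ordinary (internal) tables and external tables in Hive is incorrect? ( )

A. Dropping an external table deletes only the table's data and not its metadata (correct answer)

B. Dropping an ordinary table deletes both the metadata and the data

C. An external table essentially associates a file path that already exists in HDFS with a table

D. Ordinary tables are created by default

Answer explanation:

Explanation: External tables are much like internal tables, except that their data is not stored in the table's own directory but elsewhere. The benefit is that if you drop the external table, the data it points to is not deleted; only the metadata for the external table is deleted.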


Source: 内存溢出

Original article: http://outofmemory.cn/sjk/10199570.html
