11 ------------------------- Full join -------------------------
-- Full join: rows both tables have, rows only the left table has, and rows only the right table has. Everything in both tables is shown.
-- Cartesian product: every row of one table is joined with every row of the other, so the result grows alarmingly fast: 6 * 6 = 36 rows.
-----------------------------------------------------------------

hive (mydb)> select * from u1 join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112222831_c8641f66-8d56-4cb7-bf33-93e2b1e70128
Total jobs = 1
2022-01-12 22:28:39 Starting to launch local task to process map join; maximum memory = 518979584
2022-01-12 22:28:40 Dump the side-table for tag: 0 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-28-31_054_4724095481681766367-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile00--.hashtable
2022-01-12 22:28:40 Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-28-31_054_4724095481681766367-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile00--.hashtable (386 bytes)
2022-01-12 22:28:40 End of local task; Time Taken: 1.147 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0001, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0001/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job -kill job_1641995564638_0001
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:28:51,469 Stage-3 map = 0%, reduce = 0%
2022-01-12 22:28:59,717 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 1.18 sec
MapReduce Total cumulative CPU time: 1 seconds 180 msec
Ended Job = job_1641995564638_0001
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1 Cumulative CPU: 1.18 sec HDFS Read: 6092 HDFS Write: 147 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 180 msec
OK
u1.id	u1.name	u2.id	u2.name
4	d	4	d
5	e	5	e
6	f	6	f
Time taken: 29.726 seconds, Fetched: 3 row(s)

hive (mydb)> select * from u1 left join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223213_e0369e9c-ff56-4164-a46e-1fbd05c7a21b
Total jobs = 1
2022-01-12 22:32:20 Starting to launch local task to process map join; maximum memory = 518979584
2022-01-12 22:32:21 Dump the side-table for tag: 1 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-32-13_210_5733734088434832898-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile11--.hashtable
2022-01-12 22:32:21 Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-32-13_210_5733734088434832898-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile11--.hashtable (386 bytes)
2022-01-12 22:32:21 End of local task; Time Taken: 1.054 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0002, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0002/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job -kill job_1641995564638_0002
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:32:31,981 Stage-3 map = 0%, reduce = 0%
2022-01-12 22:32:37,151 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 0.74 sec
MapReduce Total cumulative CPU time: 740 msec
Ended Job = job_1641995564638_0002
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1 Cumulative CPU: 0.74 sec HDFS Read: 5770 HDFS Write: 213 SUCCESS
Total MapReduce CPU Time Spent: 740 msec
OK
u1.id	u1.name	u2.id	u2.name
1	a	NULL	NULL
2	b	NULL	NULL
3	c	NULL	NULL
4	d	4	d
5	e	5	e
6	f	6	f
Time taken: 25.01 seconds, Fetched: 6 row(s)

hive (mydb)> select * from u1 right join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223714_8fb055ba-3d36-4ea7-a252-b9ff2bb2ec54
Total jobs = 1
2022-01-12 22:37:21 Starting to launch local task to process map join; maximum memory = 518979584
2022-01-12 22:37:22 Dump the side-table for tag: 0 with group count: 6 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-37-14_046_7767994758097546401-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile20--.hashtable
2022-01-12 22:37:22 Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-37-14_046_7767994758097546401-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile20--.hashtable (386 bytes)
2022-01-12 22:37:22 End of local task; Time Taken: 1.094 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0003, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0003/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job -kill job_1641995564638_0003
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:37:32,594 Stage-3 map = 0%, reduce = 0%
2022-01-12 22:37:37,701 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 0.96 sec
MapReduce Total cumulative CPU time: 960 msec
Ended Job = job_1641995564638_0003
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1 Cumulative CPU: 0.96 sec HDFS Read: 5770 HDFS Write: 213 SUCCESS
Total MapReduce CPU Time Spent: 960 msec
OK
u1.id	u1.name	u2.id	u2.name
4	d	4	d
5	e	5	e
6	f	6	f
NULL	NULL	7	g
NULL	NULL	8	h
NULL	NULL	9	i
Time taken: 24.754 seconds, Fetched: 6 row(s)

hive (mydb)> select * from u1 full join u2 on u1.id = u2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112223920_77ef5961-e704-43d2-9a6b-2a56e7064e42
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 4
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1641995564638_0004, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0004/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job -kill job_1641995564638_0004
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 4
2022-01-12 22:39:28,499 Stage-1 map = 0%, reduce = 0%
2022-01-12 22:39:40,692 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.7 sec
2022-01-12 22:39:48,647 Stage-1 map = 100%, reduce = 50%, Cumulative CPU 5.34 sec
2022-01-12 22:39:51,866 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.94 sec
MapReduce Total cumulative CPU time: 7 seconds 940 msec
Ended Job = job_1641995564638_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2 Reduce: 4 Cumulative CPU: 7.94 sec HDFS Read: 27544 HDFS Write: 540 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 940 msec
OK
u1.id	u1.name	u2.id	u2.name
4	d	4	d
NULL	NULL	8	h
1	a	NULL	NULL
5	e	5	e
NULL	NULL	9	i
2	b	NULL	NULL
6	f	6	f
3	c	NULL	NULL
NULL	NULL	7	g
Time taken: 32.148 seconds, Fetched: 9 row(s)

hive (mydb)> select * from u1,u2;
FAILED: SemanticException Cartesian products are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.

hive (mydb)> set hive.strict.checks.cartesian.product;
hive.strict.checks.cartesian.product=true
hive (mydb)> set hive.strict.checks.cartesian.product=false;
hive (mydb)> select * from u1,u2;
Warning: Map Join MAPJOIN[9][bigTable=?] in task 'Stage-3:MAPRED' is a cross product
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20220112225121_2784f4ef-a0f1-4e44-95f5-d090246f17cc
Total jobs = 1
2022-01-12 22:51:29 Starting to launch local task to process map join; maximum memory = 518979584
2022-01-12 22:51:30 Dump the side-table for tag: 0 with group count: 1 into file: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-51-21_357_175171533818054252-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile30--.hashtable
2022-01-12 22:51:30 Uploaded 1 File to: file:/tmp/root/d6200ad0-564c-4cd8-8a3a-2aa6255ab21d/hive_2022-01-12_22-51-21_357_175171533818054252-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile30--.hashtable (320 bytes)
2022-01-12 22:51:30 End of local task; Time Taken: 1.185 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1641995564638_0005, Tracking URL = http://linux123:8088/proxy/application_1641995564638_0005/
Kill Command = /opt/lagou/servers/hadoop-2.9.2/bin/hadoop job -kill job_1641995564638_0005
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2022-01-12 22:51:37,443 Stage-3 map = 0%, reduce = 0%
2022-01-12 22:51:43,602 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 1.26 sec
MapReduce Total cumulative CPU time: 1 seconds 260 msec
Ended Job = job_1641995564638_0005
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1 Cumulative CPU: 1.26 sec HDFS Read: 5721 HDFS Write: 807 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 260 msec
OK
u1.id	u1.name	u2.id	u2.name
1	a	4	d
2	b	4	d
3	c	4	d
4	d	4	d
5	e	4	d
6	f	4	d
1	a	5	e
2	b	5	e
3	c	5	e
4	d	5	e
5	e	5	e
6	f	5	e
1	a	6	f
2	b	6	f
3	c	6	f
4	d	6	f
5	e	6	f
6	f	6	f
1	a	7	g
2	b	7	g
3	c	7	g
4	d	7	g
5	e	7	g
6	f	7	g
1	a	8	h
2	b	8	h
3	c	8	h
4	d	8	h
5	e	8	h
6	f	8	h
1	a	9	i
2	b	9	i
3	c	9	i
4	d	9	i
5	e	9	i
6	f	9	i
Time taken: 24.318 seconds, Fetched: 36 row(s)
hive (mydb)>
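To make the row counts in the session above easier to check by hand, here is a minimal Python sketch that models the two tables as in-memory lists and mimics each join. The row values are read off the query output (ids 1 to 6 in u1, ids 4 to 9 in u2); the helper names (`inner_join`, `left_join`, and so on) are made up for illustration, and the nested loops say nothing about how Hive actually executes a map join or a reduce-side join.

```python
# Toy model of the two tables; values are taken from the query output above.
u1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
u2 = [(4, "d"), (5, "e"), (6, "f"), (7, "g"), (8, "h"), (9, "i")]

NULL_ROW = (None, None)  # stands in for the NULL NULL padding Hive prints


def inner_join(left, right):
    """Only rows whose id appears in both tables."""
    return [l + r for l in left for r in right if l[0] == r[0]]


def left_join(left, right):
    """Every left row; unmatched rows are padded with NULLs on the right."""
    out = []
    for l in left:
        matches = [r for r in right if r[0] == l[0]]
        if matches:
            out.extend(l + r for r in matches)
        else:
            out.append(l + NULL_ROW)
    return out


def right_join(left, right):
    """Every right row; unmatched rows are padded with NULLs on the left."""
    out = []
    for r in right:
        matches = [l for l in left if l[0] == r[0]]
        if matches:
            out.extend(l + r for l in matches)
        else:
            out.append(NULL_ROW + r)
    return out


def full_join(left, right):
    """Left join plus the right rows that found no partner."""
    left_ids = {l[0] for l in left}
    unmatched_right = [NULL_ROW + r for r in right if r[0] not in left_ids]
    return left_join(left, right) + unmatched_right


def cross_join(left, right):
    """Cartesian product: every left row paired with every right row."""
    return [l + r for r in right for l in left]


if __name__ == "__main__":
    for name, fn in [("join", inner_join), ("left join", left_join),
                     ("right join", right_join), ("full join", full_join),
                     ("cross product", cross_join)]:
        print(f"{name}: {len(fn(u1, u2))} row(s)")
```

The counts it prints (3, 6, 6, 9 and 36) match the `Fetched: N row(s)` lines in the transcript. Row order is not meant to match Hive's output, which depends on how the job distributes the data.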