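A Hadoop cluster ID is generated the first time the NameNode is formatted. One rule always holds: the clusterID recorded in the VERSION files of the NameNode, JournalNode, and DataNode storage directories must be identical. Even if you change the ClusterID, it must remain the same across all three services for the cluster to run normally. The existing cluster stays usable after the change; you only need to restart all services for the new ClusterID to take effect. The walkthrough follows.
1. Check the ClusterID in the VERSION files of the NameNodes, JournalNodes, and DataNodes before the change.
Namenode1: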
[hadoop@big81 current]$ cat /data02/current/VERSION
#Sat Apr 16 18:17:19 CST 2022
namespaceID=166178331
clusterID=CID-4d0a8992-de22-4357
cTime=1650104239176
storageType=NAME_NODE
blockpoolID=BP-1145621526-192.168.1.81-1650104239176
layoutVersion=-64
Namenode2:
[hadoop@big82 current]$ more VERSION
#Mon Apr 18 08:00:11 CST 2022
namespaceID=166178331
clusterID=CID-4d0a8992-de22-4357
cTime=1650104239176
storageType=NAME_NODE
blockpoolID=BP-1145621526-192.168.1.81-1650104239176
layoutVersion=-64
Journalnode1:
[hadoop@big91 current]$ pwd
/data02/jdata/fgeduns/current
[hadoop@big91 current]$ cat VERSION
#Sat Apr 16 18:16:52 CST 2022
namespaceID=166178331
#clusterID=CID-6e446eb6-01fa-4f97-9e08-9f5842bf335a
clusterID=CID-4d0a8992-de22-4357
cTime=1650104239176
storageType=JOURNAL_NODE
layoutVersion=-64
Datanode1:
[hadoop@big91 current]$ cat /data02/current/VERSION
#Mon Apr 18 08:42:50 CST 2022
storageID=DS-88631297-ea08-4abb-89c7-e9d872a73c64
clusterID=CID-4d0a8992-de22-4357
cTime=0
datanodeUuid=cc3a1a28-8945-47cf-9f68-f16da75c9d78
storageType=DATA_NODE
layoutVersion=-57
Journalnode2:
[hadoop@big92 current]$ pwd
/data02/jdata/fgeduns/current
[hadoop@big92 current]$ cat VERSION
#Sat Apr 16 18:16:42 CST 2022
namespaceID=166178331
#clusterID=CID-6e446eb6-01fa-4f97-9e08-9f5842bf335a
clusterID=CID-4d0a8992-de22-4357
cTime=1650104239176
storageType=JOURNAL_NODE
layoutVersion=-64
Datanode2:
[hadoop@big92 current]$ cat /data02/current/VERSION
#Mon Apr 18 08:42:39 CST 2022
storageID=DS-b16c30aa-d9a1-43b5-abea-5993a9224499
clusterID=CID-4d0a8992-de22-4357
cTime=0
datanodeUuid=31d0fd01-959e-434b-998e-016345cf3d7e
storageType=DATA_NODE
layoutVersion=-57
Journalnode3:
[hadoop@big93 current]$ pwd
/data02/jdata/fgeduns/current
[hadoop@big93 current]$ cat VERSION
#Sat Apr 16 18:16:41 CST 2022
namespaceID=166178331
#clusterID=CID-6e446eb6-01fa-4f97-9e08-9f5842bf335a
clusterID=CID-4d0a8992-de22-4357
cTime=1650104239176
storageType=JOURNAL_NODE
layoutVersion=-64
Datanode3:
[hadoop@big93 current]$ cat /data02/current/VERSION
#Mon Apr 18 08:42:38 CST 2022
storageID=DS-e0baece2-b9f6-4abb-8a5d-444cb16050f8
clusterID=CID-4d0a8992-de22-4357
cTime=0
datanodeUuid=971b21b8-e2d4-4a96-8db6-dd504610e216
storageType=DATA_NODE
layoutVersion=-57
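The comparison in step 1 can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names `cluster_id_of` and `same_cluster_id` are hypothetical, and it assumes the VERSION files have already been copied to (or are readable from) one machine. Note that the JournalNode files contain a commented-out `#clusterID=` line, which must not be picked up.

```shell
#!/bin/sh
# Print the active clusterID from a VERSION file.
# The ^ anchor skips commented lines such as "#clusterID=...".
cluster_id_of() {
    sed -n 's/^clusterID=//p' "$1"
}

# Return 0 if every VERSION file passed as an argument has the same clusterID.
# (Hypothetical helper; variable names are prefixed to avoid clashes.)
same_cluster_id() {
    sci_first=$(cluster_id_of "$1")
    [ -n "$sci_first" ] || { echo "no clusterID in $1" >&2; return 1; }
    shift
    for sci_file in "$@"; do
        sci_id=$(cluster_id_of "$sci_file")
        if [ "$sci_id" != "$sci_first" ]; then
            echo "mismatch in $sci_file: '$sci_id' (expected '$sci_first')" >&2
            return 1
        fi
    done
    return 0
}

# Example (paths are those used in this walkthrough, on a single host):
#   same_cluster_id /data02/current/VERSION /data02/jdata/fgeduns/current/VERSION
```

On a real cluster the files live on different hosts, so they would first have to be collected, for example over ssh; that part is left out here.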
2. Check the cluster state.
Both ResourceManagers are alive.
[hadoop@big93 current]$ yarn rmadmin -getAllServiceState
big81:8033 active
big82:8033 standby
Both NameNodes are alive.
[hadoop@big93 current]$ hdfs haadmin -getAllServiceState
big81:9000 standby
big82:9000 active
Check the DataNodes:
[hadoop@big91 current]$ jps
42289 NodeManager
42945 Jps
41861 JournalNode
24231 QuorumPeerMain
41627 DataNode
[hadoop@big92 current]$ jps
38880 NodeManager
39457 Jps
22003 QuorumPeerMain
38643 DataNode
38755 JournalNode
[hadoop@big93 current]$ jps
41156 NodeManager
41879 Jps
23149 QuorumPeerMain
40927 DataNode
41039 JournalNode
As shown, the ResourceManagers, DataNodes, and NameNodes are all alive, so we can now change the cluster ID.
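For a quick scripted check of the HA state, the output of `-getAllServiceState` can be piped through a small filter. This is only a sketch; `count_active` is a hypothetical helper name, and it assumes the output has the two-column form shown above (address, then state).

```shell
#!/bin/sh
# Count the lines on stdin whose second column is "active".
# Intended for the output of `hdfs haadmin -getAllServiceState`
# or `yarn rmadmin -getAllServiceState`.
count_active() {
    awk '$2 == "active" { n++ } END { print n + 0 }'
}

# Example:
#   hdfs haadmin -getAllServiceState | count_active
```

In a healthy HA pair the count should be exactly 1 for the NameNodes and 1 for the ResourceManagers.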
3. Change the ClusterID.
Namenode1:
[hadoop@big81 current]$ hdfs --daemon stop namenode
[hadoop@big81 current]$ cat /data02/current/VERSION
#Sat Apr 16 18:17:19 CST 2022
namespaceID=166178331
clusterID=CID-4d0a8992-de22-4357-123456
cTime=1650104239176
storageType=NAME_NODE
blockpoolID=BP-1145621526-192.168.1.81-1650104239176
layoutVersion=-64
Namenode2:
[hadoop@big82 current]$ hdfs --daemon stop namenode
[hadoop@big82 current]$ vi VERSION
[hadoop@big82 current]$ cat VERSION
#Mon Apr 18 08:00:11 CST 2022
namespaceID=166178331
clusterID=CID-4d0a8992-de22-4357-123456
cTime=1650104239176
storageType=NAME_NODE
blockpoolID=BP-1145621526-192.168.1.81-1650104239176
layoutVersion=-64
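Following the same pattern, edit the VERSION file of each remaining service listed in step 1 in turn (Journalnode1/Journalnode2/Journalnode3/Datanode1/Datanode2/Datanode3) and change its clusterID.
Before: CID-4d0a8992-de22-4357
After: CID-4d0a8992-de22-4357-123456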
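Instead of editing each VERSION file with vi, the replacement can be done non-interactively. The sketch below is an assumption-laden alternative, not what was run in this walkthrough: `set_cluster_id` is a hypothetical helper, it keeps a `.bak` copy next to the file, and it touches only the uncommented `clusterID=` line. As in the steps above, the service that owns the file should be stopped first.

```shell
#!/bin/sh
# Rewrite the active clusterID line of a VERSION file in place.
# $1 = path to the VERSION file, $2 = new cluster ID.
# A backup is kept as "$1.bak"; commented "#clusterID=" lines are left alone.
set_cluster_id() {
    sci_version_file=$1
    sci_new_id=$2
    cp -p "$sci_version_file" "$sci_version_file.bak" || return 1
    # Read from the backup and write back to the original path, so the
    # original file keeps its owner and permissions.
    awk -v id="$sci_new_id" '
        /^clusterID=/ { print "clusterID=" id; next }
        { print }
    ' "$sci_version_file.bak" > "$sci_version_file"
}

# Example (path as used in this walkthrough):
#   set_cluster_id /data02/current/VERSION CID-4d0a8992-de22-4357-123456
```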
4. Restart all services and verify.
[hadoop@big82 current]$ stop-all.sh
WARNING: Stopping all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: Use CTRL-C to abort.
Stopping namenodes on [big81 big82]
Stopping datanodes
Stopping journal nodes [big93 big91 big92]
Stopping nodemanagers
Stopping resourcemanagers on [ big81 big82]
[hadoop@big82 current]$ start-all.sh    # can be run from any one node
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [big81 big82]
Starting datanodes
Starting journal nodes [big93 big91 big92]
Starting resourcemanagers on [ big81 big82]
Starting nodemanagers
As shown below, the DataNode, NodeManager, ZooKeeper (QuorumPeerMain), and JournalNode processes are all running, but Namenode1 and Namenode2 did not start.
[hadoop@big91 current]$ jps
43509 JournalNode
24231 QuorumPeerMain
43399 DataNode
43624 NodeManager
43790 Jps
[hadoop@big92 current]$ jps
22003 QuorumPeerMain
40132 NodeManager
40297 Jps
39903 DataNode
40015 JournalNode
[hadoop@big93 current]$ jps
42560 NodeManager
42727 Jps
23149 QuorumPeerMain
42333 DataNode
42445 JournalNode
[hadoop@big81 current]$ jps    # Namenode1 did not start
95400 Jps
11357 DFSZKFailoverController
94940 ResourceManager
[hadoop@big82 current]$ jps    # Namenode2 did not start
54850 ResourceManager
36339 JobHistoryServer
25579 DFSZKFailoverController
55052 Jps
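Spotting a missing daemon in `jps` output by eye is easy to get wrong, so a small filter can help. This is a sketch only; `has_daemon` is a hypothetical helper that reads `jps` output on stdin and assumes the two-column "PID name" form shown above.

```shell
#!/bin/sh
# Exit 0 if the named JVM process appears in `jps` output read from stdin,
# exit 1 otherwise. $1 = process name exactly as jps prints it.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# Example:
#   jps | has_daemon NameNode || echo "NameNode is not running on $(hostname)"
```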
5. Start Namenode1 manually.
[hadoop@big81 current]$ hdfs --daemon start namenode
[hadoop@big81 current]$ jps
95585 Jps
95543 NameNode
11357 DFSZKFailoverController
94940 ResourceManager
The log is shown below; Namenode1 started successfully with no errors.
2022-04-18 09:30:55,507 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of over-replicated blocks = 0
2022-04-18 09:30:55,507 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of blocks being written = 0
2022-04-18 09:30:55,507 INFO org.apache.hadoop.hdfs.StateChange: STATE* Replication Queue initialization scan for invalid, over- and under-replicated blocks completed in 707 msec
6. Start Namenode2 manually.
[hadoop@big82 current]$ hdfs --daemon start namenode
[hadoop@big82 current]$ jps
54850 ResourceManager
36339 JobHistoryServer
55238 Jps
25579 DFSZKFailoverController
55199 NameNode
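The log is shown below, with no errors; the DataNode registration already reports the new ClusterID (cid=CID-4d0a8992-de22-4357-123456).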
2022-04-18 09:32:56,148 INFO BlockStateChange: BLOCK* processReport 0xba13dde4b7e59b9e: Processing first storage report for DS-8fe0fb6a-9dd4-4ed0-949f-fe52a37b16ce from datanode 971b21b8-e2d4-4a96-8db6-dd504610e216
2022-04-18 09:32:56,148 INFO BlockStateChange: BLOCK* processReport 0xba13dde4b7e59b9e: from storage DS-8fe0fb6a-9dd4-4ed0-949f-fe52a37b16ce node DatanodeRegistration(192.168.1.93:9866, datanodeUuid=971b21b8-e2d4-4a96-8db6-dd504610e216, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-4d0a8992-de22-4357-123456;nsid=166178331;c=1650104239176), blocks: 0, hasStaleStorage: false, processing time: 0 msecs, invalidatedBlocks: 0
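The `cid=` field in the DatanodeRegistration log line is a convenient way to confirm which ClusterID the DataNodes registered with. A minimal sketch for pulling it out of a log follows; `registered_cids` is a hypothetical helper, and it assumes the `storageInfo=lv=...;cid=...;nsid=...` layout seen in the line above.

```shell
#!/bin/sh
# Print the cid value from every log line on stdin that contains ";cid=...;".
# Lines without such a field produce no output.
registered_cids() {
    sed -n 's/.*;cid=\([^;]*\);.*/\1/p'
}

# Example (the log file path is an assumption; use your NameNode log):
#   registered_cids < /path/to/namenode.log | sort -u
```

After a successful change, `sort -u` over the result should leave only the new ClusterID.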
7. Check the cluster state.
The NameNodes are healthy.
[hadoop@big82 current]$ hdfs haadmin -getAllServiceState
big81:9000 active
big82:9000 standby
The ResourceManager nodes are healthy.
[hadoop@big82 current]$ yarn rmadmin -getAllServiceState
big81:8033 active
big82:8033 standby
The cluster's data nodes are healthy.
[hadoop@big82 current]$ hdfs dfsadmin -report
Configured Capacity: 464880697344 (432.95 GB)
Present Capacity: 430753386496 (401.17 GB)
DFS Remaining: 430753091584 (401.17 GB)
DFS Used: 294912 (288 KB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.1.91:9866 (big91)
Hostname: big91
Decommission Status : Normal
Configured Capacity: 154960232448 (144.32 GB)
DFS Used: 98304 (96 KB)
Non DFS Used: 5523914752 (5.14 GB)
DFS Remaining: 141493895168 (131.78 GB)
DFS Used%: 0.00%
DFS Remaining%: 91.31%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 18 09:35:47 CST 2022
Last Block Report: Mon Apr 18 09:30:53 CST 2022
Num of Blocks: 0
Name: 192.168.1.92:9866 (big92)
Hostname: big92
Decommission Status : Normal
Configured Capacity: 154960232448 (144.32 GB)
DFS Used: 98304 (96 KB)
Non DFS Used: 2462048256 (2.29 GB)
DFS Remaining: 144555761664 (134.63 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.29%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 18 09:35:47 CST 2022
Last Block Report: Mon Apr 18 09:30:53 CST 2022
Num of Blocks: 0
Name: 192.168.1.93:9866 (big93)
Hostname: big93
Decommission Status : Normal
Configured Capacity: 154960232448 (144.32 GB)
DFS Used: 98304 (96 KB)
Non DFS Used: 2314375168 (2.16 GB)
DFS Remaining: 144703434752 (134.77 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.38%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon Apr 18 09:35:47 CST 2022
Last Block Report: Mon Apr 18 09:30:53 CST 2022
Num of Blocks: 0
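The report is long, and the one number that matters for this check is the live DataNode count. A small sketch for extracting it follows; `live_datanodes` is a hypothetical helper that assumes the `Live datanodes (N):` header line shown above.

```shell
#!/bin/sh
# Print the number N from the "Live datanodes (N):" line of
# `hdfs dfsadmin -report` output read from stdin.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Example:
#   [ "$(hdfs dfsadmin -report | live_datanodes)" = "3" ] && echo "all DataNodes registered"
```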
8. Verify that the directories that existed before the change are still visible.
[hadoop@big82 current]$ hdfs dfs -ls /
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2022-04-16 18:47 /test1
drwxrwx--- - hadoop supergroup 0 2022-04-17 20:51 /tmp
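I created only one directory, /test1; the /tmp directory was generated automatically by the cluster. Both directories from before the change are still present.
In summary: a Hadoop cluster's ClusterID can be changed by hand, provided that every NameNode, JournalNode, and DataNode carries the same ClusterID after the change and that all services are restarted so the change takes effect.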