NFS method
1. Server-side configuration (root@hz-search-zookeeper-01)

su - hbase -c "mkdir /home/hbase/hadoop_nfs && chmod 777 /home/hbase/hadoop_nfs"
echo '/home/hbase/hadoop_nfs 172.37.0.202(rw)' >> /etc/exports
service nfs restart

2. Client-side configuration (the Hadoop NameNode)

su - hbase -c "mkdir -p /home/hbase/hadoop_nfs/name"
/etc/init.d/nfslock start
mount -t nfs 172.37.0.201:/home/hbase/hadoop_nfs/ /home/hbase/hadoop_nfs/name
echo 'mount -t nfs 172.37.0.201:/home/hbase/hadoop_nfs/ /home/hbase/hadoop_nfs/name' >> /etc/rc.d/rc.local

3. Set dfs.name.dir to two directories and restart Hadoop for the change to take effect

<property>
  <name>dfs.name.dir</name>
  <value>/home/admin/name/,/home/admin/hadoop_nfs/name</value>
</property>
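Before pointing dfs.name.dir at the NFS path, it is worth confirming that the export and the mount are actually visible on both sides. A minimal check, assuming the addresses used above (172.37.0.201 is the NFS server, 172.37.0.202 the NameNode host):

  # On the NFS server: re-export and list what is shared
  exportfs -ra
  showmount -e localhost

  # On the NameNode (client): verify the mount and that the hbase user can write to it
  mount | grep hadoop_nfs
  su - hbase -c "touch /home/hbase/hadoop_nfs/name/.nfs_write_test && rm /home/hbase/hadoop_nfs/name/.nfs_write_test"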
QJM method

Reference: http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
Set up the ZooKeeper cluster first (see my other post on installing and configuring a ZooKeeper cluster).

Add the following to hdfs-site.xml.

The logical name of the NameNode service:

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>

The NameNodes belonging to this nameservice (a nameservice supports at most two NameNodes):

<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>172.37.0.202:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>172.37.0.201:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>172.37.0.202:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>172.37.0.201:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://172.37.0.201:8485;172.37.0.202:8485;172.37.0.203:8485/mycluster</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/root/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
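After editing hdfs-site.xml, a quick sanity check with hdfs getconf (run on any node that has the updated configuration) confirms the HA settings are being picked up; the expected values shown in the comments assume the configuration above:

  hdfs getconf -confKey dfs.nameservices            # expect: mycluster
  hdfs getconf -confKey dfs.ha.namenodes.mycluster  # expect: nn1,nn2
  hdfs getconf -nnRpcAddresses                      # expect both 8020 addresses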
Add the following to core-site.xml:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/path/to/journal/node/local/data</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>172.37.0.201:2181,172.37.0.202:2181,172.37.0.203:2181</value>
</property>

Initialize the HA state in ZooKeeper; this only needs to be run on one NameNode:

$ hdfs zkfc -formatZK

Start the JournalNode by running the following command on every node:

hadoop-daemon.sh start journalnode

[root logs]# jps
12821 JournalNode

Format the NameNode:

hadoop namenode -format

(In the standard QJM procedure only the first NameNode is formatted; the second is then synced from it, e.g. with hdfs namenode -bootstrapStandby.)

Start the cluster:

start-dfs.sh
start-yarn.sh

Check the processes with jps.

On the NameNode:
16753 QuorumPeerMain
18743 ResourceManager
18634 DFSZKFailoverController
18014 JournalNode
18234 NameNode
15797 Bootstrap
19571 Jps
18333 DataNode
18850 NodeManager

On a DataNode:
1715 DataNode
1869 NodeManager
1556 JournalNode
1142 QuorumPeerMain
2179 Jps

Once the cluster is up, check the processes with jps, then kill -9 the active NameNode. hdfs haadmin -DFSHAAdmin -getServiceState nn1 will show that the formerly standby NameNode has become active, and reads and other operations keep working. To bring the killed NameNode back, run sbin/hadoop-daemon.sh start namenode.
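The failover test above can also be driven from the command line. A rough sketch, assuming the nn1/nn2 service IDs defined in hdfs-site.xml and that the kill is issued on the host of the currently active NameNode:

  # Which NameNode is currently active?
  hdfs haadmin -getServiceState nn1
  hdfs haadmin -getServiceState nn2

  # Simulate a crash of the active NameNode (run on its host)
  kill -9 $(jps | awk '/ NameNode$/ {print $1}')

  # After a few seconds the ZKFC should have promoted the other NameNode
  hdfs haadmin -getServiceState nn2    # expect: active (if nn1 was killed)

  # Bring the killed NameNode back; it rejoins as standby
  sbin/hadoop-daemon.sh start namenode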
Resetting Hadoop HA from scratch (a combined script sketch follows these steps)

On the NameNode machine, stop HDFS and YARN:

stop-dfs.sh
stop-yarn.sh
Stop ZooKeeper on all machines:

zkServer.sh stop
Delete ZooKeeper's temporary data on all machines:

rm -rf /tmp/hadoop-root/zookeeper/version-2
Delete the JournalNode temporary files on all machines:

rm -rf /home/hadoop/hadoop-root/journal/node/local/data/*

Delete the NameNode and DataNode files on all machines:

rm -rf /home/hadoop/hadoop-root/dfs/name/*
rm -rf /home/hadoop/hadoop-root/dfs/data/*

Start ZooKeeper on all machines:

zkServer.sh start

Initialize the HA state in ZooKeeper; this only needs to be run on one NameNode:

hdfs zkfc -formatZK

Start the JournalNode by running the following command on every node:

hadoop-daemon.sh start journalnode

On the NameNode, run:

hadoop namenode -format

Start Hadoop:

start-dfs.sh
start-yarn.sh
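A combined sketch of this reset, looping over the three machines used in this post (172.37.0.201, 172.37.0.202, 172.37.0.203). It assumes passwordless ssh as root, that the Hadoop and ZooKeeper bin directories are on the remote PATH for non-interactive shells, and the same data paths as above; the host list and the NN1 variable are placeholders to adjust for your environment:

  #!/bin/bash
  HOSTS="172.37.0.201 172.37.0.202 172.37.0.203"
  NN1=172.37.0.202   # NameNode where zkfc -formatZK and namenode -format are run

  # Stop HDFS/YARN on the NameNode and ZooKeeper everywhere
  ssh $NN1 'stop-yarn.sh; stop-dfs.sh'
  for h in $HOSTS; do ssh $h 'zkServer.sh stop'; done

  # Wipe ZooKeeper, JournalNode, NameNode and DataNode data
  for h in $HOSTS; do
    ssh $h 'rm -rf /tmp/hadoop-root/zookeeper/version-2 \
                /home/hadoop/hadoop-root/journal/node/local/data/* \
                /home/hadoop/hadoop-root/dfs/name/* \
                /home/hadoop/hadoop-root/dfs/data/*'
  done

  # Restart ZooKeeper, reformat the HA state, restart the JournalNodes,
  # reformat HDFS and bring the cluster back up
  for h in $HOSTS; do ssh $h 'zkServer.sh start'; done
  ssh $NN1 'hdfs zkfc -formatZK'
  for h in $HOSTS; do ssh $h 'hadoop-daemon.sh start journalnode'; done
  ssh $NN1 'hadoop namenode -format && start-dfs.sh && start-yarn.sh'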
Check the NameNode status pages:

http://172.37.0.201:50070/dfshealth.jsp
http://172.37.0.202:50070/dfshealth.jsp
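For a command-line check instead of the web UI, the NameNode's JMX servlet on port 50070 can be queried; this sketch assumes the NameNodeStatus MBean is exposed by this Hadoop version (its JSON response includes a State field reporting active or standby):

  curl -s 'http://172.37.0.201:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
  curl -s 'http://172.37.0.202:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'

  # or simply query the HA admin interface:
  hdfs haadmin -getServiceState nn1
  hdfs haadmin -getServiceState nn2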