Hot-standby HA for the Hadoop NameNode using Heartbeat, Pacemaker and DRBD
Drawback: failover is not completely seamless. During a switchover there is a short window, no more than 30 seconds (and it could be made shorter), in which no service is available.
The two physical nodes and the floating service address, as listed in /etc/hosts:
10.24.1.48 nd8-rack2-cloud
10.24.1.47 nd7-rack2-cloud
10.24.1.7 nd-rack2-cloud
Hadoop configuration. Clients address the cluster by the floating hostname, and each NameNode-critical directory is listed twice: once on local disk and once under the DRBD mount /mnt/drbd_sim:
<property>
  <name>fs.default.name</name>
  <value>hdfs://nd-rack2-cloud:9000</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/data1/hha/hdfs/name,/mnt/drbd_sim/hdfs/name</value>
</property>
<property>
  <name>fs.checkpoint.dir</name>
  <value>/data1/hha/hdfs/namesecondary,/mnt/drbd_sim/hdfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary images to merge.
  If this is a comma-delimited list of directories then the image is
  replicated in all of the directories for redundancy.
  </description>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/data0/hha/hdfs/data</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/drbd_sim/mapreduce</value>
</property>
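Because dfs.name.dir and fs.checkpoint.dir are comma-delimited lists, Hadoop writes the image to every listed directory. A quick sanity check that the local and the DRBD-backed copies really are identical (paths as configured above):
# diff -r /data1/hha/hdfs/name /mnt/drbd_sim/hdfs/name && echo 'name dirs in sync'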
Heartbeat itself uses a stock configuration: /etc/ha.d/ha.cf lists both nodes (node nd7-rack2-cloud, node nd8-rack2-cloud), and /etc/ha.d/authkeys uses plain CRC authentication:
auth 1
1 crc
The Pacemaker resources (crm configure show):
node $id="38a09183-9fc0-4ca6-8f81-349d94930261" nd8-rack2-cloud \
attributes standby="off"
node $id="560cc9af-2fd7-4af3-ab51-d221f95d80f8" nd7-rack2-cloud \
attributes standby="off"
primitive drbd lsb:hadoop_name_dir_copy \
op monitor interval="10s"
primitive failover-ip ocf:heartbeat:IPaddr \
params ip="10.24.1.7" nic="bond0" cidr_netmask="255.255.255.0" \
op monitor interval="10s"
primitive hadoop_jobtracker lsb:hadoop_jobtracker \
op monitor interval="10s"
primitive hadoop_namenode lsb:hadoop_namenode \
op monitor interval="10s"
primitive hadoop_secondarynamenode lsb:hadoop_secondarynamenode \
op monitor interval="10s"
primitive ping ocf:pacemaker:ping \
params host_list="10.24.1.1" multiplier="100" \
op monitor interval="15s" timeout="60s" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="100s"
group hadoop failover-ip drbd hadoop_namenode hadoop_secondarynamenode hadoop_jobtracker
clone clone_ping ping \
meta globally-unique="false"
location location_hadoop hadoop \
rule $id="location_hadoop-rule" -inf: not_defined pingd or pingd lte 0
location location_hadoop_prior hadoop 100: nd7-rack2-cloud
property $id="cib-bootstrap-options" \
dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="Heartbeat" \
no-quorum-policy="ignore" \
start-failure-is-fatal="false" \
stonith-enabled="false"
rsc_defaults $id="rsc-options" \
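One way to load this configuration and rehearse a failover is the standard crm shell (hadoop-ha.crm is a hypothetical file holding the text above):
# crm configure load update hadoop-ha.crm
# crm node standby nd8-rack2-cloud    # force the hadoop group to move away
# crm_mon -1                          # watch it come up on the other node
# crm node online nd8-rack2-cloud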
2. Implement LSB init scripts for the three Hadoop daemons plus one data-copy service. These scripts are not yet generic; on a different system every one of them would need changes:
# ll /etc/init.d/
…
hadoop_jobtracker
hadoop_namenode
hadoop_secondarynamenode
hadoop_name_dir_copy
…
######################################################################
#!/bin/bash
# /etc/init.d/hadoop_name_dir_copy
# start: promote DRBD, mount it, and copy the replicated image into
#        the local dfs.name.dir before the namenode comes up.
# stop:  unmount, demote, and resync from the new primary.
RETVAL=0
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
[ -f /etc/sysconfig/network ] && . /etc/sysconfig/network
######################################################################
# See how we were called.
case "$1" in
  start)
    su - root -c "/sbin/drbdadm primary all"
    su - root -c "mount /dev/drbd0 /mnt/drbd_sim"
    su - root -c "sleep 1"
    su - root -c "nohup /sbin/drbd_connect.sh >/dev/null 2>&1 &"
    su - root -c "chmod -R 777 /mnt/drbd_sim/"
    su - hha -c "mkdir -p /data1/hha/hdfs/name"
    # \cp bypasses any cp alias so -i can never prompt
    su - hha -c "\cp -rf /mnt/drbd_sim/* /data1/hha/hdfs/name/"
    echo $'"Start" command: [OK]!'
    RETVAL=0
    ;;
  stop)
    su - root -c "sleep 2"
    su - root -c "umount /mnt/drbd_sim"
    su - root -c "sleep 1"
    su - root -c "/sbin/drbdadm disconnect all"
    su - root -c "/sbin/drbdadm secondary all"
    su - root -c "/sbin/drbdadm -- --discard-my-data connect all"
    ### retry "drbdadm connect all" until "drbdadm cstate all" returns Connected
    su - root -c "nohup /sbin/drbd_connect.sh >/dev/null 2>&1 &"
    echo $'"Stop" command: [OK]!'
    RETVAL=0
    ;;
  reload)
    echo $'"reload" command is not supported!'
    RETVAL=0
    ;;
  restart)
    echo $'"restart" command is not supported!'
    RETVAL=0
    ;;
  status)
    # NB: always answering 0 here means Pacemaker's monitor op can
    # never actually detect a failed service through this script.
    echo $'"status" command is not supported!'
    RETVAL=0
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart|reload|status|help}"
    exit 1
esac
exit $RETVAL
######################################################################
######################################################################
#!/bin/bash
# /etc/init.d/hadoop_namenode
RETVAL=0
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
[ -f /etc/sysconfig/network ] && . /etc/sysconfig/network
######################################################################
# See how we were called.
case "$1" in
  start)
    su - hha -c "/home/hha/hadoop/bin/hadoop-daemon.sh start namenode"
    echo $'"Start" command: [OK]!'
    RETVAL=0
    ;;
  stop)
    su - hha -c "/home/hha/hadoop/bin/hadoop-daemon.sh stop namenode"
    echo $'"Stop" command: [OK]!'
    RETVAL=0
    ;;
  reload)
    echo $'"reload" command is not supported!'
    RETVAL=0
    ;;
  restart)
    echo $'"restart" command is not supported!'
    RETVAL=0
    ;;
  status)
    echo $'"status" command is not supported!'
    RETVAL=0
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart|reload|status|help}"
    exit 1
esac
exit $RETVAL
######################################################################
######################################################################
#!/bin/bash
# /etc/init.d/hadoop_secondarynamenode
RETVAL=0
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
[ -f /etc/sysconfig/network ] && . /etc/sysconfig/network
######################################################################
# See how we were called.
case "$1" in
  start)
    su - hha -c "/home/hha/hadoop/bin/hadoop-daemon.sh start secondarynamenode"
    echo $'"Start" command: [OK]!'
    RETVAL=0
    ;;
  stop)
    su - hha -c "/home/hha/hadoop/bin/hadoop-daemon.sh stop secondarynamenode"
    echo $'"Stop" command: [OK]!'
    RETVAL=0
    ;;
  reload)
    echo $'"reload" command is not supported!'
    RETVAL=0
    ;;
  restart)
    echo $'"restart" command is not supported!'
    RETVAL=0
    ;;
  status)
    echo $'"status" command is not supported!'
    RETVAL=0
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart|reload|status|help}"
    exit 1
esac
exit $RETVAL
######################################################################
######################################################################
#!/bin/bash
# /etc/init.d/hadoop_jobtracker
RETVAL=0
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
[ -f /etc/sysconfig/network ] && . /etc/sysconfig/network
######################################################################
# See how we were called.
case "$1" in
  start)
    su - hha -c "/home/hha/hadoop/bin/hadoop-daemon.sh start jobtracker"
    echo $'"Start" command: [OK]!'
    RETVAL=0
    ;;
  stop)
    su - hha -c "/home/hha/hadoop/bin/hadoop-daemon.sh stop jobtracker"
    echo $'"Stop" command: [OK]!'
    RETVAL=0
    ;;
  reload)
    echo $'"reload" command is not supported!'
    RETVAL=0
    ;;
  restart)
    echo $'"restart" command is not supported!'
    RETVAL=0
    ;;
  status)
    echo $'"status" command is not supported!'
    RETVAL=0
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart|reload|status|help}"
    exit 1
esac
exit $RETVAL
######################################################################
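Before handing the scripts to Pacemaker, each one can be smoke-tested by hand on the active node, for example:
# service hadoop_namenode start
# jps          # a NameNode process should now be listed
# service hadoop_namenode stop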
#!/bin/bash
# /sbin/drbd_connect.sh
# usage: nohup /sbin/drbd_connect.sh >/dev/null 2>&1 &
# Keep retrying "drbdadm connect all" until the resource reports Connected.
while true ; do
  cstate=`/sbin/drbdadm cstate all`
  if [ "$cstate" != "Connected" ] ; then
    /sbin/drbdadm connect all
  else
    exit 0
  fi
  sleep 3
done
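To see the reconnect loop do its job, break the connection on purpose and watch the state heal (assuming the peer node is up; watch is just convenience):
# drbdadm disconnect all
# nohup /sbin/drbd_connect.sh >/dev/null 2>&1 &
# watch -n1 'cat /proc/drbd'    # cs: should return to Connected within seconds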
A second helper keeps Heartbeat stopped while the gateway is unreachable, so an isolated node gives up its resources:
#!/bin/bash
# /sbin/heartbeat_rc_local.sh <gateway-ip>
# add the following line to the "rc.local" file:
# nohup /sbin/heartbeat_rc_local.sh 10.24.1.1 >/dev/null 2>&1 &
while true ; do
  # field 6 of ping's summary is either the loss percentage or "+N" (errors)
  pingRtn=`/bin/ping $1 -c 3 -q | grep "packet loss" | awk '{ print $6}'`
  if [ "$pingRtn" = "100%" -o "$pingRtn" = "+3" -o "$pingRtn" = "" ] ; then
    service heartbeat stop
  else
    service heartbeat start
  fi
  sleep 10
done
Should the DRBD resource ever need re-initializing, detach it first (# drbdadm detach r0) before zeroing the backing device and re-running drbdadm create-md r0 as shown below.
Installation: DRBD is tightly coupled to the kernel, so it is best installed with yum install, or from RPM packages built for the matching kernel (e.g. drbd-8.0.16-5.el5.centos.x86_64.rpm and kmod-drbd-8.0.16-5.el5_3.x86_64.rpm):
# yum install kmod-drbd
Simulate a partition with a loopback device. In production a real partition must be used instead, otherwise the device disappears on reboot:
# dd if=/dev/zero of=drbd_sim bs=4k count=5000
# losetup /dev/loop0 drbd_sim
# mkfs.ext3 -F /dev/loop0
# mkdir /mnt/drbd_sim
# mount -t ext3 /dev/loop0 /mnt/drbd_sim
# umount /mnt/drbd_sim
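If the loopback simulation does need to survive a reboot during testing, the losetup call can be replayed from rc.local; the absolute path /root/drbd_sim below is an assumption about where the backing file was created:
# echo 'losetup /dev/loop0 /root/drbd_sim' >> /etc/rc.d/rc.local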
Configure DRBD. The syntax below is for 8.0.16; the configuration file format is likely to change in later versions:
# vi /etc/drbd.conf
resource r0 {
protocol C;
on nd7-rack2-cloud {
device /dev/drbd0;
disk /dev/loop0;
meta-disk internal;
address 10.24.1.47:7789;
}
on nd8-rack2-cloud {
device /dev/drbd0;
disk /dev/loop0;
meta-disk internal;
address 10.24.1.48:7789;
}
}
# dd if=/dev/zero of=/dev/loop0
# drbdadm create-md r0
# service drbd start
On one of the two nodes, run:
# drbdsetup /dev/drbd0 primary -o
# cat /proc/drbd
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:19960 nr:0 dw:0 dr:19960 al:0 bm:2 lo:0 pe:0 ua:0 ap:0
resync: used:0/61 hits:1246 misses:2 starving:0 dirty:0 changed:2
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
# mkfs.ext3 -F /dev/drbd0
# mount -t ext3 /dev/drbd0 /mnt/drbd_sim
After every switchover, drbdadm primary all must be executed on the new active node, otherwise both sides may end up as Secondary.
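A manual role swap is a quick way to confirm that data written on one node really surfaces on the other (resource r0 as configured above). On the current primary:
# echo probe > /mnt/drbd_sim/probe
# umount /mnt/drbd_sim
# drbdadm secondary r0
Then on the peer:
# drbdadm primary r0
# mount /dev/drbd0 /mnt/drbd_sim
# cat /mnt/drbd_sim/probe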
The first synchronization is very slow, only about 350KB/s, and the entire device has to be synced once, so it takes a very long time.
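The sync rate can be raised in the resource definition; the 30M below is an arbitrary example and should stay below what the replication link and disks can sustain:
resource r0 {
  syncer {
    rate 30M;
  }
  ...
}
# or at run time: drbdsetup /dev/drbd0 syncer -r 30M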
We originally planned to use NFS, but testing showed it does not work: when one side loses its network, Hadoop cannot start on the other side.
Configuration on one side:
# mkdir /data-nfs
# chmod 777 /data-nfs
# vi /etc/exports
/data-nfs PeerIP(rw,sync,no_root_squash,anonuid=65534,anongid=65534)
# service nfs start
Configuration on the other side:
# mkdir /mnt/hdfs-nfs
# mount -t nfs -o hard,intr,bg,timeo=50 PeerIP:/data-nfs /mnt/hdfs-nfs
# vi /etc/fstab
PeerIP:/data-nfs /mnt/hdfs-nfs nfs hard,intr,bg,timeo=50 0 0
Then mirror the configuration in the opposite direction, so that each node both exports its share and mounts the peer's.
5. Startup:
On the master host:
# service heartbeat start
# su - hha -c "/home/hha/hadoop/bin/start-all.sh"
heartbeat must be started first (this already brings up the namenode/secondarynamenode/jobtracker), and only then start-all.sh, which starts the remaining daemons such as the datanodes and tasktrackers.
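If that ordering needs to be scripted, a small guard can wait until the NameNode answers before launching the workers (this wrapper is a sketch, not part of the original setup):
#!/bin/bash
service heartbeat start
# block until the namenode behind the floating address accepts requests
until su - hha -c "/home/hha/hadoop/bin/hadoop fs -ls /" >/dev/null 2>&1 ; do
  sleep 5
done
su - hha -c "/home/hha/hadoop/bin/start-all.sh"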
On the standby host:
# service heartbeat start
Both machines take a little while to come up, an estimated two minutes at most.
Verify on the master:
# cat /proc/drbd
GIT-hash: d30881451c988619e243d6294a899139eed1183d build by mockbuild@v20z-x86-64.home.local, 2009-08-22 13:26:57
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:122 nr:0 dw:28942 dr:212787 al:5 bm:99 lo:0 pe:0 ua:0 ap:0
resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:8807 misses:5 starving:0 dirty:0 changed:5
# ifconfig
inet addr:10.24.1.7 Bcast:10.24.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
# jps
25403 SecondaryNameNode
28125 Jps
25635 JobTracker
# crm_mon -1
============
Last updated: Fri Nov 11 09:18:54 2011
Stack: Heartbeat
Current DC: nd8-rack2-cloud (38a09183-9fc0-4ca6-8f81-349d94930261) - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, unknown expected votes
2 Resources configured.
============
Online: [ nd8-rack2-cloud nd7-rack2-cloud ]
Resource Group: hadoop
failover-ip (ocf::heartbeat:IPaddr): Started nd8-rack2-cloud
drbd (lsb:hadoop_name_dir_copy): Started nd8-rack2-cloud
hadoop_namenode (lsb:hadoop_namenode): Started nd8-rack2-cloud
hadoop_secondarynamenode (lsb:hadoop_secondarynamenode): Started nd8-rack2-cloud
hadoop_jobtracker (lsb:hadoop_jobtracker): Started nd8-rack2-cloud
Clone Set: clone_ping [ping]
Started: [ nd7-rack2-cloud nd8-rack2-cloud ]
And on the standby:
# cat /proc/drbd
GIT-hash: d30881451c988619e243d6294a899139eed1183d build by mockbuild@v20z-x86-64.home.local, 2009-08-22 13:26:57
0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
ns:0 nr:177 dw:45003 dr:111904 al:5 bm:105 lo:0 pe:0 ua:0 ap:0
resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/127 hits:9363 misses:5 starving:0 dirty:0 changed:5
A triggered switchover completes in about 30 seconds: the original slave becomes the master, and once its network comes back, the original master rejoins as the slave.
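A simple drill reproduces this: kill Heartbeat on the active node and time how long the filesystem is unreachable through the floating name (the polling one-liner is just a sketch). On the active node:
# service heartbeat stop      # or pull the network cable
On a machine with the Hadoop client installed:
# time sh -c 'until /home/hha/hadoop/bin/hadoop fs -ls / >/dev/null 2>&1; do sleep 1; done'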