Tuesday, November 18, 2014

The Script Template For Checking On And Synchronizing Configuration Files To All Nodes In Hadoop

Here's the skeleton of the shell script template:
for i in $(cat $HADOOP_HOME/etc/hadoop/slaves | grep -v "#")
do
  #......
done

It takes advantage of the slaves file, in which all DataNodes are listed. To cover the whole cluster, all the other Hadoop nodes (the NameNode, for example) should be appended to the list as well.
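One way to append the extra nodes without editing the slaves file is to wrap the list-building step in a small function. The following is only a sketch: the function name `list_nodes` and the host name `namenode01` in the usage note are made up for illustration. It filters lines the same way as the template (any line containing `#` is dropped) and additionally skips blank lines.

```shell
# list_nodes SLAVES_FILE [EXTRA_HOST...]   (hypothetical helper, not part of Hadoop)
# Prints one hostname per line: every entry of the slaves file that does not
# contain '#', followed by any extra hosts passed as arguments.
list_nodes() {
  slaves_file=$1
  shift
  # Same filter as the template, plus removal of blank lines.
  grep -v '#' "$slaves_file" | grep -v '^[[:space:]]*$'
  # Extra hosts (e.g. the NameNode) that the slaves file does not list.
  for host in "$@"; do
    printf '%s\n' "$host"
  done
}
```

The loop header of the template would then become `for i in $(list_nodes "$HADOOP_HOME/etc/hadoop/slaves" namenode01)`, with the rest of the loop unchanged.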

There are two scenarios in which I commonly use the above template:

#1. Synchronizing Configuration Files

for i in $(cat $HADOOP_HOME/etc/hadoop/slaves | grep -v "#")
do
 echo '';
 echo $i;
 rsync -r --delete $HADOOP_HOME/etc/hadoop/ hadoop@$i:/home/supertool/hadoop-2.2.0/etc/hadoop/;
done

#2. Checking On Specific Processes

Every so often, we need to check whether specific processes have started or been killed on all relevant nodes after executing commands like `start-yarn.sh`, `hadoop-daemon.sh start datanode`, `yarn-daemon.sh stop nodemanager`, etc. The script is a real time-saver here.
for i in $(cat $HADOOP_HOME/etc/hadoop/slaves | grep -v "#")
do
  echo ''
  echo $i
  ssh supertool@$i "/usr/java/jdk1.7.0_11/bin/jps | grep -i NodeManager"
done
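When there are many nodes, scanning the raw `jps` output for each one gets tedious. The check can be reduced to one status line per host, roughly as sketched below. This is an illustration under assumptions, not a tested production script: `check_process`, `REMOTE_SHELL` and `JPS` are names invented here. `REMOTE_SHELL` defaults to `ssh` and exists only so the function can be tried with a stand-in command; `JPS` defaults to plain `jps` and may need to be set to a full path such as the one used above if `jps` is not on the remote PATH.

```shell
# check_process USER PROCESS_NAME HOST...   (hypothetical helper)
# Prints one line per host saying whether PROCESS_NAME shows up in jps output.
check_process() {
  user=$1
  proc=$2
  shift 2
  jps_cmd=${JPS:-jps}   # assumption: override with a full path when needed
  for host in "$@"; do
    # grep -q prints nothing; only its exit status is used.
    if ${REMOTE_SHELL:-ssh} "$user@$host" "$jps_cmd | grep -qi '$proc'"; then
      echo "$host: $proc running"
    else
      echo "$host: $proc NOT running"
    fi
  done
}
```

A call could look like `check_process supertool NodeManager $(cat $HADOOP_HOME/etc/hadoop/slaves | grep -v "#")`. Note that an unreachable host is also reported as NOT running, since `ssh` returns a non-zero status in that case too.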


© 2014-2017 jason4zhu.blogspot.com All Rights Reserved 
If reposting, please credit the source: Jason4Zhu
