Procedure for checking and configuring a Linux node
1. Check on hostname
Check whether the current hostname is exactly what it should be.
hostname
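If it does not match, it can usually be corrected as follows (a sketch; data-node-01 is a placeholder, and the persistent setting assumes a RHEL/CentOS 6 style system):
hostname data-node-01
# Persist across reboots on RHEL/CentOS 6 style systems
sed -i 's/^HOSTNAME=.*/HOSTNAME=data-node-01/' /etc/sysconfig/network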
2. Check on the type of file system
Make sure the file system type is the same as on all the other nodes in the Hadoop cluster.
df -T
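For instance, to look only at the data disks (assuming they are mounted under /home/data* as in the later steps):
df -T | grep /home/data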
3. Change owner of all data disks to the hadoop user
chown -R hadoop:hadoop /home/data*
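A quick way to confirm the ownership change:
ls -ld /home/data*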
4. Check and back up disk mount info
We can simply execute the output of mount.sh in case some mounted filesystems are lost after the node restarts.
su root

# mount.sh
n=1
for i in a b c d e f g h i j k l
do
    a=`/sbin/blkid -s UUID | grep ^/dev/sd$i | awk '{print $2}'`
    echo mount $a /home/data$n
    n=`echo $n+1|bc`
done

# Output of `bash mount.sh` on this node:
mount UUID="09c42017-9308-45c3-9509-e77a2e99c732" /home/data1
mount UUID="72461da2-b0c0-432a-9b65-0ac5bc5bc69a" /home/data2
mount UUID="6d447f43-b2db-4f69-a3b2-a4f69f2544ea" /home/data3
mount UUID="37ca4fb8-377c-493d-9a4c-825f1500ae52" /home/data4
mount UUID="53334c93-13ff-41f5-8688-07023bd6f11a" /home/data5
mount UUID="10fa31f7-9c29-4190-8ecd-ec893d59634c" /home/data6
mount UUID="fe28b8dd-ff3b-49d9-87c6-6eee9f389966" /home/data7
mount UUID="5201d24b-9310-4cff-b3ad-5b09e47780a5" /home/data8
mount UUID="d3b85455-8b94-4817-b43e-69481f9c13c4" /home/data9
mount UUID="6f2630f1-7cfe-4cac-b52d-557f46779539" /home/data10
mount UUID="bafc742d-1477-439a-ade4-29711c5db840" /home/data11
mount UUID="bf6e36d8-1410-4547-853c-f541c9a07e52" /home/data12
We can also append the output of mount.sh to /etc/rc.local, so that the mount commands are invoked automatically every time the node starts up.
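For example (a sketch, assuming mount.sh was saved under /root and the node actually runs /etc/rc.local at boot):
bash /root/mount.sh >> /etc/rc.local
# On some distributions rc.local must be executable to run at boot
chmod +x /etc/rc.local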
5. Check on the version of the operating system
lsb_release -a
6. Optimize TCP parameters in sysctl
Append the following content to /etc/sysctl.conf.
fs.file-max = 800000
net.core.rmem_default = 12697600
net.core.wmem_default = 12697600
net.core.rmem_max = 873800000
net.core.wmem_max = 655360000
net.ipv4.tcp_rmem = 8192 262144 4096000
net.ipv4.tcp_wmem = 4096 262144 4096000
net.ipv4.tcp_mem = 196608 262144 3145728
net.ipv4.tcp_max_orphans = 300000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.ip_local_port_range = 1025 65535
net.ipv4.tcp_max_syn_backlog = 100000
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 1500
net.core.somaxconn = 32768
vm.swappiness = 0
Issue the following commands to apply and double-check the previous settings.
/sbin/sysctl -p
# List all current parameters to double-check
/sbin/sysctl -a
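A single parameter can also be queried directly to spot-check the result, e.g.:
/sbin/sysctl net.core.somaxconn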
7. Max number of open files
Check the current limit on the number of open files:
ulimit -n
Append the following content to /etc/security/limits.conf to change it to 100000.
* soft nofile 100000
* hard nofile 100000
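The new limit only applies to new login sessions. Assuming pam_limits is enabled for su (the default on most distributions), a quick verification from root might be:
su - hadoop -c 'ulimit -n'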
8. Check and sync /etc/hosts
Copy the current /etc/hosts file from one of the existing nodes in the Hadoop cluster (the canonical node), append the newly-added node's host entry to this file, and synchronize it to all nodes.
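A minimal sketch of the synchronization, assuming a hypothetical nodes.txt that lists one hostname per line:
for h in $(cat nodes.txt)
do
    scp /etc/hosts root@$h:/etc/hosts
done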
9. Revise locale to en_US.UTF-8
Appending the following to /etc/profile will do the work.
export LANG=en_US.UTF-8
export LC_CTYPE=en_US.UTF-8
export LC_NUMERIC=en_US.UTF-8
export LC_TIME=en_US.UTF-8
export LC_COLLATE=en_US.UTF-8
export LC_MONETARY=en_US.UTF-8
export LC_MESSAGES=en_US.UTF-8
export LC_PAPER=en_US.UTF-8
export LC_NAME=en_US.UTF-8
export LC_ADDRESS=en_US.UTF-8
export LC_TELEPHONE=en_US.UTF-8
export LC_MEASUREMENT=en_US.UTF-8
export LC_IDENTIFICATION=en_US.UTF-8
export LC_ALL=en_US.UTF-8
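To confirm the change in the current shell:
source /etc/profile
locale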
10. Transfer .ssh directory to newly-added node
Since all nodes share one .ssh directory, passwordless SSH among all the nodes is simple to achieve: append id_rsa.pub to the current authorized_keys and distribute the .ssh directory to all nodes.
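A sketch of the idea, run from the canonical node (newnode is a placeholder for the new host; the first scp will still prompt for a password):
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp -r ~/.ssh hadoop@newnode:~/
ssh hadoop@newnode 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'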
11. Deploy Java & Hadoop environment
Copy the Java directory from the canonical node to the newly-added node, then append the following environment variables to /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_11
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
Likewise, do the same for the Hadoop directory:
export HADOOP_HOME=/home/workspace/hadoop
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native/
export PATH=$PATH:$HADOOP_HOME/bin
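After sourcing /etc/profile, a quick sanity check that both environments are visible:
source /etc/profile
java -version
hadoop version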
At the same time, mkdir and chmod the paths configured by the dfs.datanode.data.dir parameter in hdfs-site.xml:
su hadoop
for i in {1..12}
do
    mkdir -p /home/data$i/hdfsdir/data
    chmod -R 755 /home/data$i/hdfsdir
done
Lastly, append the newly-added host to $HADOOP_HOME/etc/hadoop/slaves and synchronize it to all nodes.
su hadoop
for i in $(cat $HADOOP_HOME/etc/hadoop/slaves | grep -v "#")
do
    echo ''
    echo $i
    scp $HADOOP_HOME/etc/hadoop/slaves hadoop@$i:/home/workspace/hadoop/etc/hadoop/
done
12. Open ports in iptables
This is well-explained in another post: Configure Firewall In Iptables For Hadoop Cluster.
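For quick reference, a minimal sketch on an iptables-based setup; the port numbers assume Hadoop 2.x defaults (check hdfs-site.xml and yarn-site.xml), and `service iptables save` applies to RHEL/CentOS style systems:
iptables -I INPUT -p tcp --dport 50010 -j ACCEPT   # DataNode data transfer
iptables -I INPUT -p tcp --dport 50020 -j ACCEPT   # DataNode IPC
iptables -I INPUT -p tcp --dport 50075 -j ACCEPT   # DataNode HTTP
iptables -I INPUT -p tcp --dport 8042 -j ACCEPT    # NodeManager web UI
service iptables save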
13. Launch Hadoop services
Eventually, we simply invoke the following commands to start the DataNode and NodeManager services.
su hadoop
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
After that, check whether the above two processes exist and look through the corresponding logs in the $HADOOP_HOME/logs directory as a double check.
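For example (jps ships with the JDK):
su hadoop
jps | grep -E 'DataNode|NodeManager'
ls -lt $HADOOP_HOME/logs | head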
The Python script that implements all of the procedures listed above can be found in my project on GitHub.
© 2014-2017 jason4zhu.blogspot.com All Rights Reserved
If reposting, please credit the origin: Jason4Zhu