The following preliminary steps must be performed on each machine before the cluster installation:
Set Up Password-less SSH
Verify SSH installation
The first step is to check whether SSH is installed on your nodes. We can easily do this with the “which” UNIX command:
[hadoop-user@master]$ which ssh
/usr/bin/ssh
[hadoop-user@master]$ which sshd
[hadoop-user@master]$ which ssh-keygen
/usr/bin/ssh-keygen
If you instead receive an error message such as this,
/usr/bin/which: no ssh in (/usr/bin:/bin:/usr/sbin…
install OpenSSH (www.openssh.com) via a Linux package manager or by downloading the source directly. (Better yet, have your system administrator do it for you.)
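The three checks above can also be run in one pass. The following is a minimal sketch, assuming a POSIX shell; the function name missing_tools is hypothetical and not part of OpenSSH. Keep in mind that sshd is often installed under /usr/sbin, which may not be on a regular user's PATH.

```shell
# missing_tools NAME...: print each command that cannot be found on PATH.
# (Hypothetical helper for illustration.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools ssh sshd ssh-keygen)
if [ -z "$missing" ]; then
  echo "all SSH tools found"
else
  echo "not found on PATH:" $missing
fi
```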
Generate SSH key pair
Having verified that SSH is correctly installed on all nodes of the cluster, we use ssh-keygen on the master node to generate an RSA key pair. Be certain to avoid entering a passphrase, or you’ll have to manually enter that phrase every time the master node attempts to access another node.
[hadoop-user@master]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop-user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop-user/.ssh/id_rsa.
Your public key has been saved in /home/hadoop-user/.ssh/id_rsa.pub.
After creating your key pair, your public key will be of the form
[hadoop-user@master]$ more /home/hadoop-user/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA1WS3RG8LrZH4zL2/1oYgkV1OmVclQ2OO5vRi0Nd K51Sy3wWpBVHx82F3x3ddoZQjBK3uvLMaDhXvncJG31JPfU7CTAfmtgINYv0kdUbDJq4TKG/fuO5q J9CqHV71thN2M310gcJ0Y9YCN6grmsiWb2iMcXpy2pqg8UM3ZKApyIPx99O1vREWm+4moFTg YwIl5be23ZCyxNjgZFWk5MRlT1p1TxB68jqNbPQtU7fIafS7Sasy7h4eyIy7cbLh8x0/V4/mcQsY 5dvReitNvFVte6onl8YdmnMpAh6nwCvog3UeWWJjVZTEBFkTZuV1i9HeYHxpm1wAzcnf7az78jT IRQ== hadoop-user@master
and we next need to distribute this public key across your cluster.
Distribute public key and validate logins
Albeit a bit tedious, you’ll next need to copy the public key to every slave node as well as the master node:
[hadoop-user@master]$ scp ~/.ssh/id_rsa.pub hadoop-user@target:~/master_key
Manually log in to the target node and set the master key as an authorized key (or append to the list of authorized keys if you have others defined).
[hadoop-user@target]$ mkdir ~/.ssh
[hadoop-user@target]$ chmod 700 ~/.ssh
[hadoop-user@target]$ mv ~/master_key ~/.ssh/authorized_keys
[hadoop-user@target]$ chmod 600 ~/.ssh/authorized_keys
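The mv command above replaces any existing authorized_keys file. If the target node already has other authorized keys, appending is the safer route. The following is a minimal sketch, assuming the key was copied to ~/master_key as shown earlier; install_key is a hypothetical helper name, not an OpenSSH command.

```shell
# install_key KEYFILE [SSH_DIR]: append a public key to authorized_keys
# without overwriting existing entries; skip it if it is already present.
# (Hypothetical helper for illustration.)
install_key() {
  key_file="$1"
  ssh_dir="${2:-$HOME/.ssh}"
  auth="$ssh_dir/authorized_keys"
  # Create the directory and file if needed, with the permissions sshd expects.
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
  touch "$auth" && chmod 600 "$auth" || return 1
  # Append only when no identical line exists yet.
  grep -qxF -- "$(cat "$key_file")" "$auth" || cat "$key_file" >> "$auth"
}
```

On the target node this would be called as install_key ~/master_key. Where it is available, the ssh-copy-id script distributed with OpenSSH performs a similar append from the master side.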
After installing the key, you can verify it’s correctly defined by attempting to log in to the target node from the master:
[hadoop-user@master]$ ssh target
The authenticity of host 'target (xxx.xxx.xxx.xxx)' can't be established.
RSA key fingerprint is 72:31:d8:1b:11:36:43:52:56:11:77:a4:ec:82:03:1d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'target' (RSA) to the list of known hosts.
Last login: Sun Jan 4 15:32:22 2009 from master
After confirming the authenticity of a target node to the master node, you won’t be prompted upon subsequent login attempts.
[hadoop-user@master]$ ssh target
Last login: Sun Jan 4 15:32:49 2009 from master
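With more than a handful of nodes, the login check can be repeated in a loop. The following is a rough sketch, assuming each node has already been logged in to once by hand so that its host key is known; check_nodes and the SSH variable are hypothetical names. BatchMode=yes makes ssh fail instead of prompting for a password, so a node that still asks for one is reported as failed.

```shell
SSH="${SSH:-ssh}"   # overridable, e.g. for a dry run

# check_nodes HOST...: report which hosts accept a non-interactive login.
# (Hypothetical helper for illustration.)
check_nodes() {
  failed=0
  for node in "$@"; do
    if $SSH -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
      echo "ok: $node"
    else
      echo "FAILED: $node"
      failed=1
    fi
  done
  return "$failed"
}
```

From the master this would be run as check_nodes target, with one argument per node in the cluster.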
NTP Installation and Configuration for CentOS
ntp : ntpd server which continuously adjusts system time and utilities used to query and configure the ntpd daemon.
ntpdate : Utility to set the date and time via NTP.
ntp-doc : NTP documentation
Procedure: Setup NTPD on CentOS Linux
Open a terminal or log in over an SSH session. You must log in as the root user. Type the following yum command to install ntp:
# yum install ntp ntpdate ntp-doc
Enable the service at boot:
# chkconfig ntpd on
Synchronize the system clock with a pool.ntp.org server (use this command only once, or as required):
# ntpdate pool.ntp.org
Start the NTP server. It will continuously adjust the system time from the upstream NTP servers, so there is no need to run ntpdate again:
# /etc/init.d/ntpd start
Configure ntpd (optional)
Edit /etc/ntp.conf, enter:
# vi /etc/ntp.conf
Set public servers from the pool.ntp.org project:
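The server entries themselves are not listed here. As an illustration only, the pool.ntp.org project publishes numbered pool hostnames such as 0.pool.ntp.org. The sketch below writes three such entries to a scratch file for review; the path /tmp/ntp-servers.conf and the NTP_SNIPPET variable are assumptions, and the lines would still need to be merged into /etc/ntp.conf by hand.

```shell
# Write example "server" directives to a scratch file, not to /etc/ntp.conf.
NTP_SNIPPET="${NTP_SNIPPET:-/tmp/ntp-servers.conf}"   # hypothetical scratch path
cat > "$NTP_SNIPPET" <<'EOF'
server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org
EOF
cat "$NTP_SNIPPET"
```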
The clocks of all the nodes in your cluster and the machine that runs the browser through which you access Ambari Web must be able to synchronize with each other.
Using a text editor, open the hosts file (/etc/hosts) on every host in your cluster.
Add a line for each host in your cluster. Each line should consist of the IP address and the FQDN.
Do not remove the following two lines from your hosts file, or various programs that require network functionality may fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
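To illustrate the "IP address and FQDN" line format, here is a minimal sketch that adds entries to a scratch copy of a hosts file. The addresses (from the 192.0.2.0/24 documentation range), the example.com hostnames, the path /tmp/hosts.example, and the helper name add_host_entry are all placeholders, not values from this setup.

```shell
# add_host_entry IP FQDN FILE: append "IP FQDN" unless FQDN is already listed.
# (Hypothetical helper for illustration.)
add_host_entry() {
  ip="$1"; fqdn="$2"; file="$3"
  touch "$file" || return 1
  # Look for the FQDN as an exact hostname field on any existing line.
  if awk -v h="$fqdn" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                       END { exit !found }' "$file"; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$fqdn" >> "$file"
}

HOSTS_FILE="${HOSTS_FILE:-/tmp/hosts.example}"   # scratch copy, not /etc/hosts
add_host_entry 192.0.2.10 master.example.com "$HOSTS_FILE"
add_host_entry 192.0.2.11 slave1.example.com "$HOSTS_FILE"
cat "$HOSTS_FILE"
```

Because the helper skips names that are already present, it can be rerun on every host without creating duplicate lines.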
Network Configuration File
# chkconfig iptables off
You can restart iptables after setup is complete.
Disable SELinux on all nodes
In /etc/selinux/config, change the value of SELINUX from enforcing to disabled:
◾ SELINUX=disabled
Check the current state using the command: sestatus
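You can also change the mode live, without a reboot. Note that setenforce switches between permissive and enforcing modes; it cannot fully disable SELinux:
◾ setenforce 0 (switch to permissive mode)
◾ setenforce 1 (switch to enforcing mode)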
Reboot the VM for the change to the SELINUX setting to take effect.
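Editing the SELINUX setting by hand on every node is easy to get wrong, so the change can be scripted. The following is a minimal sketch, assuming the setting lives on a line starting with SELINUX= in /etc/selinux/config; set_selinux_mode is a hypothetical helper name. It keeps a .bak copy of the file and leaves the SELINUXTYPE= line untouched.

```shell
# set_selinux_mode MODE [FILE]: rewrite the SELINUX= line in a config file.
# (Hypothetical helper for illustration; run as root against the real file.)
set_selinux_mode() {
  mode="$1"
  file="${2:-/etc/selinux/config}"
  case "$mode" in
    enforcing|permissive|disabled) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # Keep a backup, then write the edited copy back in place.
  cp "$file" "$file.bak" || return 1
  sed "s/^SELINUX=.*/SELINUX=$mode/" "$file.bak" > "$file"
}
```

As root, set_selinux_mode disabled followed by a reboot would apply the setting described above.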