
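Hadoop Fully Distributed Installation
Environment
- Hosts: three CentOS 7 machines, each with two accounts (username/password): root/root and hd/hd
- Hadoop version: hadoop 2.8.0
- JDK version: Java 1.8.0
Configure the nodes
Set the host IPs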
Host     IP                Role
master   192.168.140.135   namenode
slave1   192.168.140.136   datanode
slave2   192.168.140.137   datanode
Permanently set the hostname on each machine, respectively:
hostnamectl set-hostname master
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
Edit /etc/hosts on every node and add the following entries:
192.168.140.135 master
192.168.140.136 slave1
192.168.140.137 slave2
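The three entries above have to be present on every node. As a minimal sketch (the helper name add_host_entry is hypothetical and not part of the original steps), the following appends an entry only when the hostname is not already mapped, so it can be re-run safely:

```shell
#!/bin/sh
# add_host_entry FILE IP NAME  (hypothetical helper, not from the original guide)
# Appends "IP NAME" to FILE unless NAME already appears as a hostname in it.
add_host_entry() {
    file=$1 ip=$2 name=$3
    # Scan non-comment lines; fields 2..NF are the hostnames for that address.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage on a real node (run as root):
#   add_host_entry /etc/hosts 192.168.140.135 master
#   add_host_entry /etc/hosts 192.168.140.136 slave1
#   add_host_entry /etc/hosts 192.168.140.137 slave2
```

Afterwards, ping -c 1 slave1 from master is a quick way to confirm that the name resolves.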
Check the Java version to confirm that Java is installed
[hd@master bigdata]$ java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
[hd@master bigdata]$
Turn off the firewall (firewalld)
Check the firewall status
[root@master ~]# firewall-cmd --state
running
Stop the service
[root@master ~]# systemctl stop firewalld.service
[root@master ~]# firewall-cmd --state
not running
Disable the firewall so that it does not start again at boot
[root@master ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
[root@master ~]# firewall-cmd --state
not running
hd is the user for the big data environment; grant the hd user sudo privileges
[root@master ~]# ll /etc/sudoers
-r-xr-----. 1 root root 3907 Nov 4 2016 /etc/sudoers
[root@master ~]# chmod u+w /etc/sudoers
[root@master ~]# vi /etc/sudoers
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
hd ALL=(ALL) NOPASSWD: ALL
[root@master ~]# chmod u-w /etc/sudoers
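An alternative to editing /etc/sudoers directly, not the method used above, is a drop-in file under /etc/sudoers.d. This sketch assumes the stock "#includedir /etc/sudoers.d" line is still present in /etc/sudoers; the helper name write_sudoers_dropin is hypothetical:

```shell
#!/bin/sh
# write_sudoers_dropin USER FILE  (hypothetical helper name)
# Writes a passwordless-sudo rule for USER into FILE and sets mode 0440.
write_sudoers_dropin() {
    user=$1 file=$2
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user" > "$file" && chmod 0440 "$file"
}

# Usage on a real node (run as root); visudo -cf checks the file's syntax:
#   write_sudoers_dropin hd /etc/sudoers.d/hd
#   visudo -cf /etc/sudoers.d/hd
```

With a drop-in file, the permissions of /etc/sudoers itself never have to be changed, and a syntax mistake is confined to one small file.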
Switch to the hd user and configure passwordless SSH login
ssh-keygen -t rsa -P ''
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
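The commands above authorize the key only on the machine where they are run, and copying over authorized_keys would discard any keys already listed there. A minimal sketch of an append-if-missing variant follows; the helper name install_pubkey is hypothetical:

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE SSH_DIR  (hypothetical helper name)
# Appends the public key to SSH_DIR/authorized_keys if it is not already
# listed, and applies the restrictive permissions sshd expects by default.
install_pubkey() {
    pub=$1 dir=$2
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    touch "$dir/authorized_keys"
    # -F fixed string, -x whole line: the key is matched literally.
    grep -qxF -e "$(cat "$pub")" "$dir/authorized_keys" ||
        cat "$pub" >> "$dir/authorized_keys"
    chmod 600 "$dir/authorized_keys"
}

# Usage on master as hd:
#   install_pubkey ~/.ssh/id_rsa.pub ~/.ssh
# To authorize the same key on the other nodes, ssh-copy-id does the
# equivalent remotely (it asks for the hd password once per host):
#   ssh-copy-id hd@slave1
#   ssh-copy-id hd@slave2
```

Once the key is on every node, ssh slave1 from master should log in without a password prompt.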
Configure the JDK
[hd@master bigdata]$ sudo vi /etc/profile
export JAVA_HOME=/usr/local/bigdata/jdk1.8
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
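Changes to /etc/profile only take effect in new login shells, or after the file is sourced again. A small sanity check for the configured JAVA_HOME might look like this; the helper name check_java_home is hypothetical:

```shell
#!/bin/sh
# check_java_home DIR  (hypothetical helper name)
# Succeeds only if DIR/bin/java exists and is executable.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java"
    else
        echo "missing or not executable: $1/bin/java" >&2
        return 1
    fi
}

# Usage after reloading the profile:
#   . /etc/profile
#   check_java_home "$JAVA_HOME" && java -version
```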
Configure Hadoop
Edit hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/local/bigdata/jdk1.8
Edit core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hd/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
Edit hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/hd/hadoop/dfs/name</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hd/hadoop/dfs/data</value>
    <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>Disable HDFS permission checking.</description>
  </property>
</configuration>
Edit mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:49001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/hd/hadoop/var</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
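The configuration files above refer to four local directories: tmp, dfs/name, dfs/data and var, all under /hd/hadoop. Because /hd sits at the filesystem root, the hd user most likely cannot create it without sudo. A minimal sketch for preparing the directories on each node (the helper name make_hadoop_dirs is hypothetical):

```shell
#!/bin/sh
# make_hadoop_dirs BASE  (hypothetical helper name)
# Creates the tmp, dfs/name, dfs/data and var directories under BASE.
make_hadoop_dirs() {
    base=$1
    for sub in tmp dfs/name dfs/data var; do
        mkdir -p "$base/$sub" || return 1
    done
}

# Usage on each node as hd (ownership by hd is an assumption of this sketch):
#   sudo mkdir -p /hd/hadoop && sudo chown hd /hd/hadoop
#   make_hadoop_dirs /hd/hadoop
```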
Edit the hadoop/etc/hadoop/slaves file so that it lists one worker hostname per line:
slave1
slave2
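The slaves file is read line by line, so each worker needs its own line. A small sketch for generating it from a list of hostnames (the helper name write_slaves is hypothetical, and HADOOP_HOME is assumed to point at the Hadoop installation directory):

```shell
#!/bin/sh
# write_slaves FILE HOST...  (hypothetical helper name)
# Overwrites FILE with each HOST on its own line.
write_slaves() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# Usage (HADOOP_HOME is an assumption, not defined in the original steps):
#   write_slaves "$HADOOP_HOME/etc/hadoop/slaves" slave1 slave2
```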
Edit yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
  </property>
  <property>
    <description>The http address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
  </property>
  <property>
    <description>The https address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.https.address</name>
    <value>${yarn.resourcemanager.hostname}:8090</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
  </property>
  <property>
    <description>The address of the RM admin interface.</description>
    <name>yarn.resourcemanager.admin.address</name>
    <value>${yarn.resourcemanager.hostname}:8033</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    <description>Maximum memory that can be requested for a single container, in MB (default 8192 MB).</description>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
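To see what the memory settings above imply, here is a rough back-of-the-envelope sketch. It assumes the YARN default minimum allocation of 1024 MB (yarn.scheduler.minimum-allocation-mb is not set in this guide); the function names are hypothetical:

```shell
#!/bin/sh
# max_containers NODE_MB MIN_ALLOC_MB
# Upper bound on containers per NodeManager when every container receives
# the minimum allocation (integer division).
max_containers() {
    echo $(( $1 / $2 ))
}

# vmem_limit_mb CONTAINER_MB RATIO
# Virtual-memory ceiling for one container: physical MB times the
# vmem-pmem ratio, truncated to a whole MB.
vmem_limit_mb() {
    awk -v m="$1" -v r="$2" 'BEGIN { printf "%d\n", m * r }'
}

max_containers 2048 1024   # 2048 from yarn-site.xml, 1024 assumed default
vmem_limit_mb 1024 2.1     # ratio from yarn-site.xml
```

With these values each NodeManager can host at most 2 minimum-size containers, and a 1024 MB container would be allowed about 2150 MB of virtual memory. Since yarn.nodemanager.vmem-check-enabled is set to false, that virtual-memory ceiling is not enforced in this setup.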