Learning Hadoop (2): Installation and Deployment
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>dbrg-1:9000</value>
    <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>dbrg-1:9001</value>
    <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/dbrg/HadoopInstall/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/dbrg/HadoopInstall/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/dbrg/HadoopInstall/filesystem/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
  </property>
</configuration>
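Note that fs.default.name and mapred.job.tracker both refer to the master by hostname, so every node in the cluster must be able to resolve dbrg-1 (and dbrg-1 must be able to resolve the slaves). A quick sanity check, assuming the hostnames are mapped in /etc/hosts on each machine, is to ping each host once; repeat this on every node:
[dbrg@dbrg-1:~]$for h in dbrg-1 dbrg-2 dbrg-3; do ping -c 1 $h > /dev/null && echo "$h OK" || echo "$h unreachable"; done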
Deploying Hadoop
All of the Hadoop environment variables and configuration files described so far live on the machine dbrg-1. Now we need to deploy Hadoop to the other machines, keeping the directory structure identical.
[dbrg@dbrg-1:~]$scp -r /home/dbrg/HadoopInstall dbrg-2:/home/dbrg/
[dbrg@dbrg-1:~]$scp -r /home/dbrg/HadoopInstall dbrg-3:/home/dbrg/
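With more than a couple of slaves, copying to each one by hand gets tedious. A small loop does the same job; this is just a sketch using the two slave hostnames from this article:
[dbrg@dbrg-1:~]$for h in dbrg-2 dbrg-3; do scp -r /home/dbrg/HadoopInstall $h:/home/dbrg/; done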
At this point, Hadoop has been deployed to all of the machines. Now let's start it up.
Starting Hadoop
Before starting, we first need to format the namenode. Change into the ~/HadoopInstall/hadoop directory and run the following command:
[dbrg@dbrg-1:hadoop]$bin/hadoop namenode -format
Barring surprises, it should report that the format succeeded. If it does not, look at the log files under the hadoop/logs/ directory.
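Another way to confirm the format worked is to look at the directory we configured as dfs.name.dir above; after a successful format it should contain the namenode's metadata files (the exact layout varies across Hadoop versions):
[dbrg@dbrg-1:hadoop]$ls -l /home/dbrg/HadoopInstall/filesystem/name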
Now we can start Hadoop properly. There are a number of startup scripts under bin/, so you can start exactly what you need:
* start-all.sh starts all of the Hadoop daemons: namenode, datanodes, jobtracker, and tasktrackers
* stop-all.sh stops all of the Hadoop daemons
* start-mapred.sh starts the Map/Reduce daemons: jobtracker and tasktrackers
* stop-mapred.sh stops the Map/Reduce daemons
* start-dfs.sh starts the Hadoop DFS daemons: namenode and datanodes
* stop-dfs.sh stops the DFS daemons
Here we simply start all of the daemons:
[dbrg@dbrg-1:hadoop]$bin/start-all.sh
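A quick way to check that everything came up, assuming a JDK is installed on each node, is the JDK's jps tool, which lists the running Java processes. On dbrg-1 you would expect to see processes such as NameNode and JobTracker; on the slaves, DataNode and TaskTracker:
[dbrg@dbrg-1:hadoop]$jps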
Likewise, to stop Hadoop:
[dbrg@dbrg-1:hadoop]$bin/stop-all.sh
HDFS Operations
Running the hadoop command in the bin/ directory shows all of the operations Hadoop supports and their usage; here are a few simple examples.
Creating a directory
[dbrg@dbrg-1:hadoop]$bin/hadoop dfs -mkdir testdir
This creates a directory named testdir in HDFS.
Copying a file
[dbrg@dbrg-1:hadoop]$bin/hadoop dfs -put /home/dbrg/large.zip testfile.zip
This copies the local file large.zip into the user's HDFS home directory /user/dbrg/ under the name testfile.zip.
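The reverse direction works the same way: dfs -get copies a file out of HDFS back to the local filesystem. The local destination path below is just for illustration:
[dbrg@dbrg-1:hadoop]$bin/hadoop dfs -get testfile.zip /home/dbrg/testfile_copy.zip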
Listing files
[dbrg@dbrg-1:hadoop]$bin/hadoop dfs -ls
This lists the files under the current user's HDFS home directory, /user/dbrg.