<p><font class="Apple-style-span" color="#000080" size="2"><b>一、XenServer安裝</b></font></p> <p><font class="Apple-style-span" size="2">1.搭建完全的Hadoop分布式計(jì)算平臺(tái),至少需要2~3臺(tái)機(jī)器,這需要了解Hadoop的組成,從HDFS的角度包括NameNode(NN)和DataNode(DN),從Map/Reduce的角度包括JobTracker(JT)和TaskTracker(TT),其中NN和JT所在的主機(jī)稱為Master,可以分機(jī)器部署也可以部署在一臺(tái)機(jī)器上,除此之外的DN和TT稱為Slaves。如果是搭建單機(jī)環(huán)境,以上四部分也可以部署在同一臺(tái)機(jī)器上,因?yàn)槭稚嫌幸慌_(tái)4G內(nèi)存的機(jī)器,所以我們進(jìn)行完全分布式部署。</font></p> <p><font class="Apple-style-span" size="2">2.之所以選擇Xenserver,是因?yàn)楸萔mware Sphere更易于安裝配置,而且最重要的是免費(fèi),最新版是5.6,提供了XenMotion(相當(dāng)于VMotion)、Convert、存儲(chǔ)管理等高級(jí)功能,可惜免費(fèi)授權(quán)不提供XenServer的HA功能,用于實(shí)際業(yè)務(wù)系統(tǒng)缺少了一層保障。安裝光盤的ISO免費(fèi)下載,裸機(jī)直接安裝,然后在Windows上安裝XenCenter,使用XenCenter連接上裝有XenServer的服務(wù)器以后,需要先獲得免費(fèi)授權(quán),否則這臺(tái)服務(wù)器只能使用30天,點(diǎn)擊‘Tools’-‘License Manager’,在彈出的窗口中選中需要制作授權(quán)文件的XenServer,然后選‘Request Activation Keys...’,會(huì)彈出網(wǎng)頁,要求輸入一些信息,提交后會(huì)有包含授權(quán)文件的郵件發(fā)到郵箱里,還在在‘License Manager’窗口中,選‘Apply Activation Keys...’,選擇授權(quán)文件,這樣,XenServer就可以使用一年了。最后的效果如下圖:</font></p> <p><a href="http://blog.chinaunix.nethttp://blog.chinaunix.net/attachment/201105/15/93477_1305468326hAw0.png" target="_blank" target="_blank"><font class="Apple-style-span" size="2"><img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image_thumb[2]" border="0" alt="image_thumb[2]" src="http://blog.chinaunix.nethttp://blog.chinaunix.net/attachment/201105/15/93477_13054683287pGg.png" width="688" height="309"></font></a></p> <p><font class="Apple-style-span" size="2">3.開始在XenServer中安裝虛擬機(jī),只要先裝一臺(tái),其他的機(jī)器可有由模板生成,我習(xí)慣使用CentOS 5.5 X86_64,安裝過程跟VMware沒什么區(qū)別,你可以用你所知道的方法把ISO文件掛到XenCenter上,包括NFS,CIFS,ISCSI等等,當(dāng)然也可以直接用光盤:)</font></p> <blockquote> <p><font class="Apple-style-span" size="2">安裝完成以后,關(guān)閉虛擬機(jī),點(diǎn)擊右鍵‘Convert to Template’,然后從模板中生成3個(gè)虛擬機(jī)實(shí)例,啟動(dòng)以后配置相應(yīng)的IP地址和主機(jī)名,這三臺(tái)機(jī)器要能互相解析:</font></p> <p><font class="Apple-style-span" size="2">210.45.176.49 hadoop1.ahau.edu.cn hadoop1 NameNode和JobTracker Master主機(jī)</font></p> <p><font class="Apple-style-span" size="2">210.45.176.50 hadoop2.ahau.edu.cn hadoop2 DataNode和TaskTracker Slave主機(jī)</font></p> <p><font class="Apple-style-span" size="2">210.45.176.46 hadoop3.ahau.edu.cn hadoop3 DataNode和TaskTracker Slave主機(jī)</font></p> </blockquote> <p><b><font class="Apple-style-span" color="#000080" size="2">二、配置SSH、JAVA</font></b></p> <p><font class="Apple-style-span" size="2">4.在三臺(tái)機(jī)器上增加用戶grid,用于Hadoop的配置和運(yùn)行,并且都配置互相SSH 免密碼登錄,分別制作一對(duì)ssh密鑰,以hadoop1上的命令為例</font></p> <blockquote> <p><font class="Apple-style-span" size="2">$ssh-keygen –t rsa ##生成ssh密鑰對(duì)</font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">$ssh-copy-id –i ~/.ssh/id_rsa.pub <a href="mailto:grid@hadoop2" target="_blank" target="_blank">grid@hadoop2</a> ##把自己的公鑰分別加到其他機(jī)器的authorized_keys文件中</font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">互相加完以后不要忘記把自己的公鑰也加到authorized_keys里,否則啟動(dòng)Hadoop時(shí)會(huì)有提示,很討厭</font></p> <p><font class="Apple-style-span" size="2">$cat ~/.ssh/id_rsa.pub >> authorized_keys</font></p> </blockquote> <p><font class="Apple-style-span" 
size="2">5.在三臺(tái)機(jī)器上安裝JAVA環(huán)境,從Oracle的網(wǎng)站上下載最新的jdk,jdk-6u25-linux-x64.bin,在Hadoop1上安裝,安裝路徑為/usr/local/jdk1.6.0_25,三臺(tái)機(jī)器的安裝路徑最好一致,方便以后配置</font></p> <p><b><font class="Apple-style-span" color="#000080" size="2">三、配置Hadoop</font></b></p> <p><font class="Apple-style-span" size="2">6.在Hadoop1上下載Hadoop,穩(wěn)定版為0.20.203,解壓到/home/grid/hadoop目錄下,修改conf/hadoop-env.sh,至少要設(shè)置JAVA_HOME為JAVA的安裝路徑</font></p> <p><font class="Apple-style-span" size="2">7.Hadoop的配置文件被分為三個(gè),均在conf目錄下,core-site.xml,hdfs-site.xml和mapred-site.xml,這三個(gè)文件的配置示例在src/core/core-default.xml,src/hdfs/hdfs-default.xml,src/mapred/mapred-default.xml中,同時(shí)也是默認(rèn)配置,不要直接修改這三個(gè)目錄中的文件,如果需要修改將他們復(fù)制到conf目錄下的對(duì)應(yīng)文件后再修改</font></p> <p><font class="Apple-style-span" size="2">8.配置core-site.xml,添加如下行:</font></p> <p><font class="Apple-style-span" size="2"><configuration></font></p> <p><font class="Apple-style-span" size="2"><property> <br><name>hadoop.tmp.dir</name> <br><value>/home/grid/hadoop/tmp</value> ##設(shè)定Hadoop的臨時(shí)目錄 <br><description> </description> <br></property></font></p> <p><font class="Apple-style-span" size="2"><property> <br><name>fs.default.name</name> <br><value>hdfs://hadoop1.ahau.edu.cn:9100</value> ##設(shè)置文件系統(tǒng)的路徑 <br></property></font></p> <p><font class="Apple-style-span" size="2"></configuration> <br></font></p> <p><font class="Apple-style-span" size="2">9.配置hdfs-site.xml,添加如下行:</font></p> <p><font class="Apple-style-span" size="2"><configuration></font></p> <p><font class="Apple-style-span" size="2"><property> <br><name>dfs.relplication</name> ##HDFS的副本數(shù),默認(rèn)為3,如果DataNode的數(shù)量小于這個(gè)值會(huì)有問題 <br><value>2</value> <br></property></font></p> <p><font class="Apple-style-span" size="2"></configuration> <br></font></p> <p><font class="Apple-style-span" size="2">10.配置mapred-site.xml,添加如下行:</font></p> <p><font class="Apple-style-span" size="2"><configuration></font></p> <p><font class="Apple-style-span" size="2"><property> <br><name>mapred.job.tracker</name> <br><value>hadoop1.ahau.edu.cn:9200</value> ##設(shè)置MapReduce Job運(yùn)行的主機(jī)和端口 <br></property></font></p> <p><font class="Apple-style-span" size="2"></configuration> <br>11.以上為這三個(gè)文件最簡單的配置,其中hadoop.tmp.dir指定的目錄要在運(yùn)行Hadoop之前創(chuàng)建好,如果需要更進(jìn)一步的配置,可以參看src中的相應(yīng)文件</font></p> <p><font class="Apple-style-span" size="2">12.配置conf/masters和conf/slaves,增加主機(jī)名,一個(gè)一行</font></p> <blockquote> <p><font class="Apple-style-span" size="2">在conf/masters中添加Master的主機(jī)名:hadoop1.ahau.edu.cn</font></p> <p><font class="Apple-style-span" size="2">在conf/slaves中添加Slave的主機(jī)名:</font></p> <p><font class="Apple-style-span" size="2">hadoop2.ahau.edu.cn</font></p> <p><font class="Apple-style-span" size="2">hadoop3.ahau.edu.cn</font></p> </blockquote> <p><font class="Apple-style-span" size="2">13.將hadoop目錄拷貝到其他機(jī)器上,如果Java的安裝路徑不一樣,需要修改hadoop-env.sh文件</font></p> <blockquote> <p><font class="Apple-style-span" size="2">$scp –r hadoop <a href="mailto:grid@hadoop2:/home/grid" target="_blank" target="_blank">grid@hadoop2:/home/grid</a></font></p> </blockquote> <p><b><font class="Apple-style-span" color="#000080" size="2">四、運(yùn)行Hadoop</font></b></p> <p><font class="Apple-style-span" size="2">14.格式化分布式文件系統(tǒng)</font></p> <blockquote> <p><font class="Apple-style-span" size="2">$bin/hadoop namenode -format</font></p> </blockquote> <p><font class="Apple-style-span" size="2">15.啟動(dòng)Hadoop,最好在啟動(dòng)前檢查三臺(tái)主機(jī)的SELinux和Iptables是否關(guān)上,以免不必要的麻煩</font></p> <blockquote> <p><font 
class="Apple-style-span" size="2">在hadoop1的hadoop目錄中執(zhí)行:</font></p> <p><font class="Apple-style-span" size="2">$bin/start-all.sh </font></p> <p><font class="Apple-style-span" size="2">##啟動(dòng)所有進(jìn)程,腳本輸出會(huì)指出日志文件存放位置,從輸出可以看到先啟動(dòng)NameNode進(jìn)程,然后是DataNode,JobTracker,TaskTracker,Master會(huì)自動(dòng)啟動(dòng)Slave上的相關(guān)進(jìn)程,可以通過下面的命令檢查進(jìn)程的運(yùn)行情況</font></p> <p><font class="Apple-style-span" size="2">[grid@hadoop1 hadoop]$ /usr/local/jdk1.6.0_25/bin/jps <br>11905 NameNode <br>14863 DataNode <br>12036 SecondaryNameNode <br>12113 JobTracker <br>12421 Jps <br></font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">也可以分部啟動(dòng):</font></p> <p><font class="Apple-style-span" size="2">$bin/hadoop-daemon.sh start namenode ##啟動(dòng)NameNode</font></p> <p><font class="Apple-style-span" size="2">$bin/hadoop-daemon.sh start datanode</font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">查看文件系統(tǒng)的情況:</font></p> <p><font class="Apple-style-span" size="2">[grid@hadoop1 hadoop]$ bin/hadoop dfsadmin -report <br>Configured Capacity: 152406405120 (141.94 GB) <br>Present Capacity: 133594103808 (124.42 GB) <br>DFS Remaining: 133334999040 (124.18 GB) <br>DFS Used: 259104768 (247.1 MB) <br>DFS Used%: 0.19% <br>Under replicated blocks: 1 <br>Blocks with corrupt replicas: 0 <br>Missing blocks: 0</font></p> <p><font class="Apple-style-span" size="2">------------------------------------------------- <br>Datanodes available: 3 (3 total, 0 dead)</font></p> <p><font class="Apple-style-span" size="2">Name: 210.45.176.45:50010 <br>Decommission Status : Normal <br>Configured Capacity: 50802135040 (47.31 GB) <br>DFS Used: 86433792 (82.43 MB) <br>Non DFS Used: 6207848448 (5.78 GB) <br>DFS Remaining: 44507852800(41.45 GB) <br>DFS Used%: 0.17% <br>DFS Remaining%: 87.61% <br>Last contact: Sun May 15 21:32:42 CST 2011</font></p> <p> <font class="Apple-style-span" size="2"><br>Name: 210.45.176.50:50010 <br>Decommission Status : Normal <br>Configured Capacity: 50802135040 (47.31 GB) <br>DFS Used: 86335488 (82.34 MB) <br>Non DFS Used: 6420262912 (5.98 GB) <br>DFS Remaining: 44295536640(41.25 GB) <br>DFS Used%: 0.17% <br>DFS Remaining%: 87.19% <br>Last contact: Sun May 15 21:32:42 CST 2011</font></p> <p> <font class="Apple-style-span" size="2"><br>Name: 210.45.176.46:50010 <br>Decommission Status : Normal <br>Configured Capacity: 50802135040 (47.31 GB) <br>DFS Used: 86335488 (82.34 MB) <br>Non DFS Used: 6184189952 (5.76 GB) <br>DFS Remaining: 44531609600(41.47 GB) <br>DFS Used%: 0.17% <br>DFS Remaining%: 87.66% <br>Last contact: Sun May 15 21:32:42 CST 2011 <br></font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">可以通過<a href="http://hadoop1.ahau.edu.cn:50070" target="_blank" target="_blank">http://hadoop1.ahau.edu.cn:50070</a>查看HDFS的情況,通過<a href="http://hadoop1.ahau.edu.cn:50030" target="_blank" target="_blank">http://hadoop1.ahau.edu.cn:50030</a> 查看MapReduce的情況</font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">以下是一些常用的命令:</font></p> <p><font class="Apple-style-span" size="2">hadoop fs –ls 查看/usr/root目錄下的內(nèi)容,默認(rèn)如果不填路徑這就是當(dāng)前用戶路徑; <br>hadoop fs –rmr xxx xxx就是刪除目錄; <br>hadoop dfsadmin -report 這個(gè)命令可以全局的查看DataNode的情況; <br>hadoop job -list 后面增加參數(shù)是對(duì)于當(dāng)前運(yùn)行的Job的操作,例如list,kill等; <br>hadoop balancer 均衡磁盤負(fù)載的命令。</font></p> </blockquote> <p><font class="Apple-style-span" size="2">16.測試Hadoop</font></p> <blockquote> <p><font class="Apple-style-span" 
size="2">將輸入文件拷貝到分布式文件系統(tǒng): <br>$ bin/hadoop fs -mkdir input <br>$ bin/hadoop fs -put conf/core-site.xml input </font></p> <p><font class="Apple-style-span" size="2">運(yùn)行發(fā)行版提供的示例程序: <br>$ bin/hadoop jar hadoop-examples-0.20.203.0.jar grep input output 'fs[a-z.]+'</font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">將輸出文件從分布式文件系統(tǒng)拷貝到本地文件系統(tǒng)查看: <br>$ bin/hadoop fs -get output output <br>$ cat output/*</font></p> <p><font class="Apple-style-span" size="2">或者</font></p> </blockquote> <blockquote> <p><font class="Apple-style-span" size="2">在分布式文件系統(tǒng)上查看輸出文件: <br>$ bin/hadoop fs -cat output/*</font></p> </blockquote> <p><font class="Apple-style-span" size="2">17.停止Hadoop</font></p> <blockquote> <p><font class="Apple-style-span" size="2">$bin/stop-all.sh</font></p> </blockquote> <p><font class="Apple-style-span" size="2">18.增加Slave節(jié)點(diǎn)hadoop4</font></p> <blockquote> <p><font class="Apple-style-span" size="2">只需要再新的機(jī)器上安裝java、配置ssh無密碼登錄,修改hadoop1上的slaves文件,增加hadoop4,然后把hadoop拷貝到hadoop4上,重新運(yùn)行bin/start-all.sh就可以了,非常方便,上面的bin/hadoop dfsadmin -report就是在增加了hadoop4以后的</font></p> </blockquote> <p><font class="Apple-style-span" size="2">至此基于XenServer的Hadoop分布式計(jì)算平臺(tái)就搭建完成了</font></p> |
With that, the XenServer-based Hadoop distributed computing platform is complete.