mapreduce - Hadoop uses one node to process data

- March 15, 2014

I have a Hadoop 2.6.4 setup with 1 master and 2 slaves. All of the nodes appear to be installed correctly, can communicate with each other, and can ssh to each other without passwords. I uploaded a 16 GB text file to the DFS and ran a simple modified WordCount example (code here) on it to test that the cluster is working.

hadoop jar test1.jar wordcount /user/text.txt /user/output

I ran the code on the master node and noticed that the master was doing all of the processing while the slaves sat idle (I monitored the CPU workload). I then ran the code on slave1 and noticed that the master and slave2 were idle while slave1 did all the work. Why is the processing done only on the node the job is submitted from? Is this related to the configuration of Hadoop, or am I misunderstanding something?

The configuration of the master:

core-site

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
</configuration>

mapred-site

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>master:54311</value>
    </property>
</configuration>

hdfs-site

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
    </property>
</configuration>

yarn-site

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8050</value>
    </property>
</configuration>

masters:

master

slaves:

slave1
slave2

slave1 configuration:

core-site (same as master)

mapred-site (same as master)

hdfs-site

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
    </property>
</configuration>

yarn-site (same as master)

slaves:

slave1
slave2
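
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the mapred-site shown above sets only mapred.job.tracker, which is a Hadoop 1 (JobTracker) property. In Hadoop 2.x, mapreduce.framework.name defaults to local, so a job can run entirely in a local runner on the machine that submits it, which would match the behaviour described. A minimal sketch of a mapred-site that hands jobs to YARN instead, to be placed on every node:

```xml
<configuration>
    <!-- Assumption: jobs should run on YARN rather than in the local runner.
         In Hadoop 2.x the default value of this property is "local". -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```

If the job then appears in the ResourceManager with containers on slave1 and slave2, the framework setting was the issue; if it does not, the cause lies elsewhere.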
