Hadoop setup in ten seconds flat

I recently needed to get Apache Hadoop set up quickly on my local machine. I had to hunt around for the correct ports, which I found here:

http://www.cloudera.com/blog/2009/08/hadoop-default-ports-quick-reference/

Installation involved downloading the Hadoop distribution and creating the right configuration files (core-site.xml, hdfs-site.xml and mapred-site.xml). It also involved formatting the namenode and starting everything up. It’s a pain to do manually, so I wrote a script to do the whole lot. A few caveats:

  • Set the JAVA_HOME location for your machine at the top of the script
  • Specify the distro name (hadoop-0.20.2) to match the Apache download link
  • Configure passwordless SSH for your local machine
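
For reference, the configuration and startup steps described above can be sketched roughly as follows. This is a minimal sketch rather than the script itself: the HADOOP_HOME default, the write_site helper, and the localhost:9000 and localhost:9001 addresses (the values used in the Hadoop 0.20 quick start for pseudo-distributed mode) are all assumptions, so adjust them to your own setup.

```shell
#!/usr/bin/env bash
# Sketch only: HADOOP_HOME, write_site and the ports below are assumptions,
# not taken from the original script.
HADOOP_HOME="${HADOOP_HOME:-$PWD/hadoop-0.20.2}"
CONF_DIR="$HADOOP_HOME/conf"
mkdir -p "$CONF_DIR"

# write_site FILE NAME VALUE: write a site file containing a single property.
write_site() {
  cat > "$CONF_DIR/$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_site core-site.xml   fs.default.name    hdfs://localhost:9000
write_site hdfs-site.xml   dfs.replication    1
write_site mapred-site.xml mapred.job.tracker localhost:9001

# Format the namenode and start the daemons only if a distribution
# has actually been unpacked at HADOOP_HOME.
if [ -x "$HADOOP_HOME/bin/hadoop" ]; then
  "$HADOOP_HOME/bin/hadoop" namenode -format
  "$HADOOP_HOME/bin/start-all.sh"
fi
```

Run it from the directory where you unpacked the download, or export HADOOP_HOME first. If no distribution is found, it only writes the three config files.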

You can configure passwordless SSH as follows:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa_for_hadoop
cat ~/.ssh/id_dsa_for_hadoop.pub >> ~/.ssh/authorized_keys
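
It can be worth checking that the key actually works before starting Hadoop. The helpers below are a rough sketch: the function names are hypothetical, and the key path simply mirrors the ssh-keygen command above. Note also that recent OpenSSH releases disable DSA keys by default, so on a newer machine you may need an RSA or Ed25519 key instead.

```shell
# Key path matches the ssh-keygen command above; the function names are hypothetical.
KEY="$HOME/.ssh/id_dsa_for_hadoop"

# sshd typically ignores authorized_keys when these permissions are too open.
fix_ssh_permissions() {
  chmod 700 "$HOME/.ssh" && chmod 600 "$HOME/.ssh/authorized_keys"
}

# Returns 0 only if key-based login to localhost succeeds without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless_ssh() {
  ssh -i "$KEY" -o BatchMode=yes -o ConnectTimeout=5 localhost true
}
```

For example: fix_ssh_permissions; check_passwordless_ssh && echo "passwordless SSH OK". Because the key file has a non-default name, a plain "ssh localhost" will only offer it if it is listed as an IdentityFile in ~/.ssh/config or loaded into ssh-agent.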

Hopefully that’ll save you a bunch of time! : )
