Last week I attended an admin training about Hadoop, held by Cloudera at a comfortable, well-prepared venue in London. This three-day course covers several topics from the Hadoop ecosystem, packed into 500+ slides and a number of exercises. It ranges from historical background and the motivation for Hadoop, through an introduction to MapReduce and job scheduling, to planning, maintaining, and troubleshooting a Hadoop cluster. Additional tools such as Flume and Sqoop are covered in a separate chapter as well.
Even though the title suggests exclusively administration-related topics, from my point of view this training is more of a general introduction to Hadoop, conveying the “big picture” and its basic ideas. It is therefore not limited to system administrators: it suits developers, IT architects, and really anybody who wants to start diving into Hadoop. On the other hand, the training does not explain specific operations tasks in detail; it is rather a high-level view of the system, focused on Cloudera’s recommendations.
If you are already responsible for maintaining a Hadoop cluster, understand how the involved daemons relate to each other, have gathered some experience, tuned a few parameters, and run into trouble (and hopefully solved it afterwards), you will not benefit much from this training. In that case you are outside its target audience, so think twice before signing up.
To put it in a nutshell:
This course is perfectly suited for gaining a basic understanding of the concepts behind Hadoop. HDFS and MapReduce are explained very well. I benefited from it in that I now understand how the different daemons relate to each other, what each one is for, and what to do in case of a (Hadoop-related) emergency.
Now it’s time to gain experience and share it with the community.