Introduction to Hadoop

A hands-on workout in Hadoop, MapReduce and the art of thinking "parallel"

What's Inside

Course Description

Taught by a four-person team that includes two Stanford-educated ex-Googlers and two ex-Flipkart Lead Analysts. The team has decades of practical experience working with Java and with billions of rows of data.

This course covers the individual components of Hadoop in great detail, and also gives you a higher-level picture of how they interact with each other. The course gets you hands-on with Hadoop very early on. You'll learn how to set up your own cluster using both VMs and the cloud.

What's Covered:

  • Build your Hadoop cluster:
    • Install Hadoop in Standalone, Pseudo-Distributed and Fully Distributed modes
    • Set up a Hadoop cluster using Linux VMs
    • Set up a cloud Hadoop cluster on AWS with Cloudera Manager
    • Understand HDFS, MapReduce and YARN and their interaction
  • MapReduce: Understand how to set up and run a MapReduce job in Hadoop
  • HDFS & YARN: NameNode, DataNode, ResourceManager, NodeManager, the anatomy of a MapReduce application, YARN scheduling, and configuring HDFS and YARN to tune your cluster's performance

Mail us about anything - anything! - and we will always reply :-)

What are the requirements?

  • You'll need an IDE where you can write Java code or open the source code that's shared. IntelliJ and Eclipse are both great options.
  • You'll need some background in object-oriented programming, preferably in Java. All the source code is in Java, and we dive right in without covering objects, classes, etc.
  • A bit of exposure to Linux/Unix shells would be helpful, but it won't be a blocker.

What am I going to get from this course?

  • Set up your own mini-Hadoop cluster self-sufficiently, whether it's a single node, a physical cluster or in the cloud
  • Understand HDFS, MapReduce and YARN and how they interact with each other
  • Understand the basics of performance tuning and managing your own cluster

What is the target audience?

  • Yep! Analysts who want to leverage the power of HDFS where traditional databases don't cut it anymore
  • Yep! Engineers who want to develop complex distributed computing applications to process lots of data
  • Yep! Data Scientists who want to add MapReduce to their bag of tricks for processing data

Certificate Available
438+ Students
30 Lectures
5+ Hours of Video
Lifetime Access
24/7 Support
Loonycorn

Loonycorn is made up of a couple of individuals, Janani Ravi and Vitthal Srinivasan, who honed their tech expertise at Google and Stanford. The team believes it has distilled the teaching of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.
