Big Data and Hadoop for Absolute Beginners

Big data refers to data sets that are too large or complex for traditional data-processing software to handle adequately. The term was originally associated with three key concepts: volume, variety, and velocity. This course covers core Big Data topics such as Hadoop, MapReduce, and YARN.

365 days course access

Live instructor-led online classes

Industry-based projects

Learn all the concepts related to Big Data and Hadoop in a simplified way.

E-box Job Assistant

Get noticed by the top hiring companies

Guide from “Amphi”

The Super teacher


  • 4 hours of lecture Videos
  • 73 hands-on practice exercises
  • 12 Assessment exercises
  • 210 knowledge based questions
  • 3 Live connect sessions (Master classes)
  • Lifetime access
Contact Us
+91 95669 33778

This course gives a detailed introduction to the tools used to process Big Data: storage in HDFS and retrieval using HBase, resource allocation with YARN, and customizing, testing, and debugging MapReduce.


Upon successful completion of the course, the learner will be able to:
  • Store and retrieve Big Data using HDFS and HBase.
  • Process Big Data with MapReduce, and test and debug MapReduce applications.
  • Allocate resources using YARN.

Course Content

Introduction to Big Data

In this module, you will learn about the elements of Big Data, the various types of Big Data, and the importance of structuring data. You will also learn about the use of Big Data across industries, career opportunities in Big Data, the significance of social media data in a business context, the application of Big Data to fraud management in the financial and insurance sectors, its application in the retail industry, and the concept of distributed computing in relation to Big Data.

  • 1 Video
  • 2 Hours
  • 40 Problems

Understanding the Hadoop 2 Ecosystem

In this module, you will learn about the various components of the Hadoop 2 ecosystem, the process of storing files in the Hadoop Distributed File System (HDFS), the role of Hadoop MapReduce, and the process of storing data with HBase. You will also learn how Hive aids in mining Big Data, the roles of ecosystem components such as ZooKeeper, Sqoop, Oozie, and Flume, the roles of map and reduce in MapReduce, techniques to optimize MapReduce tasks, the roles HBase and Hive play in processing Big Data, and some applications of MapReduce.

  • 1 Video
  • 2 Hours
  • 40 Problems
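
To make the roles of map and reduce concrete, here is a minimal sketch of a word count in plain Python. It does not use Hadoop; the function names (map_phase, shuffle, reduce_phase, word_count) are assumptions chosen for illustration, and the shuffle step only imitates the grouping that the framework performs between the two phases.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big ideas", "data moves fast"])
print(counts)
```

In a real cluster the map calls run in parallel on different blocks of the input, and each reducer receives only the keys assigned to it.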

Storing data in Hadoop 2 - HDFS and HBase

In this module, you will learn about the Hadoop Distributed File System (HDFS), how to work with HDFS files, the role of HDFS Federation, and the architecture and role of HBase. You will also learn the characteristics of HBase schema design, how to implement basic programming for HBase, and how to combine the best capabilities of HBase and HDFS for effective data storage.

  • 1 Video
  • 8 Hours
  • 75 Problems

Working with MapReduce on YARN

In this module, you will learn about the MapReduce 2 framework, the steps to build and execute a basic MapReduce program on YARN, various techniques for designing MapReduce implementations, the process of building joins with MapReduce, and techniques for building iterative MapReduce applications.

  • 1 Video
  • 7 Hours
  • 51 Problems
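
One common way to build a join with MapReduce is the reduce-side join: each mapper tags its records with their source, and the reducer pairs up records that share a key. The sketch below imitates that idea in plain Python rather than on Hadoop. The record layouts, the sample data and all function names are assumptions made for illustration, not material from the course.

```python
from collections import defaultdict


def map_customers(record):
    """Tag a (customer_id, name) record with its source and key it by customer_id."""
    customer_id, name = record
    return customer_id, ("customer", name)


def map_orders(record):
    """Tag an (order_id, customer_id, amount) record and key it by customer_id."""
    order_id, customer_id, amount = record
    return customer_id, ("order", (order_id, amount))


def reduce_join(customer_id, tagged_values):
    """Pair every customer name with every order that shares the same key."""
    names = [value for tag, value in tagged_values if tag == "customer"]
    orders = [value for tag, value in tagged_values if tag == "order"]
    return [(customer_id, name, order_id, amount)
            for name in names
            for order_id, amount in orders]


def join(customers, orders):
    """Group the tagged map output by key, then reduce each group."""
    groups = defaultdict(list)
    tagged = [map_customers(c) for c in customers] + [map_orders(o) for o in orders]
    for key, value in tagged:
        groups[key].append(value)
    rows = []
    for key in sorted(groups):
        rows.extend(reduce_join(key, groups[key]))
    return rows


customers = [(1, "Asha"), (2, "Ravi")]
orders = [(100, 1, 250), (101, 2, 90), (102, 1, 40)]
joined = join(customers, orders)
print(joined)
```

Because all values for a key meet in one reducer, this approach works for inputs of any size, at the cost of moving both data sets through the shuffle.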

Customizing MapReduce fundamentals

In this module, you will learn how to control MapReduce execution with InputFormat, read data with a custom RecordReader, organize output data with custom OutputFormats, write data with a custom RecordWriter, optimize MapReduce execution with a combiner, and control reducer execution with partitioners.

  • 1 Video
  • 3 Hours
  • 34 Problems
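
The effect of a combiner and a partitioner can be sketched in a few lines of plain Python. A combiner pre-aggregates map output locally so that less data crosses the network, and a partitioner decides which reducer receives each key. Hadoop's default HashPartitioner takes the key's hash code modulo the number of reduce tasks; the CRC32 hash used here is a stand-in assumption, as are the function names combine, partition and route.

```python
import zlib
from collections import Counter, defaultdict


def combine(pairs):
    """Combiner: sum the counts per key on the map side before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return sorted(local.items())


def partition(key, num_reducers):
    """Partitioner: map a key to a reducer index with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def route(pairs, num_reducers):
    """Send each (key, value) pair to the bucket of the reducer that owns the key."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return dict(buckets)


map_output = [("big", 1), ("data", 1), ("big", 1)]
combined = combine(map_output)      # three pairs shrink to two before the shuffle
buckets = route(combined, 2)        # each key lands on exactly one of two reducers
print(combined)
print(buckets)
```

A combiner is only safe when the reduce operation is associative and commutative, as summation is here.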

Testing and Debugging MapReduce Applications

In this module, you will perform unit testing of MapReduce applications using MRUnit, perform local testing of MapReduce applications, and use logging for Hadoop testing.

  • 1 Video
  • 2 Hours
  • 30 Problems

Working with YARN

In this module, you will learn about the advantages of YARN over MapReduce in Hadoop 1.0, the YARN ecosystem, the YARN architecture, key concepts of the YARN API, and how to schedule jobs with YARN.

  • 1 Video
  • 2 Hours
  • 25 Problems


About E-Box

E-Box is a Technology Enabled Active Learning and Assessment platform for technology and engineering domains. It goes beyond basic LMS components such as quizzes, assignments, and lessons.
