ABOUT THE COURSE
This course gives a detailed introduction to the tools used to process Big Data: storage in HDFS and retrieval using HBase, resource allocation by YARN, and customizing, testing and debugging MapReduce.
COURSE OBJECTIVES
Upon successful completion of the course, the learner will be able to:
In this module, you will learn about the elements of Big Data, the various types of Big Data and the importance of structuring data. You will also learn about the use of Big Data across industries, career opportunities in Big Data, the significance of social media data in a business context, the application of Big Data to fraud management in the financial and insurance sectors, the application of Big Data in the retail industry, and the concept of distributed computing in relation to Big Data.
In this module, you will learn about the components of the Hadoop 2 ecosystem, the process of storing files in the Hadoop Distributed File System (HDFS), the role of Hadoop MapReduce and the process of storing data with HBase. You will also learn how Hive aids in mining Big Data, the roles of ecosystem components such as ZooKeeper, Sqoop, Oozie and Flume, the roles of map and reduce in MapReduce, techniques to optimize MapReduce tasks, the roles HBase and Hive play in processing Big Data, and some applications of MapReduce.
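As a rough illustration of the map and reduce roles listed in this module, here is a minimal word-count sketch in plain Python. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only mimic the three stages in memory on a single machine.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would do
    between the map and reduce stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run the three stages in order over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count` on a small list of lines returns a dictionary of word totals. In a real cluster the map and reduce calls would run on different nodes and the grouping step would be handled by the framework.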
In this module, you will learn about the Hadoop Distributed File System (HDFS), how to work with HDFS files, the role of HDFS Federation, and the architecture and role of HBase. You will also learn the characteristics of HBase schema design, how to implement basic programming for HBase, and how to use the best capabilities of HBase and HDFS for effective data storage.
In this module, you will learn about the MapReduce 2 framework, the steps to build and execute a basic MapReduce program on YARN, various techniques for designing MapReduce implementations, the process of building joins with MapReduce and the techniques to build iterative MapReduce applications.
In this module, you will learn how to control MapReduce execution with InputFormat, read data with a custom RecordReader, organize output data with custom OutputFormats, write data with a custom RecordWriter, optimize MapReduce execution with a combiner, and control reducer execution with partitioners.
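The combiner and partitioner named in this module can be pictured with a small plain-Python sketch. This is an illustration under stated assumptions, not Hadoop code: `combine`, `partition` and `route` are hypothetical names, and the CRC32-based hash is simply a stable stand-in for whatever hash a real partitioner would use.

```python
import zlib
from collections import Counter

def combine(pairs):
    """Combiner: pre-sum counts on the map side so fewer pairs are shuffled."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def partition(key, num_reducers):
    """Partitioner: pick a reducer index from a stable hash of the key
    (CRC32 here is an assumption chosen for repeatability)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def route(pairs, num_reducers):
    """Send each combined pair to the reducer chosen by the partitioner."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in combine(pairs):
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets
```

The combiner shrinks the map output before it crosses the network, while the partitioner guarantees that every pair with the same key reaches the same reducer.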
In this module, you will perform unit testing of MapReduce applications using MRUnit, perform local testing of MapReduce applications and use logging for Hadoop testing.
In this module, you will learn about the advantages of YARN over MapReduce in Hadoop 1.0, the YARN ecosystem, the YARN architecture, key concepts of the YARN API and how to schedule jobs with YARN.
E-Box is a Technology Enabled Active Learning and Assessment platform for technology and engineering domains, apart from the basic LMS components like quizzes, assignments and lesson components.