ABOUT THE COURSE
This course helps you understand the Spark program flow, basic Scala constructs, RDD operations, querying data with Spark SQL, and using Spark Streaming to initialize, transform, deploy and monitor applications.
COURSE OBJECTIVES
Upon successful completion of the course, the learner will be able to:
In this module, you will be able to understand the difference between the Spark and Hadoop frameworks, the key components of the Spark ecosystem, the Spark program flow, how to work with basic Scala constructs and how to build programs in Spark.
In this module, you will be able to understand how to create RDDs and perform operations on them, how to pass functions to Spark, how to perform transformations and actions on RDDs, how to work with key/value pairs and how to load and save data in various formats.
In this module, you will be able to understand the use of SchemaRDD in Spark programs, how to load and query data with Apache Hive and JSON support, how to run Spark SQL through the Spark SQL JDBC server, how to use Spark SQL UDFs and Hive UDFs and how to fine-tune Spark SQL performance.
In this module, you will be able to understand the Spark Streaming architecture and the concept of linking, how to initialize a StreamingContext, input DStreams and receivers, the various transformations on DStreams, and how to deploy and monitor Spark Streaming applications.
In this module, you will become aware of graphs and their computational features, GraphX and its use cases, and machine learning tools and their algorithms.