Group Services: Technology Consulting
phone +91-9999-283-283/9540-283-283
email info@sisoft.in
Sisoft

Course Details

Course Outline for Big Data - Hadoop

Audience

This course is designed for learners who want to build a career in Big Data and Hadoop.

Pre-requisites:

Linux Shell Scripting, Java, SQL

Course Modules:

Module Name                      Course Content   Duration
1. Big Data Hadoop Foundation    Topics 1-9       30 hours (6 weeks)
2. Big Data Hadoop Professional  All Topics       60 hours (12 weeks)

Course Outline:

Hadoop Architecture

Module 1
  • Big Data Overview
Module 2
  • Hadoop Architecture
  • Hadoop Ecosystem Components
  • Hadoop Storage : HDFS
Module 3
  • Hadoop Processing : MapReduce Framework
  • Hadoop Server Roles
Module 4
  • NameNode, Secondary NameNode & DataNode
  • Anatomy of file read and write

Hadoop Cluster Configuration and Data Loading

Module 5
  • Hadoop Cluster Architecture
  • Hadoop Cluster Configuration files
  • Hadoop Cluster Mode
Module 6
  • Multi-Node Hadoop Cluster
  • A Typical Production Hadoop Cluster
  • MapReduce Job Execution
  • Common Hadoop Shell commands
Module 7
  • Data Loading Techniques: Flume, Sqoop
  • Hadoop Copy Commands

Hadoop MapReduce framework

Module 8
  • Hadoop Data Types
  • Hadoop MapReduce paradigm
  • Map and Reduce tasks
  • MapReduce Execution Framework
Module 9
  • Partitioners and Combiners
  • Input Formats (Input Splits and Records, Text Input, Binary Input, Multiple Inputs)
  • Output Formats (Text Output, Binary Output, Multiple Outputs)

Advanced MapReduce and YARN (MRv2)

Module 10
  • Custom Input Format
  • Error Handling
  • Tuning
  • Advanced MapReduce
  • Fair and Capacity Schedulers
Module 11
  • Hadoop 2.0 New Features
  • NameNode High Availability
  • HDFS Federation, YARN, etc.
  • Programming in YARN
  • Running MRv1 in YARN
  • Upgrading Existing Code to MRv2

Pig and Pig Latin

Module 12
  • Installing and Running Pig
  • Grunt
  • Pig's Data Model
  • Pig Latin
  • Developing & Testing Pig Latin Scripts
  • Writing Evaluation, Filter, Load & Store Functions

Hive

Module 13
  • Hive Architecture and Installation
  • Comparison with Traditional Database
  • HiveQL: Data Types, Operators and Functions
  • Hive Tables (Managed Tables and External Tables, Partitions and Buckets, Storage Formats, Importing Data, Altering Tables, Dropping Tables)
Module 14
  • MapReduce Scripts
  • Joins & Subqueries
  • Views
  • Map-Side and Reduce-Side Joins to Optimize Queries
  • User-Defined Functions
  • Appending Data into an Existing Hive Table
  • Custom Map/Reduce in Hive

Advanced Hadoop

Module 15
  • Introduction to HBase
  • Client APIs and their features
  • HBase Architecture
  • MapReduce Integration, Advanced Usage, Advanced Indexing, Coprocessors
Module 16
  • Introduction to Spark
  • Spark Advantages
  • Spark Architecture
Module 17
  • Why Oozie?
  • Installation of Oozie
  • Workflow Engine
  • Job Processing
  • Security
Module 18
  • The ZooKeeper Service: Data Model, Operations, Implementation, Consistency, Sessions, States