Module – 1
Introduction to the Hadoop Distributed File System (HDFS)

Learning Objectives – In this module, you will understand what HDFS is, why it is required for running MapReduce, and how it differs from other distributed file systems. You will also get a basic idea of how data is read from and written to HDFS.

Topics – Design of HDFS, HDFS Concepts, The Command-Line Interface, Hadoop File Systems, Java Interface, Data Flow (Anatomy of a File Read, Anatomy of a File Write, Coherency Model), Parallel Copying with distcp, Hadoop Archives.
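The command-line interface, distcp and Hadoop Archives topics can be previewed with a few shell commands. This is only an illustrative sketch: the paths and host names are made up, and a running HDFS cluster is assumed.

```shell
# Basic HDFS shell usage (paths are hypothetical; requires a running cluster)
hadoop fs -mkdir -p /user/demo/input              # create a directory in HDFS
hadoop fs -put localfile.txt /user/demo/input     # write a local file into HDFS
hadoop fs -cat /user/demo/input/localfile.txt     # read it back

# Parallel copying between clusters with distcp
hadoop distcp hdfs://namenode1/user/demo hdfs://namenode2/user/demo

# Packing many small files into a Hadoop Archive (HAR)
hadoop archive -archiveName demo.har -p /user/demo input /user/demo/archived
```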


Module – 2
Setting Up a Hadoop Cluster

Learning Objectives – After this module, you will have a clear understanding of how to set up a Hadoop cluster and of the different configuration files that need to be edited for cluster setup.

Topics – Cluster Specification, Cluster Setup and Installation, SSH Configuration, Hadoop Configuration (Configuration Management, Environment Settings, Important Hadoop Daemon Properties, Hadoop Daemon Addresses and Ports, Other Hadoop Properties, User Account Creation), Security, Benchmarking a Hadoop Cluster.
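As a taste of the configuration-management topics, here is a minimal sketch of the two properties most setups touch first. The host name is an illustrative assumption; the property names are the standard Hadoop 2+ keys.

```xml
<!-- core-site.xml: where clients find the filesystem (host name is illustrative) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>

<!-- hdfs-site.xml: per-block replication factor (3 is the common default) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```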


Module – 3
Understanding MapReduce Basics and MapReduce Types and Formats

Learning Objectives – After this module, you will understand how the MapReduce framework works and why MapReduce is tightly coupled with HDFS. You will also learn what the different input and output formats are and why they are required.

Topics – Hadoop Data Types, Functional Programming Roots, Imperative vs Functional Programming, Concurrency and Lock-Free Data Structures, Functional – Concept of Mappers, Functional – Concept of Reducers, The Execution Framework (Scheduling, Data/Code Co-location, Synchronization, Error and Fault Handling), Functional – Concept of Partitioners, Functional – Concept of Combiners, Distributed File System, Hadoop Cluster Architecture, MapReduce Types, Input Formats (Input Splits and Records, Text Input, Binary Input, Multiple Inputs, Database Input and Output), Output Formats (Text Output, Binary Output, Multiple Outputs, Database Output).
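The map-shuffle-reduce data flow covered in this module can be mimicked locally with a plain Unix pipeline: tr acts as a mapper emitting one word per record, sort stands in for the shuffle bringing equal keys together, and uniq -c plays the reducer aggregating a count per key. This is only an analogy for the execution model, not Hadoop itself.

```shell
# "Map": emit one word per line; "shuffle": sort groups equal keys;
# "reduce": uniq -c aggregates a count per key.
printf 'hadoop stores data\nhadoop processes data\n' \
  | tr ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2, $1}' > wordcount.out
cat wordcount.out
```

The pipeline prints each distinct word with its count: data 2, hadoop 2, processes 1, stores 1.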



Module – 4
Pig

Learning Objectives – In this module you will learn what Pig is, which types of use cases Pig suits, and how Pig is tightly coupled with MapReduce, along with an example.

Topics – Installing and Running Pig, Grunt, Pig’s Data Model, Pig Latin, Developing & Testing Pig Latin Scripts, Making Pig Fly, Writing Evaluation, Filter, Load & Store Functions, Pig and Other Members of the Hadoop Community.
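A minimal Pig Latin sketch of the load-group-aggregate data-flow style this module covers; the input file name and schema are hypothetical.

```pig
-- Load, group and aggregate (input path and schema are illustrative)
logs    = LOAD 'access_log' AS (user:chararray, bytes:long);
grouped = GROUP logs BY user;
totals  = FOREACH grouped GENERATE group AS user, SUM(logs.bytes) AS total_bytes;
STORE totals INTO 'bytes_per_user';
```

Run locally for testing with `pig -x local script.pig`; on a cluster, each statement compiles down to MapReduce jobs.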



Module – 5
Hive

Learning Objectives – This module will provide you with a clear understanding of what Hive is, how you can load data into Hive, how you can query data from Hive, and so on.

Topics – Installing Hive, Running Hive (Configuring Hive, Hive Services, The Metastore), Comparison with Traditional Databases (Schema on Read Versus Schema on Write, Updates, Transactions and Indexes), HiveQL (Data Types, Operators and Functions), Tables (Managed Tables and External Tables, Partitions and Buckets, Storage Formats, Importing Data, Altering Tables, Dropping Tables), Querying Data (Sorting and Aggregating, MapReduce Scripts, Joins, Subqueries and Views, Map-Side and Reduce-Side Joins to Optimize Queries), User Defined Functions, Appending Data into an Existing Hive Table, Custom Map/Reduce in Hive.
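A short HiveQL sketch of the external-table and querying topics above; the table name, columns and HDFS location are illustrative assumptions, and a running Hive installation is required.

```sql
-- External table over files already in HDFS (schema on read; names are illustrative)
CREATE EXTERNAL TABLE page_views (user_id STRING, url STRING, view_time BIGINT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- An ordinary HiveQL aggregation; Hive compiles this into MapReduce jobs
SELECT user_id, COUNT(*) AS views
FROM page_views
GROUP BY user_id;
```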



Module – 6
HBase

Learning Objectives – In this module you will acquire in-depth knowledge of what HBase is, how you can load data into HBase, how you can query data from HBase using a client, and so on.

Topics – Introduction, Installation, Client API – Basics, Client API – Advanced Features, Client API – Administrative Features, Available Clients, Architecture, MapReduce Integration, Advanced Usage, Advanced Indexing.
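The client basics can be previewed interactively from the HBase shell; the table, column family and row key below are hypothetical, and a running HBase instance is assumed.

```shell
# Create a table, write a cell, and read it back (names are illustrative)
hbase shell <<'EOF'
create 'users', 'info'
put 'users', 'row1', 'info:name', 'alice'
get 'users', 'row1'
scan 'users'
EOF
```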



Module – 7
ZooKeeper

Learning Objectives – At the end of this module, you will learn what ZooKeeper is, how it helps in monitoring a cluster, and why HBase uses ZooKeeper.

Topics – The ZooKeeper Service (Data Model, Operations, Implementation, Consistency, Sessions, States), Building Applications with ZooKeeper (ZooKeeper in Production).
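The data model and operations topics can be explored with the command-line client shipped with ZooKeeper; the znode path and values below are illustrative, and a running ensemble is assumed.

```shell
# Create, read and update a znode (path and data are hypothetical)
zkCli.sh -server localhost:2181 <<'EOF'
create /app-config "v1"
get /app-config
set /app-config "v2"
ls /
EOF
```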



Module – 8
Sqoop

Learning Objectives – After this last module you will know what Sqoop is, how you can import and export data to and from HDFS, and what the internal architecture of Sqoop is.

Topics – Database Imports, Working with Imported Data, Importing Large Objects, Performing Exports, Exports – A Deeper Look.
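The import and export topics can be sketched as two Sqoop invocations. The JDBC URL, database, tables and username are hypothetical; a Sqoop installation and a reachable database are assumed.

```shell
# Import a table from a relational database into HDFS (names are illustrative)
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username analyst -P \
  --table orders \
  --target-dir /data/orders \
  --num-mappers 4

# Export results from HDFS back into a database table
sqoop export \
  --connect jdbc:mysql://dbhost/sales \
  --username analyst -P \
  --table order_summaries \
  --export-dir /data/order_summaries
```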


If you want to know more about Hadoop Admin online training, do not hesitate to call +91-7774892805 or mail us at contact@intelogik.com.