This training is designed by Hadoop experts to provide knowledge and
skills in Big Data and Hadoop and to train you to become a successful
Hadoop developer.
Hadoop Architecture - What is Big Data, Hadoop Architecture, Hadoop ecosystem
components, Hadoop Storage: HDFS, Hadoop Processing: MapReduce Framework,
Hadoop Server Roles: NameNode, Secondary NameNode, and DataNode, Anatomy of
File Write and Read.
http://www.exuberantsolutions.com/bigdata-hadoop_course.htm
Hadoop Cluster Configuration and Data Loading - Hadoop Cluster Architecture, Hadoop Cluster
Configuration files, Hadoop Cluster Modes, Multi-Node Hadoop Cluster, A Typical
Production Hadoop Cluster, MapReduce Job execution, Common Hadoop Shell
commands, Data Loading Techniques: Flume, Sqoop, Hadoop Copy Commands, Hadoop
Project: Data Loading.
Hadoop MapReduce framework - Hadoop Data Types, Hadoop MapReduce paradigm, Map and Reduce tasks,
MapReduce Execution Framework, Partitioners and Combiners, Input Formats (Input
Splits and Records, Text Input, Binary Input, Multiple Inputs), Output Formats
(Text Output, Binary Output, Multiple Outputs), Hadoop Project: MapReduce
Programming.
Advanced MapReduce and YARN (MRv2) - Custom Input Format, Error Handling, Tuning, Advanced MapReduce, Fair and
Capacity Schedulers, Hadoop 2.0 New Features (NameNode High
Availability, HDFS Federation, YARN, etc.), Programming in YARN, Running MRv1 in
YARN, Upgrading Your Existing Code to MRv2, Hadoop Project: Advanced MapReduce
programming and error handling.
Pig and Pig Latin - Installing and
Running Pig, Grunt, Pig's Data Model, Pig Latin, Developing & Testing Pig
Latin Scripts, Writing Evaluation, Filter, Load & Store Functions, Hadoop
Project: Pig Scripting.
Hive - Hive Architecture and
Installation, Comparison with Traditional Databases, HiveQL: Data Types,
Operators and Functions, Hive Tables (Managed Tables and External Tables,
Partitions and Buckets, Storage Formats, Importing Data, Altering Tables,
Dropping Tables), Querying Data (Sorting and Aggregating, MapReduce Scripts,
Joins & Subqueries, Views, Map-Side and Reduce-Side Joins to Optimize Queries),
User Defined Functions, Appending Data to an Existing Hive Table, Custom
Map/Reduce in Hive, Hadoop Project: Hive Scripting.
NoSQL Databases, HBase and ZooKeeper - Introduction to HBase, Client APIs and their features, Available
Clients, HBase Architecture, MapReduce Integration, Advanced Usage, Advanced
Indexing, Coprocessors, The ZooKeeper Service: Data Model, Operations,
Implementation, Consistency, Sessions, States, Hadoop Project: HBase tables.
Hadoop Project Discussion - In this module you will learn how multiple Hadoop ecosystem
components work together in a Hadoop implementation to solve Big Data problems.
We will also discuss the data sets and specifications for the project.