Hadoop - Bigdata
There are no prerequisites for this course.
Basic knowledge of Core Java and SQL is advantageous.
Course Content
Overview of Java
Classes and Objects
Garbage Collection and Modifiers
Inheritance, Aggregation, Polymorphism
Command-line arguments
Abstract class and Interfaces
String Handling
Exception Handling, Multithreading
Serialization and Advanced Topics
Collection Framework, GUI, JDBC
Unix History & Overview
Command line file-system browsing
Bash/Korn Shell
Users, Groups and Permissions
VI Editor
Introduction to Process
Basic Networking
Shell Scripting live scenarios
Introduction to SQL, Data Definition Language (DDL)
Data Manipulation Language (DML)
Operator and Sub Query
Various Clauses, SQL Key Words
Joins, Stored Procedures, Constraints, Triggers
Cursors / Loops / If Else / Try Catch, Index
Data Manipulation Language (Advanced)
Constraints, Triggers, Views, Index Advanced
Hadoop - Bigdata
Introduction to Bigdata
Introduction and relevance
Uses of Big Data analytics in various industries such as Telecom, E-commerce, Finance and Insurance
Problems with Traditional Large-Scale Systems
Hadoop (Big Data) Ecosystem
Motivation for Hadoop
Different types of projects by Apache
Role of projects in the Hadoop Ecosystem
Key technology foundations required for Big Data
Limitations and Solutions of existing Data Analytics Architecture
Comparison of traditional data management systems with Big Data management systems
Evaluate key framework requirements for Big Data analytics
Hadoop Ecosystem & Hadoop 2.x core components
Explain the relevance of real-time data
Explain how to use big and real-time data as a business planning tool
Building Blocks
Quick tour of Java (Hadoop is written in Java, so this helps in understanding it better)
Quick tour of Linux commands ( Basic Commands to traverse the Linux
Quick Tour of RDBMS Concepts (to use HIVE and Impala)
Quick hands on experience of SQL.
Introduction to Cloudera VM and usage instructions
Hadoop Cluster Architecture – Configuration Files
Hadoop Master-Slave Architecture
The Hadoop Distributed File System - data storage
Explain different types of cluster setups (Fully distributed/Pseudo
Hadoop Cluster set up - Installation
Hadoop 2.x Cluster Architecture
A Typical enterprise cluster – Hadoop Cluster Modes
Hadoop Core Components – HDFS & MapReduce (YARN)
HDFS Overview & Data storage in HDFS
Getting data into Hadoop from a local machine and vice versa (Data Loading Techniques)
MapReduce Overview (Traditional way Vs. MapReduce way)
Concept of Mapper & Reducer
Understanding MapReduce program skeleton
Running MapReduce job in Command line/Eclipse
Develop MapReduce Program in Java
Develop MapReduce Program with the streaming API
Test and debug a MapReduce program at design time
How Partitioners and Reducers Work Together
Writing Custom Partitioners
Data Input and Output
Creating Custom Writable and Writable Comparable Implementations
Data Integration Using Sqoop and
Integrating Hadoop into an existing Enterprise
Loading Data from an RDBMS into HDFS by Using Sqoop
Managing Real-Time Data Using Flume
Accessing HDFS from Legacy Systems with FuseDFS and HttpFS
Introduction to Talend (community system)