- HADOOP – Big Data – Content
Hadoop
What is Big Data
What is Hadoop
Relation between Big Data and Hadoop
Why Hadoop is needed
Challenges with Big Data
Storage
Processing
Comparison with other Technologies
RDBMS
DATA WAREHOUSE
TERADATA
Components of Hadoop Ecosystem
Storage Components
Processing Components
HDFS (Hadoop Distributed File System)
What is a cluster environment
Cluster Vs Hadoop cluster
Features of HDFS
Storage aspects of HDFS
Block
Configuring the Block size
Why HDFS Block size is so large
Design Principles of Block size
HDFS Architecture – 5 Daemons of Hadoop
Name Node
Data Node
Secondary Name Node
Job Tracker
Task Tracker
Replication in Hadoop – Failover Mechanism
Data Storage in Data Nodes
Replication
Custom Replication
MapReduce
Why Map Reduce is essential in Hadoop
Processing Daemons of Hadoop
Job Tracker
Roles of Job Tracker
How to configure Job Tracker in Hadoop
Task Tracker
Roles of Task Tracker
Drawbacks with respect to failures in the cluster
Input Split
Need of Input Split
Input Split Size
Input split size Vs block size
Input Split Vs Mappers
Map Reduce Programming Model
Different phases of Map Reduce Algorithm
Data Types in Map Reduce
Basic Map Reduce program
Driver code
Mapper Code
Reducer Code
Combiner in Map Reduce
Partitioner in Map Reduce
Joins in Map Reduce
Map side join
Reduce side join
Performance trade-off
Map Reduce Streaming
Apache PIG
Introduction to PIG
Map Reduce Vs PIG
SQL Vs PIG
Data Types in PIG
Execution Modes of PIG (Local/Distributed)
Execution Mechanism { Grunt Shell, Script }
Writing a simple PIG script
Bags, Tuples, and Fields in PIG
UDFs in PIG
HIVE
Need of Apache Hive
HIVE Architecture [ Driver, Compiler, Executor ]
HIVE Query Language
SQL Vs HIVE QL
Collection Data Types in HIVE [ Array, Struct, Map ]
UDFs in HIVE
UDAFs
UDTFs
SerDe [ Hive Serializer / Deserializer ]
SQOOP
Introduction
MySQL Initialization
Connecting RDBMS using SQOOP
Sqoop Commands
HBASE
Introduction
HDFS Vs HBase
HBase Architecture
MapReduce over HBase
- Prerequisites: Core Java + Linux Commands
Time Duration: 5 Weeks [30 Hrs + lab]