Certified course on HADOOP ADMINISTRATION
Course in Bangalore
Type: Course
Level: Intermediate
Location: Bangalore
Class hours: 48h
Duration: 1 week
Start date: Different dates available
Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. Hadoop makes it possible to run applications on systems with thousands of nodes handling thousands of terabytes of data. Its distributed file system (HDFS) enables rapid data transfer between nodes and allows the system to continue operating uninterrupted if a node fails. This approach lowers the risk of catastrophic system failure, even if a significant number of nodes become inoperative.
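Since HDFS administration is central to this course, a brief taste of its command-line interface helps set expectations. A minimal sketch, assuming a running cluster with the stock `hdfs` client on the PATH; the `/user/student` directory and `data.csv` file are placeholders, not part of the course materials:

```shell
# Create a directory in HDFS and copy a local file into it
hdfs dfs -mkdir -p /user/student
hdfs dfs -put data.csv /user/student/

# List and read files stored in HDFS
hdfs dfs -ls /user/student
hdfs dfs -cat /user/student/data.csv

# Administrative views: cluster capacity, live/dead DataNodes, block health
hdfs dfsadmin -report
hdfs fsck /user/student -files -blocks
```

These commands require a live cluster, so treat the sketch as orientation rather than a runnable exercise; the course's HDFS modules cover each of them in context.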
Important information
Documents
- HADOOP administration.pdf
Subjects
- Basics of Linux
- Basic computer programming
Teachers and trainers (1)
LOGANATHAN VENKATASWAMY
TRAINER
Course programme
1. Introduction to Big Data
- What is Big Data?
- Big Data Facts
- The Three V’s of Big Data
- What is Hadoop?
- Why learn Hadoop?
- Relational Databases vs. Hadoop
- Motivation for Hadoop
- 6 Key Hadoop Data Types
- What is HDFS?
- HDFS components
- Understanding Block Storage
- The Name Node
- Data Node Failures
- HDFS Commands
- HDFS File Permissions
- Overview of MapReduce
- Understanding MapReduce
- The Map Phase
- The Reduce Phase
- WordCount in MapReduce
- Running MapReduce Job
- Single Node Cluster Configuration
- Multi-Node Cluster Configuration
- Checking HDFS Status
- Breaking the Cluster
- Copying Data Between Clusters
- Adding And Removing Cluster Nodes
- Rebalancing the cluster
- Name Node Metadata Backup
- Cluster Upgrading
- Sqoop
- Flume
- Hive
- Pig
- HBase
- Oozie
- Managing Jobs
- The FIFO Scheduler
- The Fair Scheduler
- How to stop and start jobs running on the cluster
- General System conditions to Monitor
- Name Node and Job Tracker Web UIs
- View and Manage Hadoop’s Log files
- Ganglia Monitoring Tool
- Common cluster issues and their resolutions
- Benchmark your cluster’s performance
- How to use Sqoop to import data from RDBMSs to HDFS
- How to gather logs from multiple systems using Flume
- Features of Hive, HBase and Pig
- How to populate HDFS from external Sources
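As a taste of the MapReduce flow covered in the programme (map, then shuffle/sort, then reduce), the classic WordCount job can be mimicked with an ordinary shell pipeline. This is only an analogy for the data flow, not Hadoop itself; the sample sentence is an arbitrary illustration:

```shell
# Map phase: tr tokenizes the input into one (word) record per line.
# Shuffle/sort: sort brings identical keys together, as Hadoop does
# between the map and reduce phases.
# Reduce phase: uniq -c sums the occurrences of each key.
echo "big data big hadoop data big" | tr ' ' '\n' | sort | uniq -c
```

The pipeline prints a count next to each distinct word (3 for "big", 2 for "data", 1 for "hadoop"), which is exactly the shape of WordCount's reducer output in the course's MapReduce module.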