HADOOP ADMINISTRATION

Certified course on HADOOP ADMINISTRATION

ISM UNIV
In Bangalore

Rs 8,000
You can also call the Study Centre
98450...

Important information

  • Course
  • Intermediate
  • Bangalore
  • 48 hours of class
  • Duration: 1 Week
  • When: Flexible
Description

Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. Hadoop makes it possible to run applications on systems with thousands of nodes and thousands of terabytes of data. Its distributed file system provides rapid data transfer rates among nodes and allows the system to keep operating uninterrupted if a node fails. This approach lowers the risk of catastrophic system failure, even if a significant number of nodes become inoperative.
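
As a concrete illustration of the block storage and replication described above, here is a minimal sketch using Hadoop's Java FileSystem API to show where the blocks of one file live. The NameNode URI (hdfs://namenode:8020) and the file path are assumptions for illustration, not details taken from the course.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.BlockLocation;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  import java.net.URI;

  public class BlockReport {
      public static void main(String[] args) throws Exception {
          // Assumed NameNode URI and file path (purely illustrative).
          FileSystem fs = FileSystem.get(new URI("hdfs://namenode:8020"), new Configuration());
          FileStatus status = fs.getFileStatus(new Path("/data/sample.txt"));

          System.out.println("Replication factor: " + status.getReplication());

          // Each block is held by several DataNodes, so losing one node does not
          // lose the data; that is the fault tolerance described above.
          for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
              System.out.println("Block at offset " + block.getOffset()
                      + " stored on: " + String.join(", ", block.getHosts()));
          }
          fs.close();
      }
  }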

Venues

Where and when

Starts: Flexible
Location: Bangalore
Address: 29/18, 17TH E MAIN, 5TH BLOCK, RAJAJINAGAR, JEDARALLI, 560010, Karnataka, India

What you'll learn on the course

Basics of Linux
Basic computer programming

Teachers and trainers (1)

LOGANATHAN VENKATASWAMY
TRAINER

Course programme


1. Introduction to Big Data
  • What is Big Data?
  • Big Data Facts
  • The Three V’s of Big Data
2. Understanding Hadoop
  • What is Hadoop?
  • Why learn Hadoop?
  • Relational Databases vs. Hadoop
  • Motivation for Hadoop
  • 6 Key Hadoop Data Types
3. The Hadoop Distributed File System (HDFS)
  • What is HDFS?
  • HDFS components
  • Understanding Block Storage
  • The Name Node
  • Data Node Failures
  • HDFS Commands (see the sketch after this module)
  • HDFS File Permissions
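
A minimal sketch of the HDFS operations listed in this module, driven through the Java FileSystem API rather than the hdfs dfs shell; the paths and the 0640 permission are illustrative assumptions.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.permission.FsPermission;

  public class HdfsBasics {
      public static void main(String[] args) throws Exception {
          // Uses whatever cluster settings (core-site.xml etc.) are on the classpath.
          FileSystem fs = FileSystem.get(new Configuration());

          Path dir = new Path("/user/training/demo");           // assumed target directory
          fs.mkdirs(dir);                                        // like: hdfs dfs -mkdir -p
          fs.copyFromLocalFile(new Path("/tmp/local.txt"),       // like: hdfs dfs -put
                  new Path(dir, "local.txt"));
          fs.setPermission(new Path(dir, "local.txt"),           // like: hdfs dfs -chmod 640
                  new FsPermission((short) 0640));

          for (FileStatus st : fs.listStatus(dir)) {             // like: hdfs dfs -ls
              System.out.println(st.getPermission() + " " + st.getPath());
          }
          fs.close();
      }
  }
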
4. The MapReduce Framework
  • Overview of MapReduce
  • Understanding MapReduce
  • The Map Phase
  • The Reduce Phase
  • WordCount in MapReduce (full listing after this module)
  • Running a MapReduce Job
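
The WordCount named in this module is short enough to quote in full; this is the standard textbook version in the org.apache.hadoop.mapreduce API, not code supplied by the course. The map phase emits (word, 1) pairs and the reduce phase sums them.

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class WordCount {

      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
          private static final IntWritable ONE = new IntWritable(1);
          private final Text word = new Text();

          @Override
          protected void map(Object key, Text value, Context ctx)
                  throws IOException, InterruptedException {
              // Map phase: emit (word, 1) for every token in the input line.
              for (String token : value.toString().split("\\s+")) {
                  if (!token.isEmpty()) {
                      word.set(token);
                      ctx.write(word, ONE);
                  }
              }
          }
      }

      public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
          @Override
          protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                  throws IOException, InterruptedException {
              // Reduce phase: sum the counts emitted for each word.
              int sum = 0;
              for (IntWritable v : values) sum += v.get();
              ctx.write(key, new IntWritable(sum));
          }
      }

      public static void main(String[] args) throws Exception {
          Job job = Job.getInstance(new Configuration(), "word count");
          job.setJarByClass(WordCount.class);
          job.setMapperClass(TokenizerMapper.class);
          job.setCombinerClass(IntSumReducer.class);   // combiner reuses the reducer
          job.setReducerClass(IntSumReducer.class);
          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(IntWritable.class);
          FileInputFormat.addInputPath(job, new Path(args[0]));
          FileOutputFormat.setOutputPath(job, new Path(args[1]));
          System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
  }

Packaged into a jar, it would typically be submitted with: hadoop jar wordcount.jar WordCount <input path> <output path>.
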
5. Planning Your Hadoop Cluster
  • Single Node Cluster Configuration
  • Multi-Node Cluster Configuration (key properties sketched after this module)
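
The single-node versus multi-node distinction largely comes down to a handful of *-site.xml properties. The sketch below simply prints the usual ones through the Configuration API; the property names are standard Hadoop keys, and the values depend entirely on the configuration files on your classpath.

  import org.apache.hadoop.conf.Configuration;

  public class ClusterConfigCheck {
      public static void main(String[] args) {
          Configuration conf = new Configuration();    // picks up core-site.xml from the classpath
          conf.addResource("hdfs-site.xml");           // loaded explicitly if present on the classpath
          conf.addResource("yarn-site.xml");

          String[] keys = {
                  "fs.defaultFS",                  // local vs. dedicated NameNode host
                  "dfs.replication",               // typically 1 on a single node, 3 on a cluster
                  "yarn.resourcemanager.hostname"  // where the ResourceManager runs
          };
          for (String key : keys) {
              System.out.println(key + " = " + conf.get(key));
          }
      }
  }
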
6. Cluster Maintenance
  • Checking HDFS Status (see the sketch after this module)
  • Breaking the Cluster
  • Copying Data Between Clusters
  • Adding And Removing Cluster Nodes
  • Rebalancing the cluster
  • Name Node Metadata Backup
  • Cluster Upgrading
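
For the "Checking HDFS Status" item, a minimal sketch that reports overall capacity and usage through the FileSystem API; on a real cluster the hdfs dfsadmin -report command gives the fuller per-DataNode picture, and hdfs balancer performs the rebalancing also listed above.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.FsStatus;

  public class HdfsStatus {
      public static void main(String[] args) throws Exception {
          FileSystem fs = FileSystem.get(new Configuration());
          FsStatus status = fs.getStatus();   // capacity figures for the file system as a whole

          long gb = 1024L * 1024L * 1024L;
          System.out.println("Capacity : " + status.getCapacity()  / gb + " GB");
          System.out.println("Used     : " + status.getUsed()      / gb + " GB");
          System.out.println("Remaining: " + status.getRemaining() / gb + " GB");
          fs.close();
      }
  }
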
7. Installing and Managing Hadoop Ecosystem Projects
  • Sqoop
  • Flume
  • Hive (see the JDBC sketch after this module)
  • Pig
  • HBase
  • Oozie
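
One way to exercise an ecosystem component from Java once it is installed is Hive's JDBC interface. The sketch below assumes a HiveServer2 at hiveserver:10000, a user called training, and a table called web_logs, all made up for illustration; the driver class and the jdbc:hive2 URL scheme are the standard Hive JDBC ones.

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class HiveQuery {
      public static void main(String[] args) throws Exception {
          Class.forName("org.apache.hive.jdbc.HiveDriver");   // Hive JDBC driver
          try (Connection conn = DriverManager.getConnection(
                       "jdbc:hive2://hiveserver:10000/default", "training", "");
               Statement stmt = conn.createStatement();
               ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM web_logs")) {
              while (rs.next()) {
                  System.out.println("row count: " + rs.getLong(1));
              }
          }
      }
  }
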
8. Managing and Scheduling Jobs
  • Managing Jobs
  • The FIFO Scheduler
  • The Fair Scheduler (configuration key sketched after this module)
  • How to stop and start jobs running on the cluster
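
As a pointer for the Fair Scheduler item, the sketch below shows the YARN property that selects it, set programmatically here only so the key and value are visible in one place; on a real cluster it belongs in yarn-site.xml, and on the older JobTracker generation the equivalent key is mapred.jobtracker.taskScheduler. This is an illustration, not the course's configuration.

  import org.apache.hadoop.conf.Configuration;

  public class SchedulerConfig {
      public static void main(String[] args) {
          Configuration conf = new Configuration();
          // Selects the Fair Scheduler under YARN (normally set in yarn-site.xml).
          conf.set("yarn.resourcemanager.scheduler.class",
                  "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler");
          System.out.println(conf.get("yarn.resourcemanager.scheduler.class"));
      }
  }
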
9. Cluster Monitoring, Troubleshooting, and Optimizing
  • General System conditions to Monitor
  • Name Node and Job Tracker Web UIs (see the /jmx sketch after this module)
  • View and Manage Hadoop’s Log files
  • Ganglia Monitoring Tool
  • Common cluster issues and their resolutions
  • Benchmark your cluster’s performance
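
One low-level building block behind the monitoring topics above is that every Hadoop daemon's web UI exposes its metrics as JSON under /jmx, which simple monitoring scripts can poll. The sketch below reads the NameNode's FSNamesystem bean; the host name is an assumption, and the port depends on the release (50070 on the Hadoop 1.x/2.x generation this syllabus reflects, 9870 on 3.x).

  import java.io.BufferedReader;
  import java.io.InputStreamReader;
  import java.net.URL;

  public class NameNodeMetrics {
      public static void main(String[] args) throws Exception {
          // Assumed host; adjust the port to your Hadoop release.
          URL url = new URL("http://namenode:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem");
          try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
              String line;
              while ((line = in.readLine()) != null) {
                  System.out.println(line);   // raw JSON: capacity, live/dead DataNodes, and so on
              }
          }
      }
  }
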
10. Populating HDFS from External Sources
  • How to use Sqoop to import data from RDBMSs to HDFS
  • How to gather logs from multiple systems using Flume
  • Features of Hive, HBase and Pig
  • How to populate HDFS from external sources (see the sketch after this list)
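
A generic sketch of the last item: streaming data from an external source into HDFS with the FileSystem API. Sqoop and Flume, covered above, do the same job at scale for relational tables and log streams respectively; the source URL and target path below are made-up placeholders.

  import java.io.InputStream;
  import java.net.URL;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IOUtils;

  public class IngestToHdfs {
      public static void main(String[] args) throws Exception {
          FileSystem fs = FileSystem.get(new Configuration());
          try (InputStream in = new URL("http://example.com/export.csv").openStream();
               FSDataOutputStream out = fs.create(new Path("/data/raw/export.csv"))) {
              IOUtils.copyBytes(in, out, 4096, false);   // stream the external source into an HDFS file
          }
          fs.close();
      }
  }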
