Certified course on HADOOP ADMINISTRATION

Course

In Bangalore

₹ 8,000 VAT incl.

Description

  • Type

    Course

  • Level

    Intermediate

  • Location

    Bangalore

  • Class hours

    48h

  • Duration

    1 Week

  • Start date

    Different dates available

HADOOP ADMINISTRATION

Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is an open-source project of the Apache Software Foundation. Hadoop makes it possible to run applications on clusters of thousands of nodes holding thousands of terabytes of data. Its distributed file system supports rapid data transfer among nodes and allows the system to keep operating when a node fails. This approach lowers the risk of catastrophic system failure, even if a significant number of nodes become inoperative.

Important information

Documents

  • HADOOP administration.pdf

Facilities

Location

Bangalore (Karnātaka)
29/18, 17TH E MAIN, 5TH BLOCK, RAJAJINAGAR, JEDARALLI, 560010

Start date

Different dates available. Enrolment now open.

Subjects

  • Basics of Linux
  • Basic Computer programming

Teachers and trainers (1)

LOGANATHAN VENKATASWAMY

TRAINER

Course programme


1. Introduction to Big Data
  • What is Big Data?
  • Big Data Facts
  • The Three V’s of Big Data
2. Understanding Hadoop
  • What is Hadoop?
  • Why learn Hadoop?
  • Relational Databases vs. Hadoop
  • Motivation for Hadoop
  • 6 Key Hadoop Data Types
3. The Hadoop Distributed File System (HDFS)
  • What is HDFS?
  • HDFS components
  • Understanding Block Storage
  • The NameNode
  • DataNode Failures
  • HDFS Commands
  • HDFS File Permissions
4. The MapReduce Framework
  • Overview of MapReduce
  • Understanding MapReduce
  • The Map Phase
  • The Reduce Phase
  • WordCount in MapReduce
  • Running a MapReduce Job
5. Planning Your Hadoop Cluster
  • Single Node Cluster Configuration
  • Multi-Node Cluster Configuration
6. Cluster Maintenance
  • Checking HDFS Status
  • Breaking the Cluster
  • Copying Data Between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • NameNode Metadata Backup
  • Cluster Upgrading
7. Installing and Managing Hadoop Ecosystem Projects
  • Sqoop
  • Flume
  • Hive
  • Pig
  • HBase
  • Oozie
8. Managing and Scheduling Jobs
  • Managing Jobs
  • The FIFO Scheduler
  • The Fair Scheduler
  • How to stop and start jobs running on the cluster
9. Cluster Monitoring, Troubleshooting, and Optimizing
  • General System Conditions to Monitor
  • NameNode and JobTracker Web UIs
  • View and Manage Hadoop’s Log Files
  • Ganglia Monitoring Tool
  • Common cluster issues and their resolutions
  • Benchmark your cluster’s performance
10. Populating HDFS from External Sources
  • How to use Sqoop to import data from RDBMSs to HDFS
  • How to gather logs from multiple systems using Flume
  • Features of Hive, HBase and Pig
  • How to populate HDFS from external sources
