Authorised Cloudera Developer Training for Apache Hadoop | 4 Days

Xebia IT Architects India Private Limited
In Bangalore

Rs 74,400
VAT incl.

Important information

  • Training
  • Beginner
  • Bangalore
  • Duration: 4 Days

Xebia is the authorised Cloudera training partner. We run more than 3 batches of Cloudera training per month, with certifications.

Cloudera Developer Training for Apache Hadoop

Where and when

Starts: On request
Location: Karnataka, India

Frequently Asked Questions

· What are the objectives of this course?

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:

  • Using the Spark shell for interactive data analysis
  • The features of Spark’s Resilient Distributed Datasets
  • How Spark runs on a cluster
  • Parallel programming with Spark
  • Writing Spark applications
  • Processing streaming data with Spark

· Who is it intended for?

This course is best suited to developers and engineers who have programming experience.

· Requirements

Knowledge of Java is strongly recommended and is required to complete the hands-on exercises.

What you'll learn on the course

Cloudera Developer

Teachers and trainers (1)

Xebia Xebia

Course programme

Course Outline:

The Motivation for Hadoop

  • Problems with Traditional Large-Scale Systems
  • Introducing Hadoop
  • Hadoopable Problems

Hadoop: Basic Concepts and HDFS

  • The Hadoop Project and Hadoop Components
  • The Hadoop Distributed File System

Introduction to MapReduce

  • MapReduce Overview
  • Example: WordCount
  • Mappers
  • Reducers

Hadoop Clusters and the Hadoop Ecosystem

  • Hadoop Cluster Overview
  • Hadoop Jobs and Tasks
  • Other Hadoop Ecosystem Components

Writing a MapReduce Program in Java

  • Basic MapReduce API Concepts
  • Writing MapReduce Drivers, Mappers, and Reducers in Java
  • Speeding Up Hadoop Development by Using


  • Differences Between the Old and New MapReduce APIs
  • Writing a MapReduce Program Using Streaming
  • Writing Mappers and Reducers with the Streaming API

Unit Testing MapReduce Programs

  • Unit Testing
  • The JUnit and MRUnit Testing Frameworks
  • Writing Unit Tests with MRUnit
  • Running Unit Tests

Delving Deeper into the Hadoop API

  • Using the ToolRunner Class
  • Setting Up and Tearing Down Mappers and Reducers
  • Decreasing the Amount of Intermediate Data with Combiners
  • Accessing HDFS Programmatically
  • Using the Distributed Cache
  • Using the Hadoop API’s Library of Mappers, Reducers, and Partitioners

Practical Development Tips and Techniques

  • Strategies for Debugging MapReduce Code
  • Testing MapReduce Code Locally by Using


  • Writing and Viewing Log Files
  • Retrieving Job Information with Counters
  • Reusing Objects
  • Creating Map-Only MapReduce Jobs

Partitioners and Reducers

  • How Partitioners and Reducers Work Together
  • Determining the Optimal Number of Reducers for a Job
  • Writing Custom Partitioners

Data Input and Output

  • Creating Custom Writable and WritableComparable Implementations
  • Saving Binary Data Using SequenceFile and Avro Data Files
  • Issues to Consider When Using File Compression
  • Implementing Custom InputFormats and OutputFormats

Common MapReduce Algorithms

  • Sorting and Searching Large Data Sets
  • Indexing Data
  • Computing Term Frequency — Inverse Document Frequency
  • Calculating Word Co-Occurrence
  • Performing Secondary Sort

Joining Data Sets in MapReduce Jobs

  • Writing a Map-Side Join
  • Writing a Reduce-Side Join

Integrating Hadoop into the Enterprise Workflow

  • Integrating Hadoop into an Existing Enterprise
  • Loading Data from an RDBMS into HDFS by Using Sqoop
  • Managing Real-Time Data Using Flume
  • Accessing HDFS from Legacy Systems with FuseDFS and HttpFS

An Introduction to Hive, Impala, and Pig

  • The Motivation for Hive, Impala, and Pig
  • Hive Overview
  • Impala Overview
  • Pig Overview
  • Choosing Between Hive, Impala, and Pig

An Introduction to Oozie

  • Introduction to Oozie
  • Creating Oozie Workflows
