Apache Spark & Scala

4.6 (6 opinions)
  • Training was in depth and the practice exercises included were really helpful.
  • Fabulous training; trainer has external knowledge on the subject. Also, the course content was effective.
  • Trainer was good and helpful.

Course

Online

Price on request

Description

  • Type

    Course

  • Methodology

    Online

Simplilearn is the world's largest certification training provider, with over 400,000 professionals trained globally.
It is trusted by Fortune 500 companies as their learning provider for career growth and training.
More than 2,000 certified and experienced trainers conduct training for its courses across the globe.
All courses are designed and developed under a tried-and-tested unique learning framework that is proven to deliver a 98.6% first-attempt pass rate.


Reviews

Course rating: 4.6 (excellent)
Recommended: 100%
Centre rating: 4.5 (excellent)
Ramana Venkata Kare

4.5
14/03/2023
What I would highlight: Training was in depth and the practice exercises included were really helpful.
Would you recommend this course?: Yes

Santosh Srivastava

4.5
14/03/2022
What I would highlight: Fabulous training; trainer has external knowledge on the subject. Also, the course content was effective.
Would you recommend this course?: Yes

Pawan Preet Bhatia

5.0
14/03/2021
What I would highlight: Trainer was good and helpful.
Would you recommend this course?: Yes

Sudhanshu Dhall

4.0
14/03/2020
What I would highlight: Course was very helpful to understand the concepts. Also, the trainer was excellent.
Would you recommend this course?: Yes

Binu Nair

5.0
17/03/2014
What I would highlight: Good course on PRINCE2; well organized by Simplilearn.
Would you recommend this course?: Yes

Pramukh N Vasist

4.5
12/03/2014
What I would highlight: Excellent training done by Simplilearn Team.
Would you recommend this course?: Yes
*All reviews collected by Emagister & iAgora have been verified

This centre's achievements

2017, 2016

  • All courses are up to date
  • The average rating is higher than 3.7
  • More than 50 reviews in the last 12 months
  • This centre has featured on Emagister for 8 years

Subjects

  • Apache

Course programme

  • Apache Spark & Scala
    • Lesson 00 - Course Overview
      • 0.1 Introduction
      • 0.2 Course Objectives
      • 0.3 Course Overview
      • 0.4 Target Audience
      • 0.5 Course Prerequisites
      • 0.6 Value to the Professionals
      • 0.7 Value to the Professionals (contd.)
      • 0.8 Value to the Professionals (contd.)
      • 0.9 Lessons Covered
      • 0.10 Conclusion
    • Lesson 01 - Introduction to Spark
      • 1.1 Introduction
      • 1.2 Objectives
      • 1.3 Evolution of Distributed Systems
      • 1.4 Need of New Generation Distributed Systems
      • 1.5 Limitations of MapReduce in Hadoop
      • 1.6 Limitations of MapReduce in Hadoop (contd.)
      • 1.7 Batch vs. Real-Time Processing
      • 1.8 Application of Stream Processing
      • 1.9 Application of In-Memory Processing
      • 1.10 Introduction to Apache Spark
      • 1.11 Components of a Spark Project
      • 1.12 History of Spark
      • 1.13 Language Flexibility in Spark
      • 1.14 Spark Execution Architecture
      • 1.15 Automatic Parallelization of Complex Flows
      • 1.16 Automatic Parallelization of Complex Flows-Important Points
      • 1.17 APIs That Match User Goals
      • 1.18 Apache Spark-A Unified Platform of Big Data Apps
      • 1.19 More Benefits of Apache Spark
      • 1.20 Running Spark in Different Modes
      • 1.21 Installing Spark as a Standalone Cluster-Configurations
      • 1.22 Installing Spark as a Standalone Cluster-Configurations
      • 1.23 Demo-Install Apache Spark
      • 1.24 Demo-Install Apache Spark
      • 1.25 Overview of Spark on a Cluster
      • 1.26 Tasks of Spark on a Cluster
      • 1.27 Companies Using Spark-Use Cases
      • 1.28 Hadoop Ecosystem vs. Apache Spark
      • 1.29 Hadoop Ecosystem vs. Apache Spark (contd.)
      • 1.30 Quiz
      • 1.31 Summary
      • 1.32 Summary (contd.)
      • 1.33 Conclusion
    • Lesson 02 - Introduction to Programming in Scala
      • 2.1 Introduction
      • 2.2 Objectives
      • 2.3 Introduction to Scala
      • 2.4 Features of Scala
      • 2.5 Basic Data Types
      • 2.6 Basic Literals
      • 2.7 Basic Literals (contd.)
      • 2.8 Basic Literals (contd.)
      • 2.9 Introduction to Operators
      • 2.10 Types of Operators
      • 2.11 Use Basic Literals and the Arithmetic Operator
      • 2.12 Demo Use Basic Literals and the Arithmetic Operator
      • 2.13 Use the Logical Operator
      • 2.14 Demo Use the Logical Operator
      • 2.15 Introduction to Type Inference
      • 2.16 Type Inference for Recursive Methods
      • 2.17 Type Inference for Polymorphic Methods and Generic Classes
      • 2.18 Unreliability on Type Inference Mechanism
      • 2.19 Mutable Collection vs. Immutable Collection
      • 2.20 Functions
      • 2.21 Anonymous Functions
      • 2.22 Objects
      • 2.23 Classes
      • 2.24 Use Type Inference, Functions, Anonymous Function, and Class
      • 2.25 Demo Use Type Inference, Functions, Anonymous Function and Class
      • 2.26 Traits as Interfaces
      • 2.27 Traits-Example
      • 2.28 Collections
      • 2.29 Types of Collections
      • 2.30 Types of Collections (contd.)
      • 2.31 Lists
      • 2.32 Perform Operations on Lists
      • 2.33 Demo Use Data Structures
      • 2.34 Maps
      • 2.35 Maps-Operations
      • 2.36 Pattern Matching
      • 2.37 Implicits
      • 2.38 Implicits (contd.)
      • 2.39 Streams
      • 2.40 Use Data Structures
      • 2.41 Demo Perform Operations on Lists
      • 2.42 Quiz
      • 2.43 Summary
      • 2.44 Summary (contd.)
      • 2.45 Conclusion
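Lesson 2's core ideas (type inference, anonymous functions, immutable collections, case classes, and pattern matching) fit in a few lines of plain Scala. The values below are invented for illustration:

```scala
// Type inference: no annotations needed, the compiler infers Int and String.
val answer = 42
val label = "spark"

// An anonymous function bound to a value.
val square = (x: Int) => x * x

// Immutable List operations on sample data.
val nums = List(1, 2, 3, 4)
val evens = nums.filter(_ % 2 == 0)

// A case class plus pattern matching, as covered in the Lesson 2 demos.
case class Point(x: Int, y: Int)
def describe(p: Point): String = p match {
  case Point(0, 0) => "origin"
  case Point(a, 0) => s"on the x-axis at $a"
  case _           => "elsewhere"
}
```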
    • Lesson 03 - Using RDD for Creating Applications in Spark
      • 3.1 Introduction
      • 3.2 Objectives
      • 3.3 RDDs API
      • 3.4 Features of RDDs
      • 3.5 Creating RDDs
      • 3.6 Creating RDDs—Referencing an External Dataset
      • 3.7 Referencing an External Dataset—Text Files
      • 3.8 Referencing an External Dataset—Text Files (contd.)
      • 3.9 Referencing an External Dataset—Sequence Files
      • 3.10 Referencing an External Dataset—Other Hadoop Input Formats
      • 3.11 Creating RDDs—Important Points
      • 3.12 RDD Operations
      • 3.13 RDD Operations—Transformations
      • 3.14 Features of RDD Persistence
      • 3.15 Storage Levels Of RDD Persistence
      • 3.16 Choosing The Correct RDD Persistence Storage Level
      • 3.17 Invoking the Spark Shell
      • 3.18 Importing Spark Classes
      • 3.19 Creating the SparkContext
      • 3.20 Loading a File in Shell
      • 3.21 Performing Some Basic Operations on Files in Spark Shell RDDs
      • 3.22 Packaging a Spark Project with SBT
      • 3.23 Running a Spark Project With SBT
      • 3.24 Demo-Build a Scala Project
      • 3.25 Build a Scala Project
      • 3.26 Demo-Build a Spark Java Project
      • 3.27 Build a Spark Java Project
      • 3.28 Shared Variables—Broadcast
      • 3.29 Shared Variables—Accumulators
      • 3.30 Writing a Scala Application
      • 3.31 Demo-Run a Scala Application
      • 3.32 Run a Scala Application
      • 3.33 Demo-Write a Scala Application Reading the Hadoop Data
      • 3.34 Write a Scala Application Reading the Hadoop Data
      • 3.35 Demo-Run a Scala Application Reading the Hadoop Data
      • 3.36 Run a Scala Application Reading the Hadoop Data
      • 3.37 Scala RDD Extensions
      • 3.38 DoubleRDD Methods
      • 3.39 PairRDD Methods—Join
      • 3.40 PairRDD Methods—Others
      • 3.41 Java PairRDD Methods
      • 3.42 Java PairRDD Methods (contd.)
      • 3.43 General RDD Methods
      • 3.44 General RDD Methods (contd.)
      • 3.45 Java RDD Methods
      • 3.46 Java RDD Methods (contd.)
      • 3.47 Common Java RDD Methods
      • 3.48 Spark Java Function Classes
      • 3.49 Method for Combining JavaPairRDD Functions
      • 3.50 Transformations in RDD
      • 3.51 Other Methods
      • 3.52 Actions in RDD
      • 3.53 Key-Value Pair RDD in Scala
      • 3.54 Key-Value Pair RDD in Java
      • 3.55 Using MapReduce and Pair RDD Operations
      • 3.56 Reading Text File from HDFS
      • 3.57 Reading Sequence File from HDFS
      • 3.58 Writing Text Data to HDFS
      • 3.59 Writing Sequence File to HDFS
      • 3.60 Using GroupBy
      • 3.61 Using GroupBy (contd.)
      • 3.62 Demo-Run a Scala Application Performing GroupBy Operation
      • 3.63 Run a Scala Application Performing GroupBy Operation
      • 3.64 Demo-Run a Scala Application Using the Scala Shell
      • 3.65 Run a Scala Application Using the Scala Shell
      • 3.66 Demo-Write and Run a Java Application
      • 3.67 Write and Run a Java Application
      • 3.68 Quiz
      • 3.69 Summary
      • 3.70 Summary (contd.)
      • 3.71 Conclusion
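The pair-RDD operations in Lesson 3 (flatMap, map, reduceByKey) behave much like ordinary Scala collection operations, so the classic word-count pipeline can be sketched without a running cluster. This mirrors the shape of `sc.textFile(...).flatMap(...).map(w => (w, 1)).reduceByKey(_ + _)` on plain collections; the input lines are made up:

```scala
// Hypothetical input lines standing in for sc.textFile(...).
val lines = Seq("spark and scala", "spark streaming")

val counts: Map[String, Int] =
  lines
    .flatMap(_.split("\\s+"))   // flatMap: one record per word
    .map(w => (w, 1))           // map to key-value pairs
    .groupBy(_._1)              // groupBy plays the role of the shuffle
    .map { case (w, ps) => (w, ps.map(_._2).sum) } // like reduceByKey(_ + _)
```

On a real RDD the shuffle and reduction are distributed across the cluster, but the per-key arithmetic is the same.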
    • Lesson 04 - Running SQL Queries Using Spark SQL
      • 4.1 Introduction
      • 4.2 Objectives
      • 4.3 Importance of Spark SQL
      • 4.4 Benefits of Spark SQL
      • 4.5 DataFrames
      • 4.6 SQLContext
      • 4.7 SQLContext (contd.)
      • 4.8 Creating a DataFrame
      • 4.9 Using DataFrame Operations
      • 4.10 Using DataFrame Operations (contd.)
      • 4.11 Demo-Run SparkSQL with a Dataframe
      • 4.12 Run SparkSQL with a Dataframe
      • 4.13 Interoperating with RDDs
      • 4.14 Using the Reflection-Based Approach
      • 4.15 Using the Reflection-Based Approach (contd.)
      • 4.16 Using the Programmatic Approach
      • 4.17 Using the Programmatic Approach (contd.)
      • 4.18 Demo-Run Spark SQL Programmatically
      • 4.19 Run Spark SQL Programmatically
      • 4.20 Data Sources
      • 4.21 Save Modes
      • 4.22 Saving to Persistent Tables
      • 4.23 Parquet Files
      • 4.24 Partition Discovery
      • 4.25 Schema Merging
      • 4.26 JSON Data
      • 4.27 Hive Table
      • 4.28 DML Operation-Hive Queries
      • 4.29 Demo-Run Hive Queries Using Spark SQL
      • 4.30 Run Hive Queries Using Spark SQL
      • 4.31 JDBC to Other Databases
      • 4.32 Supported Hive Features
      • 4.33 Supported Hive Features (contd.)
      • 4.34 Supported Hive Data Types
      • 4.35 Case Classes
      • 4.36 Case Classes (contd.)
      • 4.37 Quiz
      • 4.38 Summary
      • 4.39 Summary (contd.)
      • 4.40 Conclusion
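Lesson 4's reflection-based approach infers a DataFrame schema from a case class. The query below, written over a plain Seq, previews what a Spark SQL `filter`/`select` over such rows would compute; `Person` and the rows are invented examples:

```scala
// A case class like those Spark SQL uses to infer a schema by reflection.
case class Person(name: String, age: Int)

val people = Seq(Person("Ann", 34), Person("Bob", 28), Person("Cey", 41))

// Collection-level equivalent of: SELECT name FROM people WHERE age > 30
val names = people.filter(_.age > 30).map(_.name)
```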
    • Lesson 05 - Spark Streaming
      • 5.1 Introduction
      • 5.2 Objectives
      • 5.3 Introduction to Spark Streaming
      • 5.4 Working of Spark Streaming
      • 5.5 Features of Spark Streaming
      • 5.6 Streaming Word Count
      • 5.7 Micro Batch
      • 5.8 DStreams
      • 5.9 DStreams (contd.)
      • 5.10 Input DStreams and Receivers
      • 5.11 Input DStreams and Receivers (contd.)
      • 5.12 Basic Sources
      • 5.13 Advanced Sources
      • 5.14 Advanced Sources-Twitter
      • 5.15 Transformations on DStreams
      • 5.16 Transformations on Dstreams (contd.)
      • 5.17 Output Operations on DStreams
      • 5.18 Design Patterns for Using ForeachRDD
      • 5.19 DataFrame and SQL Operations
      • 5.20 DataFrame and SQL Operations (contd.)
      • 5.21 Checkpointing
      • 5.22 Enabling Checkpointing
      • 5.23 Socket Stream
      • 5.24 File Stream
      • 5.25 Stateful Operations
      • 5.26 Window Operations
      • 5.27 Types of Window Operations
      • 5.28 Types of Window Operations (contd.)
      • 5.29 Join Operations-Stream-Dataset Joins
      • 5.30 Join Operations-Stream-Stream Joins
      • 5.31 Monitoring Spark Streaming Application
      • 5.32 Performance Tuning-High Level
      • 5.33 Performance Tuning-Detail Level
      • 5.34 Demo-Capture and Process the Netcat Data
      • 5.35 Capture and Process the Netcat Data
      • 5.36 Demo-Capture and Process the Flume Data
      • 5.37 Capture and Process the Flume Data
      • 5.38 Demo-Capture the Twitter Data
      • 5.39 Capture the Twitter Data
      • 5.40 Quiz
      • 5.41 Summary
      • 5.42 Summary (contd.)
      • 5.43 Conclusion
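Spark Streaming's window operations (Lesson 5) apply a computation over a sliding window of micro-batches. The batching itself can be mimicked with `sliding` on a plain sequence; the per-batch event counts here are illustrative values:

```scala
// Hypothetical per-micro-batch event counts.
val events = Seq(3, 1, 4, 1, 5, 9)

// Window length of 3 batches, slide interval of 1 batch,
// summing each window as a windowed reduce would.
val windowSums = events.sliding(3, 1).map(_.sum).toList
```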
    • Lesson 06 - Spark ML Programming
      • 6.1 Introduction
      • 6.2 Objectives
      • 6.3 Introduction to Machine Learning
      • 6.4 Common Terminologies in Machine Learning
      • 6.5 Applications of Machine Learning
      • 6.6 Machine Learning in Spark
      • 6.7 Spark ML API
      • 6.8 DataFrames
      • 6.9 Transformers and Estimators
      • 6.10 Pipeline
      • 6.11 Working of a Pipeline
      • 6.12 Working of a Pipeline (contd.)
      • 6.13 DAG Pipelines
      • 6.14 Runtime Checking
      • 6.15 Parameter Passing
      • 6.16 General Machine Learning Pipeline-Example
      • 6.17 General Machine Learning Pipeline-Example (contd.)
      • 6.18 Model Selection via Cross-Validation
      • 6.19 Supported Types, Algorithms, and Utilities
      • 6.20 Data Types
      • 6.21 Feature Extraction and Basic Statistics
      • 6.22 Clustering
      • 6.23 K-Means
      • 6.24 K-Means (contd.)
      • 6.25 Demo-Perform Clustering Using K-Means
      • 6.26 Perform Clustering Using K-Means
      • 6.27 Gaussian Mixture
      • 6.28 Power Iteration Clustering (PIC)
      • 6.29 Latent Dirichlet Allocation (LDA)
      • 6.30 Latent Dirichlet Allocation (LDA) (contd.)
      • 6.31 Collaborative Filtering
      • 6.32 Classification
      • 6.33 Classification (contd.)
      • 6.34 Regression
      • 6.35 Example of Regression
      • 6.36 Demo-Perform Classification Using Linear Regression
      • 6.37 Perform Classification Using Linear Regression
      • 6.38 Demo-Run Linear Regression
      • 6.39 Run Linear Regression
      • 6.40 Demo-Perform Recommendation Using Collaborative Filtering
      • 6.41 Perform Recommendation Using Collaborative Filtering
      • 6.42 Demo-Run Recommendation System
      • 6.43 Run Recommendation System
      • 6.44 Quiz
      • 6.45 Summary
      • 6.46 Summary (contd.)
      • 6.47 Conclusion
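One assignment-and-update step of the K-Means algorithm from Lesson 6, written in plain Scala to show what MLlib automates over distributed data; the one-dimensional points and starting centroids are made-up values:

```scala
// Made-up 1-D data: two obvious clusters.
val points = Seq(1.0, 2.0, 10.0, 11.0)
val centroids = Seq(0.0, 12.0)

// Step 1: assign each point to its nearest centroid.
val assigned: Map[Double, Seq[Double]] =
  points.groupBy(p => centroids.minBy(c => math.abs(p - c)))

// Step 2: move each centroid to the mean of its assigned points.
val updated = assigned.map { case (_, ps) => ps.sum / ps.size }.toList.sorted
```

Repeating the two steps until the centroids stop moving is the whole algorithm; MLlib adds initialization strategies and distributed execution.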
    • Lesson 07 - Spark GraphX Programming
      • 7.1 Introduction
      • 7.2 Objectives
      • 7.3 Introduction to Graph-Parallel System
      • 7.4 Limitations of Graph-Parallel System
      • 7.5 Introduction to GraphX
      • 7.6 Introduction to GraphX (contd.)
      • 7.7 Importing GraphX
      • 7.8 The Property Graph
      • 7.9 The Property Graph (contd.)
      • 7.10 Features of the Property Graph
      • 7.11 Creating a Graph
      • 7.12 Demo-Create a Graph Using GraphX
      • 7.13 Create a Graph Using GraphX
      • 7.14 Triplet View
      • 7.15 Graph Operators
      • 7.16 List of Operators
      • 7.17 List of Operators (contd.)
      • 7.18 Property Operators
      • 7.19 Structural Operators
      • 7.20 Subgraphs
      • 7.21 Join Operators
      • 7.22 Demo-Perform Graph Operations Using GraphX
      • 7.23 Perform Graph Operations Using GraphX
      • 7.24 Demo-Perform Subgraph Operations
      • 7.25 Perform Subgraph Operations
      • 7.26 Neighborhood Aggregation
      • 7.27 mapReduceTriplets
      • 7.28 Demo-Perform MapReduce Operations
      • 7.29 Perform MapReduce Operations
      • 7.30 Counting Degree of Vertex
      • 7.31 Collecting Neighbors
      • 7.32 Caching and Uncaching
      • 7.33 Graph Builders
      • 7.34 Vertex and Edge RDDs
      • 7.35 Graph System Optimizations
      • 7.36 Built-in Algorithms
      • 7.37 Quiz
      • 7.38 Summary
      • 7.39 Summary (contd.)
      • 7.40 Conclusion
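GraphX (Lesson 7) stores a property graph as vertex and edge RDDs and provides degree-counting operators. Counting degrees over a plain edge list captures the same idea without a cluster; the three-edge graph below is an invented example:

```scala
// A tiny directed edge list: (source vertex id, destination vertex id).
val edges = Seq((1L, 2L), (2L, 3L), (1L, 3L))

// Out-degree: how many edges leave each vertex.
val outDegrees: Map[Long, Int] =
  edges.groupBy(_._1).map { case (v, es) => (v, es.size) }

// Total degree: count both endpoints of every edge.
val degrees: Map[Long, Int] =
  edges
    .flatMap { case (a, b) => Seq(a, b) }
    .groupBy(identity)
    .map { case (v, vs) => (v, vs.size) }
```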
