
Apache Spark & Scala

simplilearn
Online

Price on request

Important information

  • Course
  • Online
Description

Simplilearn is the world's largest certification training provider, with more than 400,000 professionals trained globally.
Trusted by Fortune 500 companies as their learning provider for career growth and training.
More than 2,000 certified and experienced trainers conduct training for various courses across the globe.
All our courses are designed and developed under a tried and tested Unique Learning Framework that is proven to deliver a 98.6% pass rate on the first attempt.

Opinions

B

17/03/2014
What I would highlight: Good course on PRINCE2; well organized by Simplilearn.

Would you recommend this course? Yes.
P

12/03/2014
What I would highlight: Excellent training done by the Simplilearn team.

Would you recommend this course? Yes.

What you'll learn on the course

Apache

Course programme

Course Agenda
  • Apache Spark & Scala
    • Lesson 00 - Course Overview
      • 0.1 Introduction
      • 0.2 Course Objectives
      • 0.3 Course Overview
      • 0.4 Target Audience
      • 0.5 Course Prerequisites
      • 0.6 Value to the Professionals
      • 0.7 Value to the Professionals (contd.)
      • 0.8 Value to the Professionals (contd.)
      • 0.9 Lessons Covered
      • 0.10 Conclusion
    • Lesson 01 - Introduction to Spark
      • 1.1 Introduction
      • 1.2 Objectives
      • 1.3 Evolution of Distributed Systems
      • 1.4 Need for New-Generation Distributed Systems
      • 1.5 Limitations of MapReduce in Hadoop
      • 1.6 Limitations of MapReduce in Hadoop (contd.)
      • 1.7 Batch vs. Real-Time Processing
      • 1.8 Application of Stream Processing
      • 1.9 Application of In-Memory Processing
      • 1.10 Introduction to Apache Spark
      • 1.11 Components of a Spark Project
      • 1.12 History of Spark
      • 1.13 Language Flexibility in Spark
      • 1.14 Spark Execution Architecture
      • 1.15 Automatic Parallelization of Complex Flows
      • 1.16 Automatic Parallelization of Complex Flows-Important Points
      • 1.17 APIs That Match User Goals
      • 1.18 Apache Spark-A Unified Platform of Big Data Apps
      • 1.19 More Benefits of Apache Spark
      • 1.20 Running Spark in Different Modes
      • 1.21 Installing Spark as a Standalone Cluster-Configurations
      • 1.22 Installing Spark as a Standalone Cluster-Configurations (contd.)
      • 1.23 Demo-Install Apache Spark
      • 1.24 Demo-Install Apache Spark
      • 1.25 Overview of Spark on a Cluster
      • 1.26 Tasks of Spark on a Cluster
      • 1.27 Companies Using Spark-Use Cases
      • 1.28 Hadoop Ecosystem vs. Apache Spark
      • 1.29 Hadoop Ecosystem vs. Apache Spark (contd.)
      • 1.30 Quiz
      • 1.31 Summary
      • 1.32 Summary (contd.)
      • 1.33 Conclusion
    • Lesson 02 - Introduction to Programming in Scala
      • 2.1 Introduction
      • 2.2 Objectives
      • 2.3 Introduction to Scala
      • 2.4 Features of Scala
      • 2.5 Basic Data Types
      • 2.6 Basic Literals
      • 2.7 Basic Literals (contd.)
      • 2.8 Basic Literals (contd.)
      • 2.9 Introduction to Operators
      • 2.10 Types of Operators
      • 2.11 Use Basic Literals and the Arithmetic Operator
      • 2.12 Demo-Use Basic Literals and the Arithmetic Operator
      • 2.13 Use the Logical Operator
      • 2.14 Demo-Use the Logical Operator
      • 2.15 Introduction to Type Inference
      • 2.16 Type Inference for Recursive Methods
      • 2.17 Type Inference for Polymorphic Methods and Generic Classes
      • 2.18 Unreliability of the Type Inference Mechanism
      • 2.19 Mutable Collection vs. Immutable Collection
      • 2.20 Functions
      • 2.21 Anonymous Functions
      • 2.22 Objects
      • 2.23 Classes
      • 2.24 Use Type Inference, Functions, Anonymous Function, and Class
      • 2.25 Demo-Use Type Inference, Functions, Anonymous Function, and Class
      • 2.26 Traits as Interfaces
      • 2.27 Traits-Example
      • 2.28 Collections
      • 2.29 Types of Collections
      • 2.30 Types of Collections (contd.)
      • 2.31 Lists
      • 2.32 Perform Operations on Lists
      • 2.33 Demo-Use Data Structures
      • 2.34 Maps
      • 2.35 Maps-Operations
      • 2.36 Pattern Matching
      • 2.37 Implicits
      • 2.38 Implicits (contd.)
      • 2.39 Streams
      • 2.40 Use Data Structures
      • 2.41 Demo-Perform Operations on Lists
      • 2.42 Quiz
      • 2.43 Summary
      • 2.44 Summary (contd.)
      • 2.45 Conclusion
    • Lesson 03 - Using RDD for Creating Applications in Spark
      • 3.1 Introduction
      • 3.2 Objectives
      • 3.3 RDDs API
      • 3.4 Features of RDDs
      • 3.5 Creating RDDs
      • 3.6 Creating RDDs—Referencing an External Dataset
      • 3.7 Referencing an External Dataset—Text Files
      • 3.8 Referencing an External Dataset—Text Files (contd.)
      • 3.9 Referencing an External Dataset—Sequence Files
      • 3.10 Referencing an External Dataset—Other Hadoop Input Formats
      • 3.11 Creating RDDs—Important Points
      • 3.12 RDD Operations
      • 3.13 RDD Operations—Transformations
      • 3.14 Features of RDD Persistence
      • 3.15 Storage Levels of RDD Persistence
      • 3.16 Choosing the Correct RDD Persistence Storage Level
      • 3.17 Invoking the Spark Shell
      • 3.18 Importing Spark Classes
      • 3.19 Creating the SparkContext
      • 3.20 Loading a File in Shell
      • 3.21 Performing Some Basic Operations on Files in Spark Shell RDDs
      • 3.22 Packaging a Spark Project with SBT
      • 3.23 Running a Spark Project With SBT
      • 3.24 Demo-Build a Scala Project
      • 3.25 Build a Scala Project
      • 3.26 Demo-Build a Spark Java Project
      • 3.27 Build a Spark Java Project
      • 3.28 Shared Variables—Broadcast
      • 3.29 Shared Variables—Accumulators
      • 3.30 Writing a Scala Application
      • 3.31 Demo-Run a Scala Application
      • 3.32 Run a Scala Application
      • 3.33 Demo-Write a Scala Application Reading the Hadoop Data
      • 3.34 Write a Scala Application Reading the Hadoop Data
      • 3.35 Demo-Run a Scala Application Reading the Hadoop Data
      • 3.36 Run a Scala Application Reading the Hadoop Data
      • 3.37 Scala RDD Extensions
      • 3.38 DoubleRDD Methods
      • 3.39 PairRDD Methods—Join
      • 3.40 PairRDD Methods—Others
      • 3.41 Java PairRDD Methods
      • 3.42 Java PairRDD Methods (contd.)
      • 3.43 General RDD Methods
      • 3.44 General RDD Methods (contd.)
      • 3.45 Java RDD Methods
      • 3.46 Java RDD Methods (contd.)
      • 3.47 Common Java RDD Methods
      • 3.48 Spark Java Function Classes
      • 3.49 Method for Combining JavaPairRDD Functions
      • 3.50 Transformations in RDD
      • 3.51 Other Methods
      • 3.52 Actions in RDD
      • 3.53 Key-Value Pair RDD in Scala
      • 3.54 Key-Value Pair RDD in Java
      • 3.55 Using MapReduce and Pair RDD Operations
      • 3.56 Reading Text File from HDFS
      • 3.57 Reading Sequence File from HDFS
      • 3.58 Writing Text Data to HDFS
      • 3.59 Writing Sequence File to HDFS
      • 3.60 Using GroupBy
      • 3.61 Using GroupBy (contd.)
      • 3.62 Demo-Run a Scala Application Performing GroupBy Operation
      • 3.63 Run a Scala Application Performing GroupBy Operation
      • 3.64 Demo-Run a Scala Application Using the Scala Shell
      • 3.65 Run a Scala Application Using the Scala Shell
      • 3.66 Demo-Write and Run a Java Application
      • 3.67 Write and Run a Java Application
      • 3.68 Quiz
      • 3.69 Summary
      • 3.70 Summary (contd.)
      • 3.71 Conclusion
    • Lesson 04 - Running SQL Queries Using Spark SQL
      • 4.1 Introduction
      • 4.2 Objectives
      • 4.3 Importance of Spark SQL
      • 4.4 Benefits of Spark SQL
      • 4.5 DataFrames
      • 4.6 SQLContext
      • 4.7 SQLContext (contd.)
      • 4.8 Creating a DataFrame
      • 4.9 Using DataFrame Operations
      • 4.10 Using DataFrame Operations (contd.)
      • 4.11 Demo-Run Spark SQL with a DataFrame
      • 4.12 Run Spark SQL with a DataFrame
      • 4.13 Interoperating with RDDs
      • 4.14 Using the Reflection-Based Approach
      • 4.15 Using the Reflection-Based Approach (contd.)
      • 4.16 Using the Programmatic Approach
      • 4.17 Using the Programmatic Approach (contd.)
      • 4.18 Demo-Run Spark SQL Programmatically
      • 4.19 Run Spark SQL Programmatically
      • 4.20 Data Sources
      • 4.21 Save Modes
      • 4.22 Saving to Persistent Tables
      • 4.23 Parquet Files
      • 4.24 Partition Discovery
      • 4.25 Schema Merging
      • 4.26 JSON Data
      • 4.27 Hive Table
      • 4.28 DML Operation-Hive Queries
      • 4.29 Demo-Run Hive Queries Using Spark SQL
      • 4.30 Run Hive Queries Using Spark SQL
      • 4.31 JDBC to Other Databases
      • 4.32 Supported Hive Features
      • 4.33 Supported Hive Features (contd.)
      • 4.34 Supported Hive Data Types
      • 4.35 Case Classes
      • 4.36 Case Classes (contd.)
      • 4.37 Quiz
      • 4.38 Summary
      • 4.39 Summary (contd.)
      • 4.40 Conclusion
    • Lesson 05 - Spark Streaming
      • 5.1 Introduction
      • 5.2 Objectives
      • 5.3 Introduction to Spark Streaming
      • 5.4 Working of Spark Streaming
      • 5.5 Features of Spark Streaming
      • 5.6 Streaming Word Count
      • 5.7 Micro Batch
      • 5.8 DStreams
      • 5.9 DStreams (contd.)
      • 5.10 Input DStreams and Receivers
      • 5.11 Input DStreams and Receivers (contd.)
      • 5.12 Basic Sources
      • 5.13 Advanced Sources
      • 5.14 Advanced Sources-Twitter
      • 5.15 Transformations on DStreams
      • 5.16 Transformations on DStreams (contd.)
      • 5.17 Output Operations on DStreams
      • 5.18 Design Patterns for Using foreachRDD
      • 5.19 DataFrame and SQL Operations
      • 5.20 DataFrame and SQL Operations (contd.)
      • 5.21 Checkpointing
      • 5.22 Enabling Checkpointing
      • 5.23 Socket Stream
      • 5.24 File Stream
      • 5.25 Stateful Operations
      • 5.26 Window Operations
      • 5.27 Types of Window Operations
      • 5.28 Types of Window Operations (contd.)
      • 5.29 Join Operations-Stream-Dataset Joins
      • 5.30 Join Operations-Stream-Stream Joins
      • 5.31 Monitoring Spark Streaming Application
      • 5.32 Performance Tuning-High Level
      • 5.33 Performance Tuning-Detail Level
      • 5.34 Demo-Capture and Process the Netcat Data
      • 5.35 Capture and Process the Netcat Data
      • 5.36 Demo-Capture and Process the Flume Data
      • 5.37 Capture and Process the Flume Data
      • 5.38 Demo-Capture the Twitter Data
      • 5.39 Capture the Twitter Data
      • 5.40 Quiz
      • 5.41 Summary
      • 5.42 Summary (contd.)
      • 5.43 Conclusion
    • Lesson 06 - Spark ML Programming
      • 6.1 Introduction
      • 6.2 Objectives
      • 6.3 Introduction to Machine Learning
      • 6.4 Common Terminologies in Machine Learning
      • 6.5 Applications of Machine Learning
      • 6.6 Machine Learning in Spark
      • 6.7 Spark ML API
      • 6.8 DataFrames
      • 6.9 Transformers and Estimators
      • 6.10 Pipeline
      • 6.11 Working of a Pipeline
      • 6.12 Working of a Pipeline (contd.)
      • 6.13 DAG Pipelines
      • 6.14 Runtime Checking
      • 6.15 Parameter Passing
      • 6.16 General Machine Learning Pipeline-Example
      • 6.17 General Machine Learning Pipeline-Example (contd.)
      • 6.18 Model Selection via Cross-Validation
      • 6.19 Supported Types, Algorithms, and Utilities
      • 6.20 Data Types
      • 6.21 Feature Extraction and Basic Statistics
      • 6.22 Clustering
      • 6.23 K-Means
      • 6.24 K-Means (contd.)
      • 6.25 Demo-Perform Clustering Using K-Means
      • 6.26 Perform Clustering Using K-Means
      • 6.27 Gaussian Mixture
      • 6.28 Power Iteration Clustering (PIC)
      • 6.29 Latent Dirichlet Allocation (LDA)
      • 6.30 Latent Dirichlet Allocation (LDA) (contd.)
      • 6.31 Collaborative Filtering
      • 6.32 Classification
      • 6.33 Classification (contd.)
      • 6.34 Regression
      • 6.35 Example of Regression
      • 6.36 Demo-Perform Classification Using Linear Regression
      • 6.37 Perform Classification Using Linear Regression
      • 6.38 Demo-Run Linear Regression
      • 6.39 Run Linear Regression
      • 6.40 Demo-Perform Recommendation Using Collaborative Filtering
      • 6.41 Perform Recommendation Using Collaborative Filtering
      • 6.42 Demo-Run Recommendation System
      • 6.43 Run Recommendation System
      • 6.44 Quiz
      • 6.45 Summary
      • 6.46 Summary (contd.)
      • 6.47 Conclusion
    • Lesson 07 - Spark GraphX Programming
      • 7.1 Introduction
      • 7.2 Objectives
      • 7.3 Introduction to Graph-Parallel System
      • 7.4 Limitations of Graph-Parallel System
      • 7.5 Introduction to GraphX
      • 7.6 Introduction to GraphX (contd.)
      • 7.7 Importing GraphX
      • 7.8 The Property Graph
      • 7.9 The Property Graph (contd.)
      • 7.10 Features of the Property Graph
      • 7.11 Creating a Graph
      • 7.12 Demo-Create a Graph Using GraphX
      • 7.13 Create a Graph Using GraphX
      • 7.14 Triplet View
      • 7.15 Graph Operators
      • 7.16 List of Operators
      • 7.17 List of Operators (contd.)
      • 7.18 Property Operators
      • 7.19 Structural Operators
      • 7.20 Subgraphs
      • 7.21 Join Operators
      • 7.22 Demo-Perform Graph Operations Using GraphX
      • 7.23 Perform Graph Operations Using GraphX
      • 7.24 Demo-Perform Subgraph Operations
      • 7.25 Perform Subgraph Operations
      • 7.26 Neighborhood Aggregation
      • 7.27 mapReduceTriplets
      • 7.28 Demo-Perform MapReduce Operations
      • 7.29 Perform MapReduce Operations
      • 7.30 Counting Degree of Vertex
      • 7.31 Collecting Neighbors
      • 7.32 Caching and Uncaching
      • 7.33 Graph Builders
      • 7.34 Vertex and Edge RDDs
      • 7.35 Graph System Optimizations
      • 7.36 Built-in Algorithms
      • 7.37 Quiz
      • 7.38 Summary
      • 7.39 Summary (contd.)
      • 7.40 Conclusion
