Apache Spark & Scala

Simplilearn
Online

Price on request

Important information

  • Course
  • Online
Description

Simplilearn is the world’s largest certification training provider, with more than 400,000 professionals trained globally.
Fortune 500 companies trust it as their learning provider for career growth and training.
More than 2,000 certified and experienced trainers conduct training for various courses across the globe.
All our courses are designed and developed under a tried and tested Unique Learning Framework that is proven to deliver a 98.6% pass rate on the first attempt.

Opinions

Binu Nair
17/03/2014
What I would highlight: Good course on PRINCE2; well organized by Simplilearn.

Would you recommend this course? Yes.

Pramukh N Vasist
12/03/2014
What I would highlight: Excellent training done by the Simplilearn team.

Would you recommend this course? Yes.

What you'll learn on the course

Apache

Course programme

Course Agenda
  • Apache Spark & Scala
    • Lesson 00 - Course Overview
      • 0.1 Introduction
      • 0.2 Course Objectives
      • 0.3 Course Overview
      • 0.4 Target Audience
      • 0.5 Course Prerequisites
      • 0.6 Value to the Professionals
      • 0.7 Value to the Professionals (contd.)
      • 0.8 Value to the Professionals (contd.)
      • 0.9 Lessons Covered
      • 0.10 Conclusion
    • Lesson 01 - Introduction to Spark
      • 1.1 Introduction
      • 1.2 Objectives
      • 1.3 Evolution of Distributed Systems
      • 1.4 Need for New-Generation Distributed Systems
      • 1.5 Limitations of MapReduce in Hadoop
      • 1.6 Limitations of MapReduce in Hadoop (contd.)
      • 1.7 Batch vs. Real-Time Processing
      • 1.8 Application of Stream Processing
      • 1.9 Application of In-Memory Processing
      • 1.10 Introduction to Apache Spark
      • 1.11 Components of a Spark Project
      • 1.12 History of Spark
      • 1.13 Language Flexibility in Spark
      • 1.14 Spark Execution Architecture
      • 1.15 Automatic Parallelization of Complex Flows
      • 1.16 Automatic Parallelization of Complex Flows-Important Points
      • 1.17 APIs That Match User Goals
      • 1.18 Apache Spark-A Unified Platform of Big Data Apps
      • 1.19 More Benefits of Apache Spark
      • 1.20 Running Spark in Different Modes
      • 1.21 Installing Spark as a Standalone Cluster-Configurations
      • 1.22 Installing Spark as a Standalone Cluster-Configurations
      • 1.23 Demo-Install Apache Spark
      • 1.24 Demo-Install Apache Spark
      • 1.25 Overview of Spark on a Cluster
      • 1.26 Tasks of Spark on a Cluster
      • 1.27 Companies Using Spark-Use Cases
      • 1.28 Hadoop Ecosystem vs. Apache Spark
      • 1.29 Hadoop Ecosystem vs. Apache Spark (contd.)
      • 1.30 Quiz
      • 1.31 Summary
      • 1.32 Summary (contd.)
      • 1.33 Conclusion
    • Lesson 02 - Introduction to Programming in Scala
      • 2.1 Introduction
      • 2.2 Objectives
      • 2.3 Introduction to Scala
      • 2.4 Features of Scala
      • 2.5 Basic Data Types
      • 2.6 Basic Literals
      • 2.7 Basic Literals (contd.)
      • 2.8 Basic Literals (contd.)
      • 2.9 Introduction to Operators
      • 2.10 Types of Operators
      • 2.11 Use Basic Literals and the Arithmetic Operator
      • 2.12 Demo-Use Basic Literals and the Arithmetic Operator
      • 2.13 Use the Logical Operator
      • 2.14 Demo-Use the Logical Operator
      • 2.15 Introduction to Type Inference
      • 2.16 Type Inference for Recursive Methods
      • 2.17 Type Inference for Polymorphic Methods and Generic Classes
      • 2.18 Unreliability of the Type Inference Mechanism
      • 2.19 Mutable Collection vs. Immutable Collection
      • 2.20 Functions
      • 2.21 Anonymous Functions
      • 2.22 Objects
      • 2.23 Classes
      • 2.24 Use Type Inference, Functions, Anonymous Function, and Class
      • 2.25 Demo-Use Type Inference, Functions, Anonymous Function, and Class
      • 2.26 Traits as Interfaces
      • 2.27 Traits-Example
      • 2.28 Collections
      • 2.29 Types of Collections
      • 2.30 Types of Collections (contd.)
      • 2.31 Lists
      • 2.32 Perform Operations on Lists
      • 2.33 Demo-Use Data Structures
      • 2.34 Maps
      • 2.35 Maps-Operations
      • 2.36 Pattern Matching
      • 2.37 Implicits
      • 2.38 Implicits (contd.)
      • 2.39 Streams
      • 2.40 Use Data Structures
      • 2.41 Demo-Perform Operations on Lists
      • 2.42 Quiz
      • 2.43 Summary
      • 2.44 Summary (contd.)
      • 2.45 Conclusion
    • Lesson 03 - Using RDD for Creating Applications in Spark
      • 3.1 Introduction
      • 3.2 Objectives
      • 3.3 RDDs API
      • 3.4 Features of RDDs
      • 3.5 Creating RDDs
      • 3.6 Creating RDDs—Referencing an External Dataset
      • 3.7 Referencing an External Dataset—Text Files
      • 3.8 Referencing an External Dataset—Text Files (contd.)
      • 3.9 Referencing an External Dataset—Sequence Files
      • 3.10 Referencing an External Dataset—Other Hadoop Input Formats
      • 3.11 Creating RDDs—Important Points
      • 3.12 RDD Operations
      • 3.13 RDD Operations—Transformations
      • 3.14 Features of RDD Persistence
      • 3.15 Storage Levels of RDD Persistence
      • 3.16 Choosing the Correct RDD Persistence Storage Level
      • 3.17 Invoking the Spark Shell
      • 3.18 Importing Spark Classes
      • 3.19 Creating the SparkContext
      • 3.20 Loading a File in Shell
      • 3.21 Performing Some Basic Operations on Files in Spark Shell RDDs
      • 3.22 Packaging a Spark Project with SBT
      • 3.23 Running a Spark Project With SBT
      • 3.24 Demo-Build a Scala Project
      • 3.25 Build a Scala Project
      • 3.26 Demo-Build a Spark Java Project
      • 3.27 Build a Spark Java Project
      • 3.28 Shared Variables—Broadcast
      • 3.29 Shared Variables—Accumulators
      • 3.30 Writing a Scala Application
      • 3.31 Demo-Run a Scala Application
      • 3.32 Run a Scala Application
      • 3.33 Demo-Write a Scala Application Reading the Hadoop Data
      • 3.34 Write a Scala Application Reading the Hadoop Data
      • 3.35 Demo-Run a Scala Application Reading the Hadoop Data
      • 3.36 Run a Scala Application Reading the Hadoop Data
      • 3.37 Scala RDD Extensions
      • 3.38 DoubleRDD Methods
      • 3.39 PairRDD Methods—Join
      • 3.40 PairRDD Methods—Others
      • 3.41 Java PairRDD Methods
      • 3.42 Java PairRDD Methods (contd.)
      • 3.43 General RDD Methods
      • 3.44 General RDD Methods (contd.)
      • 3.45 Java RDD Methods
      • 3.46 Java RDD Methods (contd.)
      • 3.47 Common Java RDD Methods
      • 3.48 Spark Java Function Classes
      • 3.49 Method for Combining JavaPairRDD Functions
      • 3.50 Transformations in RDD
      • 3.51 Other Methods
      • 3.52 Actions in RDD
      • 3.53 Key-Value Pair RDD in Scala
      • 3.54 Key-Value Pair RDD in Java
      • 3.55 Using MapReduce and Pair RDD Operations
      • 3.56 Reading Text File from HDFS
      • 3.57 Reading Sequence File from HDFS
      • 3.58 Writing Text Data to HDFS
      • 3.59 Writing Sequence File to HDFS
      • 3.60 Using GroupBy
      • 3.61 Using GroupBy (contd.)
      • 3.62 Demo-Run a Scala Application Performing GroupBy Operation
      • 3.63 Run a Scala Application Performing GroupBy Operation
      • 3.64 Demo-Run a Scala Application Using the Scala Shell
      • 3.65 Run a Scala Application Using the Scala Shell
      • 3.66 Demo-Write and Run a Java Application
      • 3.67 Write and Run a Java Application
      • 3.68 Quiz
      • 3.69 Summary
      • 3.70 Summary (contd.)
      • 3.71 Conclusion
    • Lesson 04 - Running SQL Queries Using Spark SQL
      • 4.1 Introduction
      • 4.2 Objectives
      • 4.3 Importance of Spark SQL
      • 4.4 Benefits of Spark SQL
      • 4.5 DataFrames
      • 4.6 SQLContext
      • 4.7 SQLContext (contd.)
      • 4.8 Creating a DataFrame
      • 4.9 Using DataFrame Operations
      • 4.10 Using DataFrame Operations (contd.)
      • 4.11 Demo-Run Spark SQL with a DataFrame
      • 4.12 Run Spark SQL with a DataFrame
      • 4.13 Interoperating with RDDs
      • 4.14 Using the Reflection-Based Approach
      • 4.15 Using the Reflection-Based Approach (contd.)
      • 4.16 Using the Programmatic Approach
      • 4.17 Using the Programmatic Approach (contd.)
      • 4.18 Demo-Run Spark SQL Programmatically
      • 4.19 Run Spark SQL Programmatically
      • 4.20 Data Sources
      • 4.21 Save Modes
      • 4.22 Saving to Persistent Tables
      • 4.23 Parquet Files
      • 4.24 Partition Discovery
      • 4.25 Schema Merging
      • 4.26 JSON Data
      • 4.27 Hive Table
      • 4.28 DML Operation-Hive Queries
      • 4.29 Demo-Run Hive Queries Using Spark SQL
      • 4.30 Run Hive Queries Using Spark SQL
      • 4.31 JDBC to Other Databases
      • 4.32 Supported Hive Features
      • 4.33 Supported Hive Features (contd.)
      • 4.34 Supported Hive Data Types
      • 4.35 Case Classes
      • 4.36 Case Classes (contd.)
      • 4.37 Quiz
      • 4.38 Summary
      • 4.39 Summary (contd.)
      • 4.40 Conclusion
    • Lesson 05 - Spark Streaming
      • 5.1 Introduction
      • 5.2 Objectives
      • 5.3 Introduction to Spark Streaming
      • 5.4 Working of Spark Streaming
      • 5.5 Features of Spark Streaming
      • 5.6 Streaming Word Count
      • 5.7 Micro Batch
      • 5.8 DStreams
      • 5.9 DStreams (contd.)
      • 5.10 Input DStreams and Receivers
      • 5.11 Input DStreams and Receivers (contd.)
      • 5.12 Basic Sources
      • 5.13 Advanced Sources
      • 5.14 Advanced Sources-Twitter
      • 5.15 Transformations on DStreams
      • 5.16 Transformations on DStreams (contd.)
      • 5.17 Output Operations on DStreams
      • 5.18 Design Patterns for Using ForeachRDD
      • 5.19 DataFrame and SQL Operations
      • 5.20 DataFrame and SQL Operations (contd.)
      • 5.21 Checkpointing
      • 5.22 Enabling Checkpointing
      • 5.23 Socket Stream
      • 5.24 File Stream
      • 5.25 Stateful Operations
      • 5.26 Window Operations
      • 5.27 Types of Window Operations
      • 5.28 Types of Window Operations (contd.)
      • 5.29 Join Operations-Stream-Dataset Joins
      • 5.30 Join Operations-Stream-Stream Joins
      • 5.31 Monitoring Spark Streaming Application
      • 5.32 Performance Tuning-High Level
      • 5.33 Performance Tuning-Detail Level
      • 5.34 Demo-Capture and Process the Netcat Data
      • 5.35 Capture and Process the Netcat Data
      • 5.36 Demo-Capture and Process the Flume Data
      • 5.37 Capture and Process the Flume Data
      • 5.38 Demo-Capture the Twitter Data
      • 5.39 Capture the Twitter Data
      • 5.40 Quiz
      • 5.41 Summary
      • 5.42 Summary (contd.)
      • 5.43 Conclusion
    • Lesson 06 - Spark ML Programming
      • 6.1 Introduction
      • 6.2 Objectives
      • 6.3 Introduction to Machine Learning
      • 6.4 Common Terminologies in Machine Learning
      • 6.5 Applications of Machine Learning
      • 6.6 Machine Learning in Spark
      • 6.7 Spark ML API
      • 6.8 DataFrames
      • 6.9 Transformers and Estimators
      • 6.10 Pipeline
      • 6.11 Working of a Pipeline
      • 6.12 Working of a Pipeline (contd.)
      • 6.13 DAG Pipelines
      • 6.14 Runtime Checking
      • 6.15 Parameter Passing
      • 6.16 General Machine Learning Pipeline-Example
      • 6.17 General Machine Learning Pipeline-Example (contd.)
      • 6.18 Model Selection via Cross-Validation
      • 6.19 Supported Types, Algorithms, and Utilities
      • 6.20 Data Types
      • 6.21 Feature Extraction and Basic Statistics
      • 6.22 Clustering
      • 6.23 K-Means
      • 6.24 K-Means (contd.)
      • 6.25 Demo-Perform Clustering Using K-Means
      • 6.26 Perform Clustering Using K-Means
      • 6.27 Gaussian Mixture
      • 6.28 Power Iteration Clustering (PIC)
      • 6.29 Latent Dirichlet Allocation (LDA)
      • 6.30 Latent Dirichlet Allocation (LDA) (contd.)
      • 6.31 Collaborative Filtering
      • 6.32 Classification
      • 6.33 Classification (contd.)
      • 6.34 Regression
      • 6.35 Example of Regression
      • 6.36 Demo-Perform Classification Using Linear Regression
      • 6.37 Perform Classification Using Linear Regression
      • 6.38 Demo-Run Linear Regression
      • 6.39 Run Linear Regression
      • 6.40 Demo-Perform Recommendation Using Collaborative Filtering
      • 6.41 Perform Recommendation Using Collaborative Filtering
      • 6.42 Demo-Run Recommendation System
      • 6.43 Run Recommendation System
      • 6.44 Quiz
      • 6.45 Summary
      • 6.46 Summary (contd.)
      • 6.47 Conclusion
    • Lesson 07 - Spark GraphX Programming
      • 7.001 Introduction
      • 7.002 Objectives
      • 7.003 Introduction to Graph-Parallel System
      • 7.004 Limitations of Graph-Parallel System
      • 7.005 Introduction to GraphX
      • 7.006 Introduction to GraphX (contd.)
      • 7.007 Importing GraphX
      • 7.008 The Property Graph
      • 7.009 The Property Graph (contd.)
      • 7.010 Features of the Property Graph
      • 7.011 Creating a Graph
      • 7.012 Demo-Create a Graph Using GraphX
      • 7.013 Create a Graph Using GraphX
      • 7.014 Triplet View
      • 7.015 Graph Operators
      • 7.016 List of Operators
      • 7.017 List of Operators (contd.)
      • 7.018 Property Operators
      • 7.019 Structural Operators
      • 7.020 Subgraphs
      • 7.021 Join Operators
      • 7.022 Demo-Perform Graph Operations Using GraphX
      • 7.023 Perform Graph Operations Using GraphX
      • 7.024 Demo-Perform Subgraph Operations
      • 7.025 Perform Subgraph Operations
      • 7.026 Neighborhood Aggregation
      • 7.027 mapReduceTriplets
      • 7.028 Demo-Perform MapReduce Operations
      • 7.029 Perform MapReduce Operations
      • 7.030 Counting Degree of Vertex
      • 7.031 Collecting Neighbors
      • 7.032 Caching and Uncaching
      • 7.033 Graph Builders
      • 7.034 Vertex and Edge RDDs
      • 7.035 Graph System Optimizations
      • 7.036 Built-in Algorithms
      • 7.037 Quiz
      • 7.038 Summary
      • 7.039 Summary (contd.)
      • 7.040 Conclusion
