Advanced Spark for Data Science and Data Engineering

Start date: 04/22/2022 Duration: 4 weeks

Platform: EdxArchive

Course category: Other

University or institution: UC BerkeleyX (University of California, Berkeley)

Instructors: Anthony D. Joseph, Jon Bates

Course homepage: https://www.edx.org/archive/advanced-spark-data-science-data-uc-berkeleyx-cs115x

Course details

Gain a deeper understanding of Spark by learning about its APIs, architecture, and common use cases. This statistics and data analysis course will cover material relevant to both data engineers and data scientists. You’ll learn how Spark efficiently transfers data across the network via its shuffle, details of memory management, optimizations to reduce compute costs, and more. Learners will see several use cases for Spark and will work to solve a variety of real-world problems using public datasets. After taking this course, you should have a thorough understanding of how Spark works and how you can best utilize its APIs to write efficient, scalable code. You’ll also learn about a wide variety of Spark’s APIs, including the APIs in Spark Streaming.

Course syllabus

  • Common use cases for Spark
  • Details of internals like the shuffle, Spark SQL’s Catalyst Optimizer, and Project Tungsten
  • A deep architectural overview
  • Spark Streaming
  • Spark ML

Course summary

Learn common Spark use cases and take a deeper dive into Spark’s architecture and APIs.
