*Start date: Anytime
Duration: Self-paced*

Platform: Udacity | Course category: Statistics and Data Analysis | University or institution: Other universities or institutions

Course homepage: https://www.udacity.com/course/ud651


**Syllabus**

Lesson 1: What is EDA?

Learn about exploratory data analysis (EDA) and its importance, and find out about the course structure and final project.

Lesson 2: Intro to R

EDA, which comes before formal hypothesis testing and modeling, often uses visual methods to analyze and summarize data sets, and R will be our tool for generating those visuals and conducting analyses. In this lesson, we will install RStudio and packages, learn the layout and basic commands of R, practice writing basic R scripts, and inspect data sets.

Lesson 3: Exploring One Variable

We perform EDA to understand the distribution of a variable and to check for anomalies and outliers. Learn how to quantify and visualize individual variables within a data set as we begin to make sense of the diamond data set. We will create histograms and boxplots, transform variables, and examine tradeoffs in visualizations.

Lesson 4: Exploring the Relationship of Two Variables

EDA allows us to identify the most important variables and relationships within a data set before building predictive models. In this lesson, we will learn techniques for exploring the relationship between any two variables in a data set.

Lesson 5: Exploring Multiple Variables

Data sets can be complex. In this lesson, we will learn powerful methods and visualizations for examining relationships among multiple variables. We’ll extend our knowledge of previous graphics as we continue to build intuition around the diamond data set.

Lesson 6: Exploring Data Sets

Learn about current tools for EDA, and investigate data alongside an expert. As a final project, you will create your own exploratory data analysis.

Exploratory Data Analysis (EDA) is an approach to data analysis for summarizing and visualizing the important characteristics of a data set. Promoted by John Tukey, EDA focuses on exploring data to understand its underlying structure and variables, develop intuition about the data set, consider how that data set came into existence, and decide how it can be investigated with more formal statistical methods.