Supervised Machine Learning: Classification

Start date: 04/22/2022 Duration: Unknown

Platform: Coursera

Course homepage: https://www.coursera.org/learn/supervised-machine-learning-classification

Course reviews: No reviews

Course details

This course introduces you to one of the main modeling families of supervised machine learning: classification. You will learn how to train predictive models to classify categorical outcomes and how to use error metrics to compare different models. The hands-on section of this course focuses on best practices for classification, including train and test splits and handling data sets with unbalanced classes.

By the end of this course you should be able to:
- Differentiate uses and applications of classification and classification ensembles
- Describe and use logistic regression models
- Describe and use decision tree and tree-ensemble models
- Describe and use other ensemble methods for classification
- Use a variety of error metrics to compare and select the classification model that best suits your data
- Use oversampling and undersampling as techniques to handle unbalanced classes in a data set

Who should take this course?
This course targets aspiring data scientists interested in acquiring hands-on experience with supervised machine learning classification techniques in a business setting.

What skills should you have?
To make the most of this course, you should be familiar with programming in a Python development environment and have a fundamental understanding of data cleaning, exploratory data analysis, calculus, linear algebra, probability, and statistics.

Course outline

Part: 1

Title: Logistic Regression

Description: Logistic regression is one of the most studied and widely used classification algorithms, probably due to its popularity in regulated industries and financial settings. Although more modern classifiers may produce models with higher accuracy, logistic regressions are great baseline models due to their high interpretability and parametric nature. This module will walk you through extending a linear regression example into a logistic regression, as well as the most common error metrics that you might want to use to compare several classifiers and select the one that best suits your business problem.
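As a rough illustration of the error metrics mentioned above, the following sketch computes accuracy, precision, recall, and F1 for a binary classifier from its true and predicted labels. The function names and the sample labels are illustrative assumptions, not material taken from the course.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dict.

    Ratios with a zero denominator are reported as 0.0 (a convention
    chosen for this sketch).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Comparing several classifiers on the same held-out labels with a function like this makes the trade-off between precision and recall visible, which accuracy alone can hide.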

Part: 2

Title: K Nearest Neighbors

Description: K Nearest Neighbors is a popular classification method because it is easy to compute and easy to interpret. This module walks you through the theory behind k nearest neighbors as well as a demo for you to practice building k nearest neighbors models with sklearn.
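The idea behind k nearest neighbors can be sketched in a few lines of plain Python: find the k training points closest to a query and take a majority vote over their labels. This is a minimal standard-library sketch rather than the sklearn demo the module refers to; the function name and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    # Pair each training label with its distance to the query point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only, so labels never need to be comparable.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns the label with the highest vote count.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
prediction_near_origin = knn_predict(points, labels, (0.5, 0.5), k=3)
prediction_far = knn_predict(points, labels, (5.5, 5.5), k=3)
```

In this sketch, features are used as given; in practice the distance computation is sensitive to feature scale, so features are usually standardized first.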

Part: 3

Title: Support Vector Machines

Description: This module will walk you through the main idea of how support vector machines construct hyperplanes to divide your data into regions that concentrate a majority of data points of a certain class. Although support vector machines are widely used for regression, outlier detection, and classification, this module will focus on classification.

Part: 4

Title: Decision Trees

Description: Decision tree methods are a common baseline model for classification tasks due to their visual appeal and high interpretability. This module walks you through the theory behind decision trees and a few hands-on examples of building decision tree models for classification. You will learn the main pros and cons of these techniques. This background will be useful when you are presented with decision tree ensembles in the next module.

Part: 5

Title: Ensemble Models

Description: Ensemble models are a very popular technique, as they can help your models be more resistant to outliers and have a better chance of generalizing to future data. They also gained popularity after several ensembles helped people win prediction competitions. Recently, stochastic gradient boosting became a go-to candidate model for many data scientists.

Part: 6

Title: Modeling Unbalanced Classes

Description: Some classification models are better suited than others to handling outliers, low occurrence of a class, or rare events. The most common methods to add robustness to a classifier rely on stratified sampling to re-balance the training data. This module will walk you through both stratified sampling methods and newer approaches to modeling data sets with unbalanced classes.
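One simple way to re-balance training data is random oversampling: rows from the smaller classes are duplicated at random until every class is as large as the biggest one. The sketch below shows that idea using only the standard library; the function name, the seed parameter, and the toy data are assumptions for illustration, and the course itself may use different tools or techniques.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes match
    the size of the largest class. Returns new (rows, labels) lists."""
    rng = random.Random(seed)  # local RNG so results are repeatable

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical unbalanced data: 8 rows of class 0, 2 rows of class 1.
rows = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
```

Resampling of this kind is normally applied to the training split only, so that the test split still reflects the original class proportions.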

Course tags

Supervised Machine Learning: Classification