Build, Train, and Deploy ML Pipelines using BERT

Start date: 04/22/2022    Duration: At the rate of 5 hours a week, it typically takes 3 weeks to complete course 2.

Platform: Coursera

Course homepage: https://www.coursera.org/learn/ml-pipelines-bert


Course details

In the second course of the Practical Data Science Specialization, you will learn to automate a natural language processing task by building an end-to-end machine learning pipeline using Hugging Face's highly optimized implementation of the state-of-the-art BERT algorithm with Amazon SageMaker Pipelines. Your pipeline will first transform the dataset into BERT-readable features and store those features in the Amazon SageMaker Feature Store. It will then fine-tune a text classification model on the dataset using a Hugging Face pre-trained model, which has learned to understand human language from millions of Wikipedia documents. Finally, your pipeline will evaluate the model's accuracy and deploy the model only if the accuracy exceeds a given threshold.

Practical data science is geared towards handling massive datasets that do not fit on your local hardware and may originate from multiple sources. One of the biggest benefits of developing and running data science projects in the cloud is the agility and elasticity the cloud offers to scale up and out at minimum cost. The Practical Data Science Specialization helps you develop the practical skills to effectively deploy your data science projects and overcome challenges at each step of the ML workflow using Amazon SageMaker. This Specialization is designed for data-focused developers, scientists, and analysts who are familiar with the Python and SQL programming languages and want to learn how to build, train, and deploy scalable, end-to-end ML pipelines, both automated and human-in-the-loop, in the AWS cloud.

Syllabus

Part: 1

Title: Week 1: Feature Engineering and Feature Store

Description: Transform a raw text dataset into machine learning features and store features in a feature store.
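To make the feature-engineering step concrete, here is a toy sketch of what "BERT-readable features" look like: fixed-length `input_ids` and an `attention_mask`. The course itself uses the Hugging Face tokenizer and stores the results in Amazon SageMaker Feature Store; the tiny vocabulary below is invented purely for illustration.

```python
# Illustrative sketch only: the mini-vocabulary is made up; real BERT
# vocabularies have ~30,000 WordPiece entries and special-token ids
# like [CLS]=101 and [SEP]=102.
VOCAB = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
         "this": 7, "product": 8, "is": 9, "great": 10, "bad": 11}

def encode(text, max_len=8):
    """Map raw text to fixed-length input_ids and an attention_mask."""
    tokens = ["[CLS]"] + text.lower().split()[: max_len - 2] + ["[SEP]"]
    ids = [VOCAB.get(t, VOCAB["[UNK]"]) for t in tokens]
    mask = [1] * len(ids)
    # Pad to max_len so every record has the same feature shape,
    # which is what a feature store expects.
    pad = max_len - len(ids)
    return ids + [VOCAB["[PAD]"]] * pad, mask + [0] * pad

ids, mask = encode("this product is great")
print(ids)   # [101, 7, 8, 9, 10, 102, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

Padded positions get a mask of 0 so the model ignores them; that uniform shape is what lets every record share one feature-store schema.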

Part: 2

Title: Week 2: Train, Debug, and Profile a Machine Learning Model

Description: Fine-tune, debug, and profile a pre-trained BERT model.
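The course fine-tunes an actual pre-trained BERT model with Hugging Face on SageMaker. To convey the core idea without those dependencies, the sketch below trains only a classifier head (logistic regression, via gradient descent) on fixed vectors standing in for frozen BERT embeddings; all data and dimensions are invented.

```python
# Conceptual sketch, not the course's implementation: a logistic-regression
# "head" trained on toy 2-d vectors that stand in for BERT embeddings.
import math

def train_head(features, labels, lr=0.5, epochs=200):
    """Gradient descent on a logistic-regression classifier head."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid probability
            g = p - y                        # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0

# Invented "embeddings" for positive (1) and negative (0) reviews.
X = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.9], [0.2, 1.0]]
y = [1, 1, 0, 0]
w, b = train_head(X, y)
print([predict(w, b, x) for x in X])  # [1, 1, 0, 0]
```

Full fine-tuning additionally backpropagates into the pre-trained BERT layers rather than holding them frozen, but the train/evaluate loop has the same shape.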

Part: 3

Title: Week 3: Deploy End-to-End Machine Learning Pipelines

Description: Orchestrate ML workflows and track model lineage and artifacts in an end-to-end machine learning pipeline.
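The deploy-only-above-threshold behavior described above can be sketched as a simple quality gate. In the course this condition is expressed with SageMaker Pipelines condition steps; the function names and the 0.90 threshold below are illustrative assumptions.

```python
# Hedged sketch of the pipeline's quality gate: evaluate accuracy on a
# holdout set and deploy only if it exceeds a chosen threshold.

def evaluate(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def should_deploy(accuracy, threshold=0.90):
    """Gate deployment on the accuracy threshold (illustrative value)."""
    return accuracy > threshold

acc = evaluate([1, 0, 1, 1, 0], [1, 0, 1, 0, 0])  # 4 of 5 correct
print(acc, should_deploy(acc))  # 0.8 False -> model is not deployed
```

Gating on a metric like this keeps a regressed model from silently replacing the one already serving traffic.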


Tags

Practical Data Science
