How to Win a Data Science Competition: Learn from Top Kagglers

Start date: 04/22/2022 Duration: Unknown

Platform: CourseraArchive

Category: Computer Science

University or institution: CourseraNew

Course homepage: https://www.coursera.org/archive/competitive-data-science

Course reviews: no reviews

Course details

If you want to break into competitive data science, then this course is for you! Participating in predictive modelling competitions can help you gain practical experience and improve and harness your data modelling skills in various domains such as credit, insurance, marketing, natural language processing, sales forecasting and computer vision, to name a few. At the same time you get to do it in a competitive context against thousands of participants, each of whom tries to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Being able to achieve high ranks consistently can help you accelerate your career in data science.

In this course, you will learn to analyse and solve such predictive modelling tasks competitively.

When you finish this class, you will:

- Understand how to solve predictive modelling competitions efficiently and learn which of the skills obtained can be applied to real-world tasks.
- Learn how to preprocess the data and generate new features from various sources such as text and images.
- Be taught advanced feature engineering techniques such as generating mean encodings, using aggregated statistical measures or finding nearest neighbors as a means to improve your predictions.
- Be able to form reliable cross-validation methodologies that help you benchmark your solutions and avoid overfitting or underfitting when tested with unobserved (test) data.
- Gain experience of analysing and interpreting the data. You will become aware of inconsistencies, high noise levels, errors and other data-related issues such as leakages, and you will learn how to overcome them.
- Acquire knowledge of different algorithms and learn how to efficiently tune their hyperparameters and achieve top performance.
- Master the art of combining different machine learning models and learn how to ensemble.
- Get exposed to past (winning) solutions and code and learn how to read them.

Disclaimer: This is not a machine learning course in the general sense. This course will teach you how to get high-rank solutions against thousands of competitors, with a focus on the practical usage of machine learning methods rather than the theoretical underpinnings behind them.

Prerequisites:

- Python: work with DataFrames in pandas, plot figures in matplotlib, import and train models from scikit-learn, XGBoost, LightGBM.
- Machine Learning: basic understanding of linear models, K-NN, random forest, gradient boosting and neural networks.

Do you have technical problems? Write to us: coursera@hse.ru
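As an illustration of two items from the list above (mean encodings and validation that guards against leakage), here is a minimal sketch of an out-of-fold mean encoding in pandas and scikit-learn. It is not taken from the course materials: the function name `oof_mean_encode`, the column names and the toy data are all hypothetical, and the fold count and seed are arbitrary choices.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold


def oof_mean_encode(df, cat_col, target_col, n_splits=5, seed=0):
    """Return an out-of-fold mean encoding of `cat_col` (hypothetical helper)."""
    encoded = pd.Series(np.nan, index=df.index, dtype=float)
    global_mean = df[target_col].mean()
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in kf.split(df):
        # Per-category target means are computed on the fitting folds only ...
        means = df.iloc[fit_idx].groupby(cat_col)[target_col].mean()
        # ... and applied to the held-out fold, so no row is encoded
        # with a mean that includes its own target value.
        encoded.iloc[enc_idx] = df.iloc[enc_idx][cat_col].map(means).to_numpy()
    # Categories absent from the fitting folds fall back to the global mean.
    return encoded.fillna(global_mean)


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "city": ["a", "a", "a", "b", "b", "b", "c", "c"],
    "target": [1, 0, 1, 0, 0, 1, 1, 1],
})
toy["city_mean_enc"] = oof_mean_encode(toy, "city", "target", n_splits=4)
print(toy)
```

Computing the category means inside each fold, rather than once on the full table, is one common way to keep the target of a row from leaking into its own feature; other regularisation schemes exist and may suit a given competition better.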

Syllabus

This week we will introduce you to competitive data science. You will learn about the mechanics of competitions, the difference between competitions and real-life data science, and the hardware and software that people usually use in competitions. We will also briefly recap the major ML models frequently used in competitions.

Course tags

Data Science, Kaggle, Data Science Competitions

6 people are following this course
