Data Mining Project

Start date: 08/01/2020 Duration: Unknown

Platform: Coursera

Category: Computer Science

University or institution: CourseraNew

Course homepage: https://www.coursera.org/learn/data-mining-project


Course reviews: no reviews


Course details

Note: You should complete all the other courses in this Specialization before beginning this course.

This six-week Project course of the Data Mining Specialization lets you apply the algorithms and techniques learned in the previous courses of the Specialization, including Pattern Discovery, Clustering, Text Retrieval, Text Mining, and Visualization, to solve interesting real-world data mining challenges. Specifically, you will work on a restaurant review data set from Yelp and use all the knowledge and skills you’ve learned from the previous courses to mine this data set and discover interesting and useful knowledge.

The design of the Project emphasizes:

1) simulating the workflow of a data miner in a real job setting;
2) integrating different mining techniques covered in multiple individual courses;
3) experimenting with different ways to solve a problem to deepen your understanding of the techniques; and
4) allowing you to propose and explore your own ideas creatively.

The goal of the Project is to analyze and mine a large Yelp review data set to discover useful knowledge that helps people make dining decisions. The project will include the following outputs:

1. Opinion visualization: explore and visualize the review content to understand what people have said in those reviews.
2. Cuisine map construction: mine the data set to understand the landscape of different types of cuisines and their similarities.
3. Discovery of popular dishes for a cuisine: mine the data set to discover the common/popular dishes of a particular cuisine.
4. Recommendation of restaurants to help people decide where to dine: mine the data set to rank restaurants for a specific dish and predict the hygiene condition of a restaurant.

From the perspective of users, a cuisine map can help them understand which cuisines exist and see the big picture of all kinds of cuisines and their relations. Once they decide what cuisine to try, they will want to know what the popular dishes of that cuisine are and decide which dishes to have. Finally, they will need to choose a restaurant, so recommending restaurants based on a particular dish would be useful. Predicting the hygiene condition of a restaurant would also be helpful.

By working on these tasks, you will gain experience with a typical workflow in data mining that includes data preprocessing, data exploration, data analysis, improvement of analysis methods, and presentation of results. You will have an opportunity to combine multiple algorithms from different courses to complete a relatively complicated mining task and to experiment with different ways to solve a problem to find the best one. We will suggest specific approaches, but you are highly encouraged to explore your own ideas, since open exploration is, by design, a goal of the Project.

You are required to submit a brief report for each of the tasks for peer grading. A final consolidated report is also required, which will be peer-graded.
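As a rough illustration of the cuisine map task described above, the sketch below compares cuisines by the text of their reviews. It is not the course's prescribed method: the TF-IDF weighting with cosine similarity is one common choice assumed here, and the toy `reviews_by_cuisine` data and all function names are hypothetical stand-ins for the Yelp data set, whose schema is not given in this description.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: cuisine label -> concatenated review text.
# The real project would build this from the Yelp review data set.
reviews_by_cuisine = {
    "Italian": "pasta pizza tomato basil pasta cheese",
    "Mexican": "taco salsa tomato cheese burrito taco",
    "Japanese": "sushi ramen tuna rice sushi",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per cuisine label."""
    counts = {label: Counter(tokenize(text)) for label, text in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    vectors = {}
    for label, term_counts in counts.items():
        # Smoothed IDF: terms used by fewer cuisines get a higher weight.
        vectors[label] = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in term_counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Pairwise cuisine similarities as a nested dict."""
    vectors = tfidf_vectors(docs)
    labels = sorted(vectors)
    return {a: {b: cosine(vectors[a], vectors[b]) for b in labels} for a in labels}


sim = similarity_matrix(reviews_by_cuisine)
for cuisine, row in sim.items():
    print(cuisine, {other: round(score, 2) for other, score in row.items()})
```

In this toy data, Italian and Mexican share the terms "tomato" and "cheese" and so score above zero, while Italian and Japanese share no terms and score zero. A matrix like this could then feed a clustering or visualization step to draw the actual map.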




Course tags

Data mining, data mining project practice, Jiawei Han (韩家炜)
