Mining Massive Datasets

Start date: 04/22/2022 Duration: 7 weeks

Platform: CourseraArchive

Category: Computer Science

University or institution: Stanford University

Instructors: Jeff Ullman, Anand Rajaraman, Jure Leskovec

Course homepage: https://www.coursera.org/course/mmds

Course reviews: 1 review

Course details

We introduce the student to modern distributed file systems and MapReduce, including what distinguishes good MapReduce algorithms from good algorithms in general. The rest of the course is devoted to algorithms for extracting models and information from large datasets. Students will learn how Google's PageRank algorithm models the importance of Web pages, along with some of the many extensions that have been used for a variety of purposes. We'll cover locality-sensitive hashing, a bit of magic that allows you to find similar items in a set of items so large you cannot possibly compare each pair. When data is stored as a very large, sparse matrix, dimensionality reduction is often a good way to model the data, but standard approaches do not scale well; we'll talk about efficient approaches. Many other large-scale algorithms are covered as well, as outlined in the course syllabus.
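To make the PageRank idea mentioned above concrete, here is a minimal sketch of the standard power-iteration formulation on a tiny in-memory graph. It is not taken from the course materials: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the `toy_web` graph are all illustrative assumptions, and a real large-scale version would be distributed rather than run in a single Python process.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of out-links.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    # Collect every page that appears as a source or as a link target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every page first receives an equal "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dead end: spread its damped rank evenly over all pages.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B, and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph the page with the most in-links (C) ends up with the highest score, and the page nobody links to (D) keeps only its teleport share.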

Syllabus

Week 1
MapReduce
PageRank

Week 2
Locality-Sensitive Hashing
Nearest Neighbors
Decision Trees

Week 3
Frequent Itemsets
Analysis of large graphs

Week 4
Recommender systems
Data streams

Week 5
Distance measures
Dimensionality reduction
Clustering

Week 6
Support-vector machines
More about MapReduce

Week 7
More about PageRank
More about Locality-Sensitive Hashing
On-line algorithms

Course reviews (1)

skyline打酱油 2015-03-20 14:50 0 upvotes; 0 downvotes

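I am currently taking this course. Because it ran across the Spring Festival holiday and I had other commitments, I missed the deadlines for several quizzes, but I will finish the course and do the assignments properly; it is fine if I don't get the certificate. My impression is that the coverage is very broad, touching on clustering, recommender systems, information retrieval, data stream mining, and more, but most topics only introduce the principles and the problems and stop at the surface. It is best seen as an introductory course on big data. For a weak student like me, it is still quite useful.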

Course summary

This class teaches algorithms for extracting models and other information from very large amounts of data. The emphasis is on techniques that are efficient and that scale well.

Course tags

Big data, MapReduce, support vector machines, SVM, PageRank, recommender systems, decision trees

82 people follow this course
