Welcome to the Capstone Project for Big Data! In this culminating project, you will build a big data ecosystem using tools and methods from the earlier courses in this specialization. You will analyze a data set simulating big data generated by a large number of users playing our imaginary game, "Catch the Pink Flamingo". During the five-week Capstone Project, you will walk through the typical big data science steps: acquiring, exploring, preparing, analyzing, and reporting. In the first two weeks, we will introduce you to the data set and guide you through some exploratory analysis using tools such as Splunk and Open Office. Then we will move on to more challenging big data problems requiring the more advanced tools you have learned, including KNIME, Spark's MLlib, and Gephi. Finally, during the fifth and final week, we will show you how to bring it all together to create engaging and compelling reports and slide presentations. As a result of our collaboration with Splunk, a software company focused on analyzing machine-generated big data, learners with the top projects will be eligible to present to Splunk and meet Splunk recruiters and engineering leadership.
