Big Data Integration and Processing

Start date: 07/04/2020 Duration: Unknown

Platform: Coursera

Course category: Computer Science

University or institution: CourseraNew



At the end of the course, you will be able to:

* Retrieve data from example database and big data management systems
* Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
* Identify when a big data problem needs data integration
* Execute simple big data integration and processing on Hadoop and Spark platforms

This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and use a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.

Hardware Requirements: (A) Quad-core processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB free disk space.

How to find your hardware information: (Windows) Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac) Open Overview by clicking on the Apple menu and clicking "About This Mac." Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.

Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; VirtualBox 5+.
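For readers who already have Python installed, the hardware requirements above can be roughly checked with a short script. This is only a sketch and not part of the course materials: the function names (`total_ram_gb`, `check_requirements`, `meets_minimum`) and the 10% RAM tolerance are assumptions made for illustration. `os.cpu_count()` reports logical rather than physical cores, the RAM lookup is not available on Windows (where the script reports it as unknown), and VT-x/AMD-V support is not checked at all.

```python
import os
import platform
import shutil

# Thresholds taken from the hardware requirements listed above.
MIN_CORES = 4
MIN_RAM_GB = 8
MIN_FREE_DISK_GB = 20
# Assumption: the OS often reports slightly less than the installed RAM,
# so allow a 10% margin before flagging a machine as too small.
RAM_TOLERANCE = 0.9


def total_ram_gb():
    """Return total physical RAM in GB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows, and some names are unsupported elsewhere.
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1024**3


def check_requirements(path="."):
    """Collect the hardware figures relevant to the course requirements."""
    return {
        "cores": os.cpu_count() or 0,  # logical cores, not physical
        "is_64bit": platform.machine().endswith("64"),  # e.g. x86_64, AMD64, arm64
        "ram_gb": total_ram_gb(),
        "free_disk_gb": shutil.disk_usage(path).free / 1024**3,
    }


def meets_minimum(report):
    """Return True if the report satisfies the thresholds; unknown RAM is not failed."""
    ram = report["ram_gb"]
    ram_ok = ram is None or ram >= MIN_RAM_GB * RAM_TOLERANCE
    return (
        report["cores"] >= MIN_CORES
        and report["is_64bit"]
        and ram_ok
        and report["free_disk_gb"] >= MIN_FREE_DISK_GB
    )


if __name__ == "__main__":
    report = check_requirements(".")
    for name, value in report.items():
        print(f"{name}: {value}")
    print("Meets minimum requirements:", meets_minimum(report))
```

Run the script from the drive where you plan to store the virtual machine, since the free-disk figure is measured for the current directory.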



Welcome to the third course in the Big Data Specialization. This week you will be introduced to basic concepts in big data integration and processing. You will be guided through installing the Cloudera VM, downloading the data sets to be used for this course, and learning how to run the Jupyter server.





