Lecture 25 – Big Data

Presented by Anthony D. Joseph

Content by Anthony D. Joseph, Joseph Gonzalez, Josh Hug

A reminder – the right column of the table below contains Quick Checks. These are not required, but we suggest them as a way to check your understanding.

| Video | Quick Check |
| --- | --- |
| 25.1: An overview of big data, with several pertinent examples. Operational data stores and data warehouses. Extract, transform, load (ETL). | 25.1 |
| 25.2: The multidimensional data model. Fact tables and dimension tables. Star schemas and snowflake schemas. Online analytical processing (OLAP). | 25.2 |
| 25.3: Data warehouses and data lakes. | 25.3 |
| 25.4: Distributed file systems and fault tolerance. | 25.4 |
| 25.5: Distributed aggregation with MapReduce. The MapReduce abstraction. | 25.5 |
| 25.6: Hadoop and Spark. Resilient Distributed Datasets (RDDs). Modin. | 25.6 |