Lecture 25 – Big Data
Presented by Anthony D. Joseph
Content by Anthony D. Joseph, Joseph Gonzalez, Josh Hug
A reminder – the right column of the table below contains Quick Checks. These are not required, but we suggest working through them to check your understanding.
An overview of big data, with several pertinent examples. Operational data stores and data warehouses. Extract, transform, load (ETL).
The multidimensional data model. Fact tables and dimension tables. Star schemas and snowflake schemas. Online analytical processing (OLAP).
Data warehouses and data lakes.
Distributed file systems and fault tolerance.
Distributed aggregation with MapReduce. The MapReduce abstraction.
Hadoop and Spark. Resilient Distributed Datasets (RDDs). Modin.