Lecture 25 – Big Data
Presented by Anthony D. Joseph
Content by Anthony D. Joseph, Joseph Gonzalez, Josh Hug
A reminder – the right column of the table below contains Quick Checks. These are not required, but they are suggested as a way to check your understanding.
Video | Quick Check
---|---
25.1 An overview of big data, with several pertinent examples. Operational data stores and data warehouses. Extract, transform, load (ETL). | 25.1
25.2 The multidimensional data model. Fact tables and dimension tables. Star schemas and snowflake schemas. Online analytical processing (OLAP). | 25.2
25.3 Data warehouses and data lakes. | 25.3
25.4 Distributed file systems and fault tolerance. | 25.4
25.5 Distributed aggregation with MapReduce. The MapReduce abstraction. | 25.5
25.6 Hadoop and Spark. Resilient Distributed Datasets (RDDs). Modin. | 25.6