Lecture 24 – Big Data
Presented by Anthony D. Joseph
Content by Anthony D. Joseph, Joseph Gonzalez, Josh Hug
The Quick Check for this lecture is due Monday, December 7th at 11:59PM. Once you submit, a randomly chosen one of the following Google Forms will give you an alphanumeric code; enter this code into the “Lecture 24” question in the “Quick Check Codes” assignment on Gradescope to get credit for this Quick Check.
Video | Quick Check |
---|---|
24.1 An overview of big data, with several pertinent examples. Operational data stores and data warehouses. Extract, transform, load (ETL). | 24.1 |
24.2 The multidimensional data model. Fact tables and dimension tables. Star schemas and snowflake schemas. Online analytical processing (OLAP). | 24.2 |
24.3 Data warehouses and data lakes. | 24.3 |
24.4 Distributed file systems and fault tolerance. | 24.4 |
24.5 Distributed aggregation with MapReduce. The MapReduce abstraction. | 24.5 |
24.6 Hadoop and Spark. Resilient Distributed Datasets (RDDs). Modin. | 24.6 |