Data 100: Principles and Techniques of Data Science

UC Berkeley, Summer 2025

Josh Grossman (He/Him)

jdgg AT berkeley DOT edu

data100.instructors@berkeley.edu

Michael Xiao (He/Him)

michaelxiao1999 AT berkeley DOT edu

data100.instructors@berkeley.edu

Welcome to Week 7 of Data 100!

Lectures will be webcast at: https://berkeley.zoom.us/j/91772046015.

Schedule

Week 1

Mon June 23
Lecture 1 Introduction
Lecture Participation 1 Slido
Lab 1 Prerequisite Coding (due Thu 6/26)
Homework 1 Prerequisite Math and Coding (due Fri 6/27)
Tue June 24
Lecture 2 Pandas I
Lecture Participation 2 Slido
Wed June 25
Lecture 3 Pandas II
Lecture Participation 3 Slido
Discussion 1 Prerequisites
Solutions, Catch Up Session Notebook
Exam Prep 1 Pandas
Solutions
Thu June 26
Lecture 4 Pandas III
Lecture Participation 4 Slido
Lab 2A Pandas (due Mon 6/30)
Fri June 27
Homework 2A Food Safety (due Wed 7/2)

Week 2

Mon June 30
Lecture 5 Data Cleaning & EDA
Lecture Participation 5 Slido
Discussion 2 Pandas I
Solutions, Catch Up Session Notebook, Notebook, Groupwork Notebook
Exam Prep 2 Pandas and EDA
Solutions
Lab 2B Data Cleaning & EDA (due Thu 7/3)
Tue July 1
Lecture 6 Regex
Lecture Participation 6 Slido
Homework 2B Food Safety II (due Mon 7/7)
Wed July 2
Lecture 7 Visualization I
Lecture Participation 7 Slido
Discussion 3 Pandas II, EDA
Solutions
Exam Prep 3 Regex
Solutions
Thu July 3
Lecture 8 Visualization II
Lecture Participation 8 Slido
Lab 3 Regex, EDA (due Mon 7/7)
Fri July 4
Holiday
Homework 3 NYT Articles (due Wed 7/9)

Week 3

Mon July 7
Lecture 9 Sampling
Lecture Participation 9 Slido
Discussion 4 Regex, Visualization, and Transformation
Solutions, Notebook
Exam Prep 4 Data Visualization
Solutions
Lab 4 Transformations (due Thu 7/10)
Tue July 8
Lecture 10 Modeling, SLR
Lecture Participation 10 Slido
Homework 4 Bike Sharing (due Fri 7/11)
Wed July 9
Lecture 11 Constant Model, Loss, and Transformations
Lecture Participation 11 Slido
Discussion 5 Probability, Sampling, & Visualization
Solutions
Exam Prep 5 SLR
Solutions
Thu July 10
Lecture 12 Ordinary Least Squares
Lecture Participation 12 Slido
Lab 5 Modeling, Summary Statistics, Loss Functions (due Mon 7/14)
Fri July 11
Homework 5 Modeling and OLS (due Mon 7/21)

Week 4

Mon July 14
Lecture 13 Gradient Descent, sklearn
Lecture Participation 13 Slido
Discussion 6 Modeling and OLS
Solutions
Exam Prep 6 OLS, Gradient Descent
Solutions
Lab 6 OLS (due Mon 7/21)
Tue July 15
Lecture 14 Feature Engineering
Lecture Participation 14 Slido
Project A1 Housing I (due Wed 7/23)
Wed July 16
Lecture 15 HCE Case Study: CCAO
Lecture Participation 15 Slido
Discussion 7 Gradient Descent
Solutions
Thu July 17
Midterm Exam (see syllabus for midterm details)
Fri July 18
Midterm Exam

Week 5

Mon July 21
Lecture 16 Cross-Validation & Regularization
Lecture Participation 16 Slido
Discussion 8 Cross-Validation & Regularization
Solutions
Exam Prep 7 Cross-Validation and Regularization
Solutions
Lab 7 Gradient Descent and Sklearn (due Thu 7/24)
Tue July 22
Lecture 17 Random Variables
Lecture Participation 17 Slido
Project A2 Housing II (due Sun 7/27)
Wed July 23
Lecture 18 Estimators, Bias & Variance
Lecture Participation 18 Slido
Discussion 9 Random Variables, Bias, Variance
Solutions
Exam Prep 8 Probability and Bias-Variance
Solutions
Thu July 24
Lecture 19 Parameter Inference & The Bootstrap
Lecture Participation 19 Slido
Lab 8 Model Selection, Regularization, and Cross-Validation (due Mon 7/28)
Fri July 25
Homework 6 Probability and Written (due Wed 7/30)

Week 6

Mon July 28
Lecture 20 SQL I
Lecture Participation 20 Slido
Discussion 10 SQL
Notebook, Solutions
Exam Prep 9 SQL
Solutions
Lab 9 Random Variables and Inference & SQL (due Thu 7/31)
Tue July 29
Lecture 21 SQL II
Homework 7 SQL (due Fri 8/1)
Wed July 30
Lecture 22 Logistic Regression I
Lecture Participation 22 Slido
Discussion 11 Logistic Regression
Solutions
Exam Prep 10 Logistic Regression
Solutions
Thu July 31
Lecture 23 Logistic Regression II
Lecture Participation 23 Slido
Lab 10 Logistic Regression (due Mon 8/4)
Fri Aug 1
Project B1 Spam & Ham I (due Wed 8/6)

Week 7

Mon Aug 4
Lecture 24 Principal Component Analysis I
Lecture Participation 24 Slido
Discussion 12 ROC Curves and Performance Metrics
Solutions
Exam Prep 11 PCA and Clustering
Solutions
Lab 11 PCA (due Thu 8/7)
Tue Aug 5
Lecture 25 Principal Component Analysis II
Lecture Participation 25 Slido
Project B2 Spam & Ham II (due Mon 8/11)
Wed Aug 6
Lecture 26 Clustering
Lecture Participation 26 Slido
Discussion 13 Clustering
Thu Aug 7
Lecture 27 Closing
Lecture Participation 27 Slido
Lab 12 PCA & Clustering (due Mon 8/11)

Week 8

Mon Aug 11
Lecture 28 Review
Tue Aug 12
Lecture 29 Review
Wed Aug 13
Final (see syllabus for final details)
Thu Aug 14
Final
Fri Aug 15
Final