Data 100: Principles and Techniques of Data Science

UC Berkeley, Summer 2024

Kevin Miao

He/Him/His

kevinmiao@berkeley.edu

Office Hours: Mondays & Fridays from 11:00AM to 12:00PM in Soda 411

Maya Shen

She/Her/Hers

mayashen@berkeley.edu

Office Hours: Tuesdays & Thursdays from 11:00AM to 12:00PM in Soda 411

Schedule

Week 1

June 17
Lecture 1 Course Overview
Note 1
Lab 1 Prerequisite Coding, Plotting, and Permutation (due 6/20)
June 18
Lecture 2 Pandas I
Note 2
Homework 1A Plotting and Permutation Tests (due 6/20)
Homework 1B Prerequisite Math (due 6/20)
Discussion 1 Prerequisites (virtual walkthrough only)
Solutions
June 19
Juneteenth
June 20
Lecture 3 Pandas II
Note 3
Lab 2 Pandas (due 6/23)
June 21
Lecture 4 Pandas III
Note 4
Homework 2 Food Safety I (due 6/24)

Week 2

June 24
Lecture 5 Data Wrangling and EDA I
Note 5
Lab 3 Data Wrangling and EDA (due 6/26)
Discussion 2 Pandas
June 25
Lecture 6 Data Wrangling and EDA II, Regex
Note 6
Homework 3 Food Safety II (due 6/27)
June 26
Discussion 3 Regex and EDA
June 27
Lecture 7 Visualization I
Note 7
Lab 4 Regex and EDA (due 6/30)
June 28
Lecture 8 Visualization II
Note 8
Homework 4 Text Analysis of Bloomberg Articles (due 7/1)

Week 3

July 1
Lecture 9 Sampling
Note 9
Lab 5 Transformations (due 7/3)
Discussion 4 Visualization and Transformation
July 2
Lecture 10 Modeling and SLR
Note 10
Homework 5 Bike Sharing (due 7/4)
July 3
No Discussion
July 4
No Lecture
July 5
No Lecture
Homework 6 Sampling and Modeling (due 7/8)

Week 4

July 8
Lecture 11 Constant Model, Loss, and Transformations
Note 11
Lab 6 Modeling, Loss Functions, and Summary Statistics (due 7/10)
Discussion 5 Probability, Sampling, and Simple Linear Regression
July 9
Lecture 12 OLS (Multiple Regression)
Note 12
Homework 7 Regression (due 7/11)
July 10
Discussion 6 Constant Models, OLS, and Multiple Linear Regression
July 11
Lecture 13 Gradient Descent and sklearn
Note 13
Lab 7 Ordinary Least Squares (due 7/14)
July 12
Lecture 14 Feature Engineering
Note 14
Project A1 Housing I (due 7/15)

Week 5

July 15
Lecture 15 Cross-Validation and Regularization
Note 15
Lab 8 Gradient Descent and Feature Engineering (due 7/17)
Project A2 Housing II (due 7/18)
Discussion 7 Gradient Descent and Feature Engineering
July 16
Lecture 16 TBD
July 17
Discussion 8 Exam Review
July 18
Lecture 17 Case Study (HCE): CCAO
Note 17
Lab 9 Model Selection, Regularization, and Cross-Validation (due 7/21)
July 19
Midterm

Week 6

July 22
Lecture 18 Estimators, Bias, and Variance
Note 18
Lab 10 Probability (due 7/24)
Discussion 9 Cross-Validation and Regularization
July 23
Lecture 19 Parameter Inference and Bootstrapping
Note 19
Homework 8 Probability and Estimators (due 7/25)
July 24
Discussion 10 Random Variables, Bias, and Variance
July 25
Lecture 20 SQL
Note 20
Lab 11 SQL (due 7/28)
July 26
Lecture 21 Logistic Regression I
Note 21
Homework 9 SQL (due 7/29)

Week 7

July 29
Lecture 22 Logistic Regression II
Note 22
Lab 12 Logistic Regression (due 7/31)
Project B1 Spam and Ham I (due 8/1)
Discussion 11 SQL
July 30
Lecture 23 Ensembles
July 31
Discussion 12 Logistic Regression
August 1
Lecture 24 PCA
Note 24
Lab 13 PCA (due 8/4)
August 2
Lecture 25 Clustering
Note 25
Project B2 Spam and Ham II (due 8/5)
Homework 10 PCA and Clustering (due 8/5)

Week 8

August 5
Lecture 26 Conclusion
Lab 14 Clustering (due 8/7)
Discussion 13 PCA and Clustering
August 6
Lecture 27 Guest Lecture
August 7
Discussion 14 Final Review
August 8
Final Exam