Principles and Techniques of Data Science
UC Berkeley, Fall 2022
Fernando Perez
Will Fithian
- Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
- The Syllabus contains a detailed explanation of how each course component will work this fall. Please take the time to read it.
- Note: The schedule of lectures and assignments is subject to change.
Schedule
Week 1
- Aug 25
Lecture 1 Introduction
Quick Check 1 (due Aug 29)
- Aug 26
Lab 1 Prerequisite Coding (due Aug 30)
Homework 1 Prerequisite Math (due Sep 1)
Week 2
- Aug 30
Lecture 2 Pandas I
Textbook: Pandas Reference Table
Reference: Pandas API Documentation
Discussion 1 Prerequisite
- Sep 1
Lecture 3 Pandas II
Quick Check 2 (due Sep 6)
- Sep 2
Lab 2 Pandas (due Sep 7)
Homework 2 Food Safety (due Sep 9)
Week 3
- Sep 6
Lecture 4 Data Cleaning and EDA
Discussion 2 Pandas written questions, coding questions
written sol pdf, written sol notebook, coding sol pdf, coding sol notebook, recording
- Sep 8
Lecture 5 Regex
Quick Check 3 (due Sep 12)
- Sep 9
Exam prep 1 Pandas and Linear Algebra
Lab 3 Data Cleaning and EDA and Regex (due Sep 13)
Homework 3 Tweets (due Sep 15)
Week 4
- Sep 13
Lecture 6 Visualization I
Textbook: Seaborn Reference Table
Textbook: Matplotlib Reference Table
Discussion 3 EDA and Regex written questions, coding questions
written sol pdf, coding sol pdf, coding sol notebook, recording
- Sep 15
Lecture 7 Visualization II
Quick Check 4
- Sep 16
Exam prep 2 EDA and Regex
Lab 4 Transformation and KDEs (due Sep 20)
Homework 4 Bike Sharing (due Sep 22)
Week 5
- Sep 20
Lecture 8 Sampling and probability
Discussion 4 Visualization and Transformation written questions, coding questions
written sol pdf, coding sol pdf, coding sol notebook, recording
- Sep 22
Lecture 9 Modeling, SLR
Quick Check 5 (due Sep 26)
- Sep 23
Exam prep 3 Visualization
Lab 5 Modeling, Summary Statistics, and Loss Functions (due Sep 27)
Homework 5 Modeling (due Sep 29)
Week 6
- Sep 27
Lecture 10 Constant model, loss, and transformations
Discussion 5 Probability, Sampling, and SLR
- Sep 29
Lecture 11 OLS (multiple regression)
Quick Check 6 (due Oct 3)
- Sep 30
Exam prep 4 Probability & SLR
Lab 6 OLS (due Oct 4)
Homework 6 Regression (due Oct 6)
Week 7
- Oct 4
Lecture 12 Gradient descent / sklearn
Discussion 6 Models and OLS
- Oct 6
Lecture 13 Feature engineering
Quick Check 7 (due Oct 10)
- Oct 7
Exam prep 5 OLS
Lab 7 Gradient descent / sklearn (due Oct 11)
Project 1A Housing I (due Oct 13)
Week 8
- Oct 11
Discussion 7 Gradient Descent & Feature Engineering, Housing I
- Oct 13
Lecture 15 Cross-validation + Regularization
Quick Check 8 (due Oct 17)
- Oct 14
Lab 8 Model Selection, Regularization, and Cross-Validation (due Oct 18)
Project 1B Housing II (due Oct 27)
Week 9
- Oct 18
Lecture 16 Climate & Physical Data
Discussion 8 CV and Regularization
- Oct 19
Midterm Exam (7-9 PM)
- Oct 20
Lecture 17 Probability I
Quick Check 9 (due Oct 24)
- Oct 21
Lab 9 Climate (due Oct 25)
Week 10
- Oct 25
Lecture 18 Probability II
Discussion 9 Housing II and Probability I, CCAO factsheet
- Oct 27
Lecture 19 Causal Inference and Confounding
Quick Check 10 (due Oct 31)
- Oct 28
Lab 10 Probability & Modeling (due Nov 1)
Homework 7 Probability (due Nov 3)
Exam prep 6 Probability
Week 11
- Nov 1
Lecture 20 SQL I
Discussion 10 Bias and Variance
- Nov 3
Lecture 21 SQL II and Cloud Data
Quick Check 11 (due Nov 7)
- Nov 4
Lab 11 SQL (due Nov 8)
Homework 8 SQL (due Nov 10)
Exam prep 7 Bias and Variance
Week 12
- Nov 8
Lecture 22 PCA
Discussion 11 SQL
- Nov 10
Lecture 23 Environmental DS
Quick Check 12
- Nov 11
Lab 12 PCA (due Nov 15)
Homework 9 PCA (due Nov 17)
Exam prep 8 Pandas2SQL
Week 13
- Nov 15
Lecture 24 Logistic regression I
Discussion 12 PCA
- Nov 17
Lecture 25 Logistic regression II
Quick Check 13 Removed due to strike.
- Nov 18
Lab 13 Logistic regression
Project 2A Spam & Ham I
Week 14
- Nov 22
Lecture 26 Decision Trees
Discussion 13 Removed due to strike.
Project 2B Spam & Ham II
- Nov 24
No Lecture THANKSGIVING
Week 15
- Nov 29
Lecture 27 Clustering
Discussion 14 Removed due to strike.
Lab 15 Clustering
- Dec 1
Lecture 28 Closing Lecture: end of course logistics
Week 16
- Dec 5
RRR
- Dec 6
RRR
- Dec 7
RRR
- Dec 8
RRR
- Dec 9
RRR
Week 17
- Dec 13
Final Exam (3-6 PM)