# Principles and Techniques of Data Science

UC Berkeley, Fall 2022

### Fernando Perez

### Will Fithian

- Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
- The Syllabus contains a detailed explanation of how each course component will work this Fall; please take the time to read it.

**Note:** The schedule of lectures and assignments is subject to change.

## Schedule

### Week 1

- Aug 25
  - **Lecture 1** Introduction
  - **Quick Check 1** Quick Check 1 (due Aug 29)
- Aug 26
  - **Lab 1** Prerequisite Coding (due Aug 30)
  - **Homework 1** Prerequisite Math (due Sep 1)

### Week 2

- Aug 30
  - **Lecture 2** Pandas I
    - Textbook: Pandas Reference Table
    - Reference: Pandas API Documentation
  - **Discussion 1** Prerequisite
- Sep 1
  - **Lecture 3** Pandas II
  - **Quick Check 2** Quick Check 2 (due Sep 6)
- Sep 2
  - **Lab 2** Pandas (due Sep 7)
  - **Homework 2** Food Safety (due Sep 9)

### Week 3

- Sep 6
  - **Lecture 4** Data Cleaning and EDA
  - **Discussion 2** Pandas
    - written questions, coding questions
    - written sol pdf, written sol notebook, coding sol pdf, coding sol notebook, recording
- Sep 8
  - **Lecture 5** Regex
  - **Quick Check 3** Quick Check 3 (due Sep 12)
- Sep 9
  - **Exam prep 1** Pandas and Linear Algebra
  - **Lab 3** Data Cleaning and EDA and Regex (due Sep 13)
  - **Homework 3** Tweets (due Sep 15)

### Week 4

- Sep 13
  - **Lecture 6** Visualization I
    - Textbook: Seaborn Reference Table
    - Textbook: Matplotlib Reference Table
  - **Discussion 3** EDA and Regex
    - written questions, coding questions
    - written sol pdf, coding sol pdf, coding sol notebook, recording
- Sep 15
  - **Lecture 7** Visualization II
  - **Quick Check 4** Quick Check 4
- Sep 16
  - **Exam prep 2** EDA and Regex
  - **Lab 4** Transformation and KDEs (due Sep 20)
  - **Homework 4** Bike Sharing (due Sep 22)

### Week 5

- Sep 20
  - **Lecture 8** Sampling and probability
  - **Discussion 4** Visualization and Transformation
    - written questions, coding questions
    - written sol pdf, coding sol pdf, coding sol notebook, recording
- Sep 22
  - **Lecture 9** Modeling, SLR
  - **Quick Check 5** Quick Check 5 (due Sep 26)
- Sep 23
  - **Exam prep 3** Visualization
  - **Lab 5** Modeling, Summary Statistics, and Loss Functions (due Sep 27)
  - **Homework 5** Modeling (due Sep 29)

### Week 6

- Sep 27
  - **Lecture 10** Constant model, loss, and transformations
  - **Discussion 5** Probability, Sampling, and SLR
- Sep 29
  - **Lecture 11** OLS (multiple regression)
  - **Quick Check 6** Quick Check 6 (due Oct 3)
- Sep 30
  - **Exam prep 4** Probability & SLR
  - **Lab 6** OLS (due Oct 4)
  - **Homework 6** Regression (due Oct 6)

### Week 7

- Oct 4
  - **Lecture 12** Gradient descent / sklearn
  - **Discussion 6** Models and OLS
- Oct 6
  - **Lecture 13** Feature engineering
  - **Quick Check 7** Quick Check 7 (due Oct 10)
- Oct 7
  - **Exam prep 5** OLS
  - **Lab 7** Gradient descent / sklearn (due Oct 11)
  - **Project 1A** Housing I (due Oct 13)

### Week 8

- Oct 11
  - **Discussion 7** Gradient Descent & Feature Engineering, Housing I
- Oct 13
  - **Lecture 15** Cross-validation + Regularization
  - **Quick Check 8** Quick Check 8 (due Oct 17)
- Oct 14
  - **Lab 8** Model Selection, Regularization, and Cross-Validation (due Oct 18)
  - **Project 1B** Housing II (due Oct 27)

### Week 9

- Oct 18
  - **Lecture 16** Climate & Physical Data
  - **Discussion 8** CV and Regularization
- Oct 19
  - **Midterm** Midterm Exam (7-9 PM)
- Oct 20
  - **Lecture 17** Probability I
  - **Quick Check 9** Quick Check 9 (due Oct 24)
- Oct 21
  - **Lab 9** Climate (due Oct 25)

### Week 10

- Oct 25
  - **Lecture 18** Probability II
  - **Discussion 9** Housing II and Probability I, CCAO factsheet
- Oct 27
  - **Lecture 19** Causal Inference and Confounding
  - **Quick Check 10** Quick Check 10 (due Oct 31)
- Oct 28
  - **Lab 10** Probability & Modeling (due Nov 1)
  - **Homework 7** Probability (due Nov 3)
  - **Exam prep 6** Probability

### Week 11

- Nov 1
  - **Lecture 20** SQL I
  - **Discussion 10** Bias and Variance
- Nov 3
  - **Lecture 21** SQL II and Cloud Data
  - **Quick Check 11** Quick Check 11 (due Nov 7)
- Nov 4
  - **Lab 11** SQL (due Nov 8)
  - **Homework 8** SQL (due Nov 10)
  - **Exam prep 7** Bias and Variance

### Week 12

- Nov 8
  - **Lecture 22** PCA
  - **Discussion 11** SQL
- Nov 10
  - **Lecture 23** Environmental DS
  - **Quick Check 12** Quick Check 12
- Nov 11
  - **Lab 12** PCA (due Nov 15)
  - **Homework 9** PCA (due Nov 17)
  - **Exam prep 8** Pandas2SQL

### Week 13

- Nov 15
  - **Lecture 24** Logistic regression I
  - **Discussion 12** PCA
- Nov 17
  - **Lecture 25** Logistic regression II
  - **Quick Check 13** ~~Quick Check 13~~ (removed due to strike)
- Nov 18
  - **Lab 13** Logistic regression
  - **Project 2A** Spam & Ham I

### Week 14

- Nov 22
  - **Lecture 26** Decision Trees
  - **Discussion 13** Removed due to strike.
  - **Project 2B** Spam & Ham II
- Nov 24
  - **No Lecture** Thanksgiving

### Week 15

- Nov 29
  - **Lecture 27** Clustering
  - **Discussion 14** Removed due to strike.
  - **Lab 15** Clustering
- Dec 1
  - **Lecture 28** Closing Lecture: end-of-course logistics

### Week 16

- Dec 5

### Week 17

- Dec 13
  - **Final** Final Exam (3-6 PM)