# Principles and Techniques of Data Science

UC Berkeley, Spring 2023

### Lisa Yan

### Narges Norouzi

- Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
- The Syllabus contains a detailed explanation of how each course component will work this spring. Please take the time to read it.

**Note:** The schedule of lectures and assignments is subject to change.

## Schedule

### Week 1

- Jan 17
  - **Lecture 1** Introduction
  - **Lecture Participation 1**
  - **Survey** Pre-Semester Survey (due Jan 20)
- Jan 19
  - **Lecture 2** Pandas I
  - **Lecture Participation 2**
- Jan 20
  - **Lab 1** Prerequisite Refresher (due Jan 24)
  - **Homework 1A** Plotting and the Permutation Test (due Jan 26)
  - **Homework 1B** Prerequisite Math (due Jan 26)

### Week 2

- Jan 24
  - **Lecture 3** Pandas II
  - **Discussion 1** Prerequisites
  - **Lecture Participation 3**
- Jan 26
  - **Lecture Participation 4**
- Jan 27
  - **Lab 2** Pandas (due Jan 31)
  - **Homework 2** Food Safety (due Feb 2)

### Week 3

- Jan 31
  - **Lecture 5** Data Cleaning and EDA, Part 2
  - **Discussion 2** Pandas: worksheet, worksheet notebook, groupwork notebook
  - **Lecture Participation 5**
- Feb 2
  - **Lecture 6** Text Wrangling and Regex
  - **Lecture Participation 6**
- Feb 3
  - **Exam prep 1** Pandas
  - **Lab 3** Data Cleaning, EDA, Regex (due Feb 7)
  - **Homework 3** Tweets (due Feb 9)

### Week 4

- Feb 7
  - **Lecture 7** Visualization I
  - **Discussion 3** EDA and Regex: worksheet, worksheet notebook
  - **Lecture Participation 7**
- Feb 9
  - **Lecture 8** Probability & Visualization II
  - **Lecture Participation 8**
- Feb 10
  - **Exam prep 2** Data Cleaning, EDA and Regex
  - **Lin Alg Review** Linear Algebra Review #1
  - **Lab 4** Transformations
  - **Homework 4** Bike Sharing

### Week 5

- Feb 14
  - **Lecture 9** Sampling and probability II
  - **Discussion 4** Visualization and Transformation
  - **Lecture Participation 9**
- Feb 16
  - **Lecture 10** Modeling, SLR
  - **Lecture Participation 10**
- Feb 17
  - **Lab 5** Modeling, Summary Statistics, and Loss Functions
  - **Homework 5** Modeling

### Week 6

- Feb 21
  - **Lecture 11** Constant model, loss, and transformations
  - **Discussion 5** Probability, Sampling, and SLR
  - **Lecture Participation 11**
- Feb 23
  - **Lecture 12** OLS (multiple regression)
  - **Lecture Participation 12**
- Feb 24
  - **Lab 6** OLS
  - **Homework 6** Regression

### Week 7

- Feb 28
  - **Lecture 13** Gradient descent / sklearn
  - **Discussion 6** Models and OLS
  - **Lecture Participation 13**
- Mar 2
  - **Lecture 14** Feature Engineering
  - **Lecture Participation 14**
- Mar 3
  - **Lab 7** Gradient descent / sklearn

### Week 8

- Mar 7
  - **Lecture 15** Cross-Validation / Regularization
  - **Discussion 7** Gradient Descent, Feature Engineering, Housing I
  - **Lecture Participation 15**
- Mar 9
  - **Lecture 16** Probability I
  - **Lecture Participation 16**
  - **Midterm** Midterm Exam (7-9 PM)
- Mar 10
  - **Lab 8** Model Selection
  - **Project 1A** Housing

### Week 9

- Mar 14
  - **Lecture 17** Probability II
  - **Discussion 8** Cross-Validation and Regularization
  - **Lecture Participation 17**
- Mar 16
  - **Lecture 18** Case study (HCE): CCAO
  - **Lecture Participation 18**
- Mar 17
  - **Lab 9** Climate
  - **Project 1B** Housing

### Week 10

- Mar 21
  - **Lecture 19** Case Study: Climate & Physical Data
  - **Discussion 9** Housing II and Probability I
  - **Lecture Participation 19**
- Mar 23
  - **Lecture 20** Causal Inference and Confounding
  - **Lecture Participation 20**
- Mar 24
  - **Lab 10** Probability & Inference
  - **Homework 7** Probability

### Spring Break

- Mar 28
  - Spring Break
- Mar 30
  - Spring Break

### Week 11

- Apr 4
  - **Lecture 21** SQL I
  - **Discussion 10** Bias and Variance
  - **Lecture Participation 21**
- Apr 6
  - **Lecture 22** SQL II / Cloud Data
  - **Lecture Participation 22**
- Apr 7
  - **Lab 11** SQL
  - **Homework 8** SQL

### Week 12

- Apr 11
  - **Lecture 23** Logistic regression I
  - **Discussion 11** SQL
  - **Lecture Participation 23**
- Apr 13
  - **Lecture 24** Logistic regression II
  - **Lecture Participation 24**
- Apr 14
  - **Lab 12** Logistic regression
  - **Project 2A** Spam & Ham I

### Week 13

- Apr 18
  - **Lecture 25** PCA I
  - **Discussion 12** Logistic Regression
  - **Lecture Participation 25**
- Apr 20
  - **Lecture 26** PCA II
  - **Lecture Participation 26**
- Apr 21
  - **Lab 12** PCA
  - **Project 2B** Spam & Ham II

### Week 14

- Apr 25
  - **Lecture 27** KMeans Clustering
  - **Discussion 13** PCA
  - **Lecture Participation 27**
- Apr 27
  - **Lecture 28** Guest + Closing
  - **Lecture Participation 28**
- Apr 28
  - **Lab 14** Clustering

### Week 16

- May 1
  - RRR
  - **Lab 15** Decision Trees (Optional, no due date)
- May 2
  - RRR
- May 3
  - RRR
- May 4
  - RRR
- May 5
  - RRR

### Week 17

- May 11
  - **Final** Final Exam (8-11 AM)