# Principles and Techniques of Data Science

UC Berkeley, Summer 2021

The course is being taught entirely online.
**Note:** The schedule of lectures and assignments is subject to change.

### Week 1

- Jun 21
  - **Lecture 1** Introduction, Course Overview
  - **Homework 1** Prerequisites (due Jun 24)
  - **Lab 1** Prerequisite Coding (due Jun 26)
  - **Lab 2** Pandas (due Jun 26)
- Jun 22
  - **Lecture 2** Data Sampling and Probability
- Jun 23
  - **Lecture 3** Estimators and Bias
  - **Discussion 1** Prerequisites, Probability (solutions)
- Jun 24
  - **Lecture 4** Pandas I
  - **Homework 2** Trump Sampling (due Jun 29)
- Jun 25
  - **Live Session 1** (recording)

### Week 2

- Jun 28
  - **Lecture 5** Pandas II
  - **Homework 3** Food Safety (due Jul 5)
  - **Lab 3** Data Cleaning and EDA (due Jul 3)
  - **Lab 4** SQL (due Jul 3)
  - **Discussion 2** Probability, Pandas (videos) (solutions)
- Jun 29
  - **Lecture 6** Data Cleaning and EDA
- Jun 30
  - **Lecture 7** Regular Expressions
  - **Discussion 3** Pandas (code) (solutions)
- Jul 1
  - **Lecture 8** SQL
- Jul 2
  - **Live Session 2** (recording)

### Week 3

- Jul 5
  - Independence Day
  - **Homework 4** Tweets (due Jul 8)
  - **Lab 5** Transformations and KDEs (due Jul 10)
  - **Lab 6** Modeling, Summary Statistics, and Loss Functions (due Jul 10)
  - **Discussion 4** SQL, Regex (videos) (solutions)
- Jul 6
  - **Lecture 9** Visualization I
- Jul 7
  - **Lecture 10** Visualization II
  - **Discussion 5** Visualization (solutions)
- Jul 8
  - **Lecture 11** Modeling
  - **Homework 5** Bike Sharing (due Jul 12)
- Jul 9
  - **Live Session 3** (recording) (code)

### Week 4

- Jul 12
  - **Lecture 12** Simple Linear Regression
  - **Homework 6** Regression (due Jul 19)
  - **Lab 7** Simple Linear Regression (due Jul 17)
  - **Discussion 6** Modeling and Simple Linear Regression (solutions)
- Jul 13
  - **Lecture 13** Ordinary Least Squares + Geometric Interpretation
- Jul 14
  - **Lecture 14** Feature Engineering
  - **Discussion 7** Midterm Review
- Jul 15
  - **Midterm** Midterm Exam (9:30 AM - 11:00 AM) (blank) (solutions) (video)
- Jul 16
  - **Live Session 4** (recording)

### Week 5

- Jul 19
  - **Lecture 15** Modeling in Context: Fairness in Housing Appraisal
  - **Homework 7** Housing I (due Jul 22)
  - **Lab 8** Multiple Linear Regression and Feature Engineering (due Jul 24)
  - **Lab 9** Feature Engineering and Cross-Validation (due Jul 24)
  - **Discussion 8** Multiple Linear Regression (solutions)
- Jul 20
  - **Lecture 16** Bias and Variance
- Jul 21
  - **Lecture 17** Cross-Validation + Regularization
  - **Discussion 9** Bias-Variance, HCE (solutions)
- Jul 22
  - **Lecture 18** Gradient Descent
  - **Homework 8** Housing II (due Jul 26)
- Jul 23
  - **Live Session 5** (video pt. 1) (code)

### Week 6

- Jul 26
  - **Lecture 19** Logistic Regression I
  - **Homework 9** Gradient Descent (due Jul 29)
  - **Lab 10** Logistic Regression (due Jul 31)
  - **Lab 11** Decision Trees (due Jul 31)
  - **Discussion 10** Regularization, Cross-Validation, Gradient Descent (solutions)
- Jul 27
  - **Lecture 20** Logistic Regression II and Classification
- Jul 28
  - **Lecture 21** Decision Trees
  - **Discussion 11** Logistic Regression, Classification (solutions)
- Jul 29
  - **Lecture 22** Inference for Modeling
  - **Homework 10** Spam/Ham I (due Aug 2)
- Jul 30

### Week 7

- Aug 2
  - **Lecture 23** Principal Component Analysis
  - **Homework 11** Spam/Ham II (due Aug 5)
  - **Lab 12** Principal Component Analysis (due Aug 7)
  - **Lab 13** Clustering (due Aug 7)
  - **Discussion 12** Decision Trees, Inference (solutions)
- Aug 3
  - **Lecture 24** Clustering I
- Aug 4
  - **Lecture 25** Clustering II
  - **Discussion 13** Principal Component Analysis, Clustering (solutions)
- Aug 5
  - **Homework 12** Principal Component Analysis (due Aug 9)
- Aug 6
  - **Live Session 8** NA