Principles and Techniques of Data Science

UC Berkeley, Fall 2022

Lecture Zoom | Discussion Sign-Up | Office Hour Queue


  • Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
  • The Syllabus explains in detail how each course component will work this fall. Please take the time to read it.
  • Note: The schedule of lectures and assignments is subject to change.

Schedule

Week 1

Aug 25

Lecture 1 Introduction

Quick Check 1 (due Aug 29)

Aug 26

Lab 1 Prerequisite Coding (due Aug 30)

walkthrough, solution

Homework 1 Prerequisite Math (due Sep 1)

Week 2

Aug 30

Lecture 2 Pandas I

Ch. 6.1, 6.5

Textbook: Pandas Reference Table

Reference: Pandas API Documentation

Discussion 1 Prerequisite

Solution, Recording

Sep 1

Lecture 3 Pandas II

Ch. 6.2-6.4

Quick Check 2 (due Sep 6)

Sep 2

Lab 2 Pandas (due Sep 7)

Homework 2 Food Safety (due Sep 9)

Week 3

Sep 6

Lecture 4 Data Cleaning and EDA

Ch. 8-9

Discussion 2 Pandas written questions, coding questions

written sol pdf, written sol notebook, coding sol pdf, coding sol notebook, recording

Sep 8

Lecture 5 Regex

Ch. 13

Quick Check 3 (due Sep 12)

Sep 9

Exam prep 1 Pandas and Linear Algebra

Solution

Lab 3 Data Cleaning and EDA and Regex (due Sep 13)

Homework 3 Tweets (due Sep 15)

Week 4

Sep 13

Lecture 6 Visualization I

Ch. 11.1-11.3

Textbook: Seaborn Reference Table

Textbook: Matplotlib Reference Table

Discussion 3 EDA and Regex written questions, coding questions

written sol pdf, coding sol pdf, coding sol notebook, recording

Sep 15

Lecture 7 Visualization II

Ch. 11.4-11.6

Quick Check 4

Sep 16

Exam prep 2 EDA and Regex

Solution

Lab 4 Transformation and KDEs (due Sep 20)

Homework 4 Bike Sharing (due Sep 22)

Week 5

Sep 20

Lecture 8 Sampling and probability

Ch. 1, 2, 3.1

Discussion 4 Visualization and Transformation written questions, coding questions

written sol pdf, coding sol pdf, coding sol notebook, recording

Sep 22

Lecture 9 Modeling, SLR

Ch. 15.1-15.2

Quick Check 5 (due Sep 26)

Sep 23

Exam prep 3 Visualization

Solution

Lab 5 Modeling, Summary Statistics, and Loss Functions (due Sep 27)

Homework 5 Modeling (due Sep 29)

Week 6

Sep 27

Lecture 10 Constant model, loss, and transformations

Ch. 4

Discussion 5 Probability, Sampling, and SLR

Solution, Recording

Sep 29

Lecture 11 OLS (multiple regression)

Ch. 15.3-15.4

Quick Check 6 (due Oct 3)

Sep 30

Exam prep 4 Probability & SLR

Solution

Lab 6 OLS (due Oct 4)

Homework 6 Regression (due Oct 6)

Week 7

Oct 4

Lecture 12 Gradient descent / sklearn

Ch. 20

Discussion 6 Models and OLS

Solution, Recording

Oct 6

Lecture 13 Feature engineering

Ch. 15.5

Quick Check 7 (due Oct 10)

Oct 7

Exam prep 5 OLS

Solution

Lab 7 Gradient descent / sklearn (due Oct 11)

Project 1A Housing I (due Oct 13)

Week 9

Oct 18

Lecture 16 Climate & Physical Data

Discussion 8 CV and Regularization

Solution, Recording

Oct 19

Midterm Exam (7-9 PM)

Oct 20

Lecture 17 Probability I

Ch. 3.2-3.5, 16.3

Quick Check 9 (due Oct 24)

Oct 21

Lab 9 Climate (due Oct 25)

Week 10

Oct 25

Lecture 18 Probability II

Ch. 16.1, 16.4, 19.2

Discussion 9 Housing II and Probability I, CCAO factsheet

Solution, Recording

Oct 27

Lecture 19 Causal Inference and Confounding

Quick Check 10 (due Oct 31)

Oct 28

Lab 10 Probability & Modeling (due Nov 1)

Homework 7 Probability (due Nov 3)

Exam prep 6 Probability

Solution

Week 11

Nov 1

Lecture 20 SQL I

Ch. 7.1-7.2, 7.5

Discussion 10 Bias and Variance

Solution, Recording

Nov 3

Lecture 21 SQL II and Cloud Data

Ch. 7.3-7.4

Quick Check 11 (due Nov 7)

Nov 4

Lab 11 SQL (due Nov 8)

Homework 8 SQL (due Nov 10)

Exam prep 7 Bias and Variance

Solution

Week 12

Nov 8

Lecture 22 PCA

Ch. 22

Discussion 11 SQL

Solution, Recording

Nov 10

Lecture 23 Environmental DS

Quick Check 12

Nov 11

Lab 12 PCA (due Nov 15)

Homework 9 PCA (due Nov 17)

Exam prep 8 Pandas2SQL

Solution

Week 13

Nov 15

Lecture 24 Logistic regression I

Ch. 19.1-19.3

Discussion 12 PCA

Nov 17

Lecture 25 Logistic regression II

Ch. 19.4-19.8

Quick Check 13

Nov 18

Lab 13 Logistic regression

Project 2A Spam & Ham I

Week 14

Nov 22

Lecture 26 Decision Trees

Ch. 23

Discussion 13 TBD

Lab 14 Decision Trees & Random Forests

Project 2B Spam & Ham II

Nov 24

No Lecture (Thanksgiving)

Week 15

Nov 29

Lecture 27 Clustering

Ch. 24

Discussion 14 TBD

Lab 15 Clustering

Dec 1

Lecture 28 Closing Lecture

Week 16

Dec 5

RRR

Dec 6

RRR

Dec 7

RRR

Dec 8

RRR

Dec 9

RRR

Week 17

Dec 13

Final Exam (3-6 PM)