Principles and Techniques of Data Science

UC Berkeley, Summer 2022

Anirudhan Badrinath

abadrinath@berkeley.edu

Dominic Liu

he/him

hangxingliu@berkeley.edu

  • Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
  • The Syllabus contains a detailed explanation of how each course component will work this summer, given that the course is being taught entirely online.
  • Textbook readings are optional, and the textbook is actively in development. See the Resources page for more details.
  • Note: The schedule of lectures and assignments is subject to change.

Schedule

Week 1

Jun 21

Lecture 1 Course Overview, Sampling and Probability

Ch. 1, 2, 3.1

Lab 1 Prerequisite Coding (due Jun 27)

Lab 2 Pandas (due Jun 27)

Jun 22

Lecture 2 Pandas I

Ch. 6.1, 6.5

Textbook: Pandas Reference Table

Reference: Pandas API Documentation

Jun 23

Lecture 3 Pandas II

Ch. 6.2-6.4

Discussion 0 (Optional) Fundamentals

Solution, Recording

Discussion 1 Sampling and Probability, Pandas, code

Solution, Recording

Homework 1 Intro + Prerequisites (due Jun 27)

Jun 24

Exam Prep 1 Sampling and Probability, Pandas

Solution, Recording

Week 2

Jun 27

Lecture 4 Data Cleaning, EDA

Ch. 8-9

Weekly Check 2

Lab 3 Data Cleaning and EDA (due Jul 2)

Lab 4 Transformations and KDE (due Jul 2)

Homework 2 Food Safety (due Jun 30)

Jun 28

Lecture 5 Regex

Ch. 12

Discussion 2 Pandas, Data Cleaning

Solution, Recording

Jun 29

Lecture 6 Visualization I

Ch. 11.1-11.3

Textbook: Seaborn Reference Table

Textbook: Matplotlib Reference Table

Jun 30

Lecture 7 Visualization II

Ch. 11.4-11.6

Discussion 3 Regex, Visualization

Solution, Recording

Homework 3 Tweets (due Jul 5)

Jul 1

Exam Prep 2 Pandas, Visualization, Regex

Solution, Recording

Catch-up section 1

Recording

Week 3

Jul 4

Independence Day

Weekly Check 3

Lab 5 Modeling, Loss Functions, and Summary Statistics (due Jul 9)

Lab 6 Linear Regression (due Jul 9)

Homework 4 Bike Sharing (due Jul 7)

Jul 5

Lecture 8 Intro to Modeling, Simple Linear Regression

Ch. 15

Discussion 4 Modeling and Simple Linear Regression

Solution, Recording

Jul 6

Lecture 9 Constant Model, Loss, and Transformations

Ch. 4

Jul 7

Lecture 10 Ordinary Least Squares (Multiple Linear Regression)

Ch. 15.4, 19.1

Discussion 5 Ordinary Least Squares

Solution, Recording

Homework 5 Regression (on paper) (due Jul 11)

Jul 8

Exam Prep 3 TBD

Solution, Recording

Catch-up section 2 TBD

Recording

Week 4

Jul 11

Lecture 11 Gradient Descent, sklearn

Ch. 17

Lab 7 Gradient Descent and sklearn (due Jul 16)

Lab 8 Cross-Validation and Regularization (due Jul 16)

Proj 1A Housing I (due Jul 14)

Jul 12

Lecture 12 Feature Engineering

Ch. 20

Discussion 6 HCE, Gradient Descent

Jul 13

Lecture 13 Cross-Validation and Regularization

Ch. 22, 21.3

Jul 14

Lecture 14 Case Study (HCE): Fairness in Housing Appraisal

Discussion 7 HCE, Regularization, and Cross-Validation

Proj 1B Housing II (due Jul 25)

Week 5

Jul 18

Midterm Exam

Lab 9 Probability and Modeling (due Jul 23)

Jul 19

Break (No Lecture)

Jul 20

Lecture 15 Probability I: Random Variables

Ch. 3.2-3.5, 16.3

Jul 21

Lecture 16 Probability II: Estimators, Bias, and Variance

Ch. 16.1, 16.4, 19.2

Discussion 8 Probability

Week 6

Jul 25

Lecture 17 SQL I

Ch. 7.1-7.2, 7.5

Lab 10 SQL (due Jul 30)

Lab 11 PCA (due Jul 30)

Homework 6 Probability and Estimators (due Jul 28)

Jul 26

Lecture 18 SQL II

Ch. 7.3-7.4

Discussion 9 SQL

Jul 27

Lecture 19 PCA

Ch. 26

Jul 28

Break (No Lecture)

Discussion 10 PCA

Homework 7 SQL and PCA (due Aug 1)

Week 7

Aug 1

Lecture 20 Classification and Logistic Regression I

Ch. 24.1-24.3

Lab 12 Logistic Regression (due Aug 6)

Lab 13 Decision Trees & Random Forests (due Aug 6)

Proj 2A Spam I (due Aug 4)

Aug 2

Lecture 21 Logistic Regression II

Ch. 24.4-24.8

Discussion 11 Logistic Regression

Aug 3

Lecture 22 Decision Trees

Ch. 27

Aug 4

Lecture 23 Clustering

Ch. 28

Discussion 12 Decision Trees, Clustering

Proj 2B Spam II (due Aug 8)

Week 8

Aug 8

Review

Aug 9

Review

Aug 10

Review

Aug 11

Review

Aug 12

Final Exam