⚠️ This content is archived as of March 2026 and is retained exclusively for reference.

Principles and Techniques of Data Science

UC Berkeley, Summer 2022

Anirudhan Badrinath

abadrinath@berkeley.edu

Dominic Liu

he/him

hangxingliu@berkeley.edu

  • Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
  • The Syllabus contains a detailed explanation of how each course component will work this summer, given that the course is being taught entirely online.
  • Textbook readings are optional, and the textbook is actively in development. See the Resources page for more details.
  • Note: The schedule of lectures and assignments is subject to change.

Schedule

Week 1

Jun 21

Lecture 1 Course Overview, Sampling and Probability

Ch. 1, 2, 3.1

Lab 1 Prerequisite Coding (due Jun 27)

Lab 2 Pandas (due Jun 27)

Jun 22

Lecture 2 Pandas I

Ch. 6.1, 6.5

Textbook: Pandas Reference Table

Reference: Pandas API Documentation

Jun 23

Lecture 3 Pandas II

Ch. 6.2-6.4

Discussion 0 (Optional) Fundamentals

Solution, Recording

Discussion 1 Sampling and Probability, Pandas, code

Solution, Recording

Homework 1 Intro + Prerequisites (due Jun 27)

Jun 24

Exam Prep 1 Sampling and Probability, Pandas

Solution, Recording

Week 2

Jun 27

Lecture 4 Data Cleaning, EDA

Ch. 8-9

Weekly Check 2

Lab 3 Data Cleaning and EDA (due Jul 2)

Lab 4 Transformations and KDE (due Jul 2)

Homework 2 Food Safety (due Jun 30)

Jun 28

Lecture 5 Regex

Ch. 13

Discussion 2 Pandas, Data Cleaning

Solution, Recording

Jun 29

Lecture 6 Visualization I

Ch. 11.1-11.3

Textbook: Seaborn Reference Table

Textbook: Matplotlib Reference Table

Jun 30

Lecture 7 Visualization II

Ch. 11.4-11.6

Discussion 3 Regex, Visualization

Solution, Recording

Homework 3 Tweets (due Jul 5)

Jul 1

Exam Prep 2 Pandas, Visualization, Regex

Solution, Recording

Catch-up section 1

Recording

Week 3

Jul 4

Independence Day

Weekly Check 3

Lab 5 Modeling, Loss Functions, and Summary Statistics (due Jul 9)

Lab 6 Linear Regression (due Jul 9)

Homework 4 Bike Sharing (due Jul 7)

Jul 5

Lecture 8 Intro to Modeling, Simple Linear Regression

Ch. 15.1-15.2

Discussion 4 Modeling and Visualization

Solution, Recording

Jul 6

Lecture 9 Constant Model, Loss, and Transformations

Ch. 4

Jul 7

Lecture 10 Ordinary Least Squares (Multiple Linear Regression)

Ch. 15.3-15.4

Discussion 5 Linear Model and Loss Function

Solution, Recording

Homework 5 Regression (On paper) (due Jul 11)

Jul 8

Exam Prep 3 SLR & OLS

Solution, Recording

Catch-up section 2

Recording

Week 5

Jul 18

Midterm Exam

Reference Sheet

Weekly Check 5

Lab 9 Probability and Modeling (due Jul 23)

Jul 19

Break (No Lecture)

Jul 20

Lecture 15 Probability I: Random Variables

Ch. 3.2-3.5, 16.3

Jul 21

Lecture 16 Probability II: Estimators, Bias, and Variance

Ch. 16.1, 16.4, 19.2

Discussion 8 Probability and Bias-Variance Trade-off

Solution, Recording

Jul 22

Exam Prep 5 CV, Probability, BVT

Solution, no recording

Catch-up section 4 TBD

Recording

Week 6

Jul 25

Lecture 17 SQL I

Ch. 7.1-7.2, 7.5

Weekly Check 6

Lab 10 SQL (due Jul 30)

Lab 11 PCA (due Jul 30)

Homework 6 Probability and Estimators: coding, written PDF, written LaTeX (due Jul 28)

Jul 26

Lecture 18 SQL II and PCA I

Ch. 7.3-7.4

Discussion 9 BVT & SQL

Solution, Recording

Jul 27

Lecture 19 PCA II

Ch. 22

Jul 28

Break (No Lecture)

Discussion 10 SQL & PCA

Solution, Recording

Homework 7 SQL and PCA (due Aug 1)

Jul 29

Exam Prep 6 SQL & PCA

Solution, Recording

Catch-up section 5

Recording

Week 7

Aug 1

Lecture 20 Classification and Logistic Regression I

Ch. 19.1-19.3

Weekly Check 7

Lab 12 Logistic Regression (due Aug 6)

Lab 13 Decision Trees & Random Forests (due Aug 6)

Proj 2A Spam I (due Aug 4)

Aug 2

Lecture 21 Logistic Regression II

Ch. 19.4-19.8

Discussion 11 Logistic Regression

Solution, Recording

Aug 3

Lecture 22 Decision Trees

Ch. 23

Aug 4

Lecture 23 Clustering

Ch. 24

Discussion 12 Decision Trees, Clustering

Solution, Recording

Proj 2B Spam II (due Aug 8)

Aug 5

Exam Prep 7 Classifier and Clustering

Solution, Recording

Catch-up section 6

Recording

Week 8

Aug 8

Weekly Check 8

Topical Review Session

Aug 9

Topical Review Session

Aug 10

Topical Review Session

Aug 11

Optional Lecture Neural Networks

Review

Aug 12

Final Exam

Reference Sheet