Principles and Techniques of Data Science

UC Berkeley, Summer 2021

  • Please read our course FAQ before contacting staff with questions that might be answered there.
  • The Syllabus contains a detailed explanation of how each course component will work this summer, given that the course is being taught entirely online.
  • The scheduling of all weekly events is in the Calendar.
  • The Zoom links for all live events are in @6 on Piazza.
  • Note: The schedule of lectures and assignments is subject to change.


Week 1

Jun 21

Lecture 1 Introduction, Course Overview

Homework 1 Prerequisites (due Jun 24)

Lab 1 Prerequisite Coding (due Jun 26)

Lab 2 Pandas (due Jun 26)

Jun 22

Lecture 2 Data Sampling and Probability

Jun 23

Lecture 3 Estimators and Bias

Discussion 1 Prerequisites, Probability (solutions)

Jun 24

Lecture 4 Pandas I

Homework 2 Trump Sampling (due Jun 29)

Jun 25

Live Session 1 (recording)

Week 2

Jun 28

Lecture 5 Pandas II

Homework 3 Food Safety (due Jul 5)

Lab 3 Data Cleaning and EDA (due Jul 3)

Lab 4 SQL (due Jul 3)

Discussion 2 Probability, Pandas (videos) (solutions)

Jun 29

Lecture 6 Data Cleaning and EDA

Jun 30

Lecture 7 Regular Expressions

Discussion 3 Pandas (code) (solutions)

Jul 1

Lecture 8 SQL

Jul 2

Live Session 2 (recording)

Week 3

Jul 5

Independence Day (observed)

Homework 4 Tweets (due Jul 8)

Lab 5 Transformations and KDEs (due Jul 10)

Lab 6 Modeling, Summary Statistics, and Loss Functions (due Jul 10)

Discussion 4 SQL, Regex (videos) (solutions)

Jul 6

Lecture 9 Visualization I

Jul 7

Lecture 10 Visualization II

Discussion 5 Visualization (solutions)

Jul 8

Lecture 11 Modeling

Homework 5 Bike Sharing (due Jul 12)

Jul 9

Live Session 3 (recording) (code)

Week 4

Jul 12

Lecture 12 Simple Linear Regression

Homework 6 Regression (due Jul 19)

Lab 7 Simple Linear Regression (due Jul 17)

Discussion 6 Modeling and Simple Linear Regression (solutions)

Jul 13

Lecture 13 Ordinary Least Squares + Geometric Interpretation

Jul 14

Lecture 14 Feature Engineering

Discussion 7 Midterm Review

Jul 15

Midterm Midterm Exam (9:30 AM - 11:00 AM) (blank) (solutions) (video)

Jul 16

Live Session 4 (recording)

Week 5

Jul 19

Lecture 15 Modeling in Context: Fairness in Housing Appraisal

Homework 7 Housing I (due Jul 22)

Lab 8 Multiple Linear Regression and Feature Engineering (due Jul 24)

Lab 9 Feature Engineering and Cross-Validation (due Jul 24)

Discussion 8 Multiple Linear Regression (solutions)

Jul 20

Lecture 16 Bias and Variance

Jul 21

Lecture 17 Cross-Validation + Regularization

Discussion 9 Bias-Variance, HCE (solutions)

Jul 22

Lecture 18 Gradient Descent

Homework 8 Housing II (due Jul 26)

Jul 23

Live Session 5 (video pt. 1) (code)

Week 6

Jul 26

Lecture 19 Logistic Regression I

Homework 9 Gradient Descent (due Jul 29)

Lab 10 Logistic Regression (due Jul 31)

Lab 11 Decision Trees (due Jul 31)

Discussion 10 Regularization, Cross-Validation, Gradient Descent (solutions)

Jul 27

Lecture 20 Logistic Regression II and Classification

Jul 28

Lecture 21 Decision Trees

Discussion 11 Logistic Regression, Classification (solutions)

Jul 29

Lecture 22 Inference for Modeling

Homework 10 Spam/Ham I (due Aug 2)

Jul 30

Live Session 6 (video) (code)

Week 7

Aug 2

Lecture 23 Principal Component Analysis

Homework 11 Spam/Ham II (due Aug 5)

Lab 12 Principal Component Analysis (due Aug 7)

Lab 13 Clustering (due Aug 7)

Discussion 12 Decision Trees, Inference (solutions)

Aug 3

Lecture 24 Clustering I

Aug 4

Lecture 25 Clustering II

Discussion 13 Principal Component Analysis, Clustering (solutions)

Aug 5

Lecture 26 Conclusion (video) (slides)

Homework 12 Principal Component Analysis (due Aug 9)

Aug 6

Live Session 8 NA

Week 8

Aug 9

Review

Aug 10

Review

Aug 11

Review

Aug 12

Final Final Exam (9:30 AM - 12:30 PM) (blank) (solutions)