Principles and Techniques of Data Science

UC Berkeley, Summer 2020

  • All announcements are on Piazza. Make sure you are enrolled and active there.
  • The Syllabus contains a detailed explanation of how each course component will work this summer, given that the course is being taught entirely online.
  • The scheduling of all weekly events is in the Calendar.
  • Zoom links for live events: @11 on Piazza.


Week 1

Jun 22

Lecture 1 Course Overview (slides) (video) (code)

Ch. 1

Discussion 1 Prerequisite Review (video) (solutions)

Homework 1 Prerequisites (due Jun. 24)

Survey 1 Week 1 Survey (due Jun. 24)

Jun 23

Lecture 2 Data Sampling and Probability

Ch. 2

Lab 1 Prerequisite Coding (due Jun. 23)

Jun 24

Lecture 3 Random Variables

Ch. 12.1-12.2

Discussion 2 Random Variables (video) (solutions)

Jun 25

Lecture 4 SQL

Ch. 9

Homework 2 Trump Sampling (due Jun. 28)

Lab 2 SQL (due Jun. 25)

Jun 26

Live Session 1 Random Variables, SQL (video) (notes)

Week 2

Jun 29

Lecture 5 Pandas I

Ch. 3

Discussion 3 SQL (video) (solutions)

Project 1 Food Safety (due Jul. 6)

Survey 2 Week 2 Survey (due Jul. 1)

Jun 30

Lecture 6 Pandas II

Ch. 3

Lab 3 Pandas I (due Jun. 30)

Jul 1

Lecture 7 Data Cleaning and EDA

Ch. 4.1, Ch. 5

Discussion 4 Pandas II (video) (solutions)

Jul 2

Lecture 8 Regular Expressions

Ch. 8

Lab 4 Data Cleaning and EDA (due Jul. 2)

Live Session 2 Pandas Demo (video) (code) (code HTML)

Jul 3

N/A (Holiday)

Week 3

Jul 6

Lecture 9 Visualization I

Ch. 6.1-6.3

Discussion 5 Regex (video) (solutions)

Homework 3 Bike Sharing (due Jul. 12)

Survey 3 Week 3 Survey (due Jul. 8)

Jul 7

Lecture 10 Visualization II

Ch. 6.4-6.6

Lab 5 Transformations and KDEs (due Jul. 7)

Jul 8

Lecture 11 Modeling

Ch. 10

Discussion 6 Visualizations and Transformations (video) (notebook) (solutions)

Jul 9

Exam Midterm 1 (7-8:30PM)

Lab 6 Modeling and Loss Functions (due Jul. 12)

Jul 10

Live Session 3 Ask Me Anything

Homework 4 Trump Tweets (due Jul. 15)

Week 4

Jul 13

Lecture 12 Simple Linear Regression

Ch. 13.1-13.3

Discussion 7 Correlation (video) (solutions)

Survey 4 Week 4 Survey + Midterm 2 Alt. Request (due Jul. 15)

Jul 14

Lecture 13 Ordinary Least Squares

Ch. 13.4

Lab 7 Regression (due Jul. 14)

Jul 15

Lecture 14 Feature Engineering

Ch. 14

Discussion 8 Least Squares (video) (solutions)

Jul 16

Lecture 15 Bias and Variance

Ch. 12.3, Ch. 15.1-15.2

Homework 5 Regression (due Jul. 19)

Lab 8 Feature Engineering (due Jul. 16)

Jul 17

Live Session 4 Linear Models (video) (code) (code HTML)

Week 5

Jul 20

Lecture 16 Regularization & Cross-Validation

Ch. 16, Ch. 15.3

Discussion 9 Bias-Variance & Cross-Validation (video) (solutions)

Homework 6 Predicting Housing Prices (due Jul. 22)

Survey 5 Week 5 Survey (due Jul. 22)

Jul 21

Lecture 17 Gradient Descent

Ch. 11

Lab 9 Cross-Validation (due Jul. 21)

Jul 22

Lecture 18 Logistic Regression I

Ch. 17.1-17.3

Discussion 10 Gradient Descent & Logistic Regression (video) (solutions)

Jul 23

Lecture 19 Logistic Regression II, Classification

Ch. 17.4-17.7

Homework 7 Gradient Descent & Logistic Regression (due Jul. 29)

Lab 10 Logistic Regression (due Jul. 23)

Jul 24

Live Session 5 Midterm 2 Review Session (12-2PM)

Week 6

Jul 27

Exam Midterm 2 (7-8:30PM)

Discussion 11 Cross-Entropy Loss and Classification (video) (solutions)

Survey 6 Week 6 Survey (due Jul. 31)

Jul 28

Lecture 20 Decision Trees

Lab 11 Decision Trees & Random Forests (due Jul. 30)

Homework 9 Predicting Taxi Ride Durations (due Aug. 9)

Jul 29

Lecture 21 Inference for Modeling

Ch. 18.1, Ch. 18.3

Discussion 12 Decision Trees & Random Forests (video) (solutions)

Project 2 Spam/Ham Classification (due Aug. 5)

Jul 30

Lecture 22 Dimensionality Reduction

Lab 12 Bootstrap (due Jul. 31)

Jul 31

Live Session 6 Midterm 2 Recap (video)

Week 7

Aug 3

Lecture 23 Principal Component Analysis

Discussion 13 PCA (video) (solutions)

Survey 7 Week 7 Survey (due Aug. 5)

Aug 4

Lecture 24 Clustering

Lab 13 PCA (due Aug. 4)

Aug 5

Lecture 25 Big Data

Discussion 14 Clustering & Big Data (video) (solutions)

Homework 8 PCA (due Aug. 9)

Aug 6

Lecture 26 Conclusion (slides) (video) (code) (code HTML)

Lab 14 Clustering (due Aug. 6)

Aug 7

Live Session 7 Decision Trees and PCA (video) (notes) (code) (code HTML)

Week 8

Aug 10

Guest Lecture Prof. Ziad Obermeyer (slides) (video)

Discussion Topical Review

Survey 8 Week 8 Survey (due Aug. 11)

Aug 11

Discussion Topical Review

Aug 12

Exam Final Part 1 (7-8:30PM)

Aug 13

Exam Final Part 2 (7-8:30PM)

Aug 14

Guest Lecture Careers in Data Science Panel (video)