Principles and Techniques of Data Science
UC Berkeley, Summer 2021
- Please read our course FAQ before contacting staff with questions that might be answered there.
- The Syllabus contains a detailed explanation of how each course component will work this summer, given that the course is being taught entirely online.
- The scheduling of all weekly events is in the Calendar.
- The Zoom links for all live events are in @6 on Piazza.
- Note: The schedule of lectures and assignments is subject to change.
Week 1
- Jun 21
Lecture 1 Introduction, Course Overview
Homework 1 Prerequisites (due Jun 24)
Lab 1 Prerequisite Coding (due Jun 26)
Lab 2 Pandas (due Jun 26)
- Jun 22
Lecture 2 Data Sampling and Probability
- Jun 23
Lecture 3 Estimators and Bias
Discussion 1 Prerequisites, Probability (solutions)
- Jun 24
Lecture 4 Pandas I
Homework 2 Trump Sampling (due Jun 29)
- Jun 25
Live Session 1 (recording)
Week 2
- Jun 28
Lecture 5 Pandas II
Homework 3 Food Safety (due Jul 5)
Lab 3 Data Cleaning and EDA (due Jul 3)
Lab 4 SQL (due Jul 3)
Discussion 2 Probability, Pandas (videos) (solutions)
- Jun 29
Lecture 6 Data Cleaning and EDA
- Jun 30
Lecture 7 Regular Expressions
Discussion 3 Pandas (code) (solutions)
- Jul 1
Lecture 8 SQL
- Jul 2
Live Session 2 (recording)
Week 3
- Jul 5
Independence Day
Homework 4 Tweets (due Jul 8)
Lab 5 Transformations and KDEs (due Jul 10)
Lab 6 Modeling, Summary Statistics, and Loss Functions (due Jul 10)
Discussion 4 SQL, Regex (videos) (solutions)
- Jul 6
Lecture 9 Visualization I
- Jul 7
Lecture 10 Visualization II
Discussion 5 Visualization (solutions)
- Jul 8
Lecture 11 Modeling
Homework 5 Bike Sharing (due Jul 12)
- Jul 9
Live Session 3 (recording) (code)
Week 4
- Jul 12
Lecture 12 Simple Linear Regression
Homework 6 Regression (due Jul 19)
Lab 7 Simple Linear Regression (due Jul 17)
Discussion 6 Modeling and Simple Linear Regression (solutions)
- Jul 13
Lecture 13 Ordinary Least Squares + Geometric Interpretation
- Jul 14
Lecture 14 Feature Engineering
Discussion 7 Midterm Review
- Jul 15
Midterm Exam (9:30 AM - 11:00 AM) (blank) (solutions) (video)
- Jul 16
Live Session 4 (recording)
Week 5
- Jul 19
Lecture 15 Modeling in Context: Fairness in Housing Appraisal
Homework 7 Housing I (due Jul 22)
Lab 8 Multiple Linear Regression and Feature Engineering (due Jul 24)
Lab 9 Feature Engineering and Cross-Validation (due Jul 24)
Discussion 8 Multiple Linear Regression (solutions)
- Jul 20
Lecture 16 Bias and Variance
- Jul 21
Lecture 17 Cross-Validation + Regularization
Discussion 9 Bias-Variance, HCE (solutions)
- Jul 22
Lecture 18 Gradient Descent
Homework 8 Housing II (due Jul 26)
- Jul 23
Live Session 5 (video pt. 1) (code)
Week 6
- Jul 26
Lecture 19 Logistic Regression I
Homework 9 Gradient Descent (due Jul 29)
Lab 10 Logistic Regression (due Jul 31)
Lab 11 Decision Trees (due Jul 31)
Discussion 10 Regularization, Cross-Validation, Gradient Descent (solutions)
- Jul 27
Lecture 20 Logistic Regression II and Classification
- Jul 28
Lecture 21 Decision Trees
Discussion 11 Logistic Regression, Classification (solutions)
- Jul 29
Lecture 22 Inference for Modeling
Homework 10 Spam/Ham I (due Aug 2)
- Jul 30
Week 7
- Aug 2
Lecture 23 Principal Component Analysis
Homework 11 Spam/Ham II (due Aug 5)
Lab 12 Principal Component Analysis (due Aug 7)
Lab 13 Clustering (due Aug 7)
Discussion 12 Decision Trees, Inference (solutions)
- Aug 3
Lecture 24 Clustering I
- Aug 4
Lecture 25 Clustering II
Discussion 13 Principal Component Analysis, Clustering (solutions)
- Aug 5
Homework 12 Principal Component Analysis (due Aug 9)
- Aug 6
Live Session 8 NA