⚠️ This content is archived as of March 2026 and is retained exclusively for reference.
Principles and Techniques of Data Science
UC Berkeley, Spring 2022
Lecture Zoom, Discussion Sign-Up/Zoom, Office Hour/Lab Help
- Lecture is hybrid: in-person in Li Ka Shing 245 and online via Zoom (see link above). Recordings will be posted within 12 hours of live lecture.
- Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
- Join Ed: here.
- Textbook readings are optional and actively in development. See the Resources page for more details.
- Note: The schedule of lectures and assignments is subject to change.
Schedule
Week 1
- Jan 18
Lecture 1 Course Overview
Weekly Check 1 (due Jan 24)
- Jan 20
Lecture 2 Sampling and Probability
- Jan 21
Lab 1 Prerequisite Coding (due Jan 25)
Homework 1 Intro + Prerequisites (due Jan 27)
Week 2
- Jan 24
Weekly Check 2 (due Jan 31)
- Jan 25
Lecture 3 Pandas I
Textbook: Pandas Reference Table
Reference: Pandas API Documentation
- Jan 27
Lecture 4 Pandas II
- Jan 28
Discussion 2 Sampling and Probability, Pandas (code) (solutions) (recording)
Lab 2 Pandas (due Feb 1)
Homework 2 Food Safety (due Feb 3)
Week 3
- Jan 31
Weekly Check 3 (due Feb 7)
- Feb 1
Lecture 5 Data Wrangling, EDA
- Feb 3
Lecture 6 Regex
- Feb 4
Discussion 3 Pandas, Data Cleaning (code) (solutions) (recording)
Lab 3 Data Cleaning (due Feb 8)
Homework 3 Tweets (due Feb 10)
Week 4
- Feb 7
Weekly Check 4 (due Feb 14)
- Feb 8
Lecture 7 Visualization I
Textbook: Seaborn Reference Table
Textbook: Matplotlib Reference Table
- Feb 10
Lecture 8 Visualization II
- Feb 11
Discussion 4 Regex, Visualization (solutions) (recording)
Lab 4 Transformations and KDE (due Feb 15)
Homework 4 Bike Sharing (due Feb 17)
Week 5
- Feb 14
Weekly Check 5 (due Feb 21)
- Feb 15
- Feb 17
Lecture 10 Constant Model, Loss, and Transformations
- Feb 18
Discussion 5 Modeling and Simple Linear Regression (solutions) (recording)
Lab 5 Modeling, Summary Statistics, Loss Functions (due Feb 22)
Homework 5 Regression (on paper) (LaTeX Template) (due Mar 3)
Week 6
- Feb 21
Weekly Check 6 (due Feb 28)
- Feb 22
Lecture 11 Ordinary Least Squares (Multiple Linear Regression)
- Feb 24
Midterm 1 (8-10 pm) (No Lecture)
- Feb 25
Discussion 6 Ordinary Least Squares (solutions) (recording)
Lab 6 Ordinary Least Squares (due Mar 1)
Week 7
- Feb 28
Weekly Check 7 (due Mar 7)
- Mar 1
Lecture 12 Gradient Descent, sklearn
- Mar 3
Lecture 13 Feature Engineering
- Mar 4
Discussion 7 Human Contexts in Engineering and Feature Engineering (solutions) (recording)
Lab 7 Gradient Descent and sklearn (due Mar 8)
Proj 1A Housing I (due Mar 10)
Week 8
- Mar 7
Weekly Check 8 (due Mar 14)
- Mar 8
- Mar 10
Lecture 15 Cross-Validation and Regularization
- Mar 11
Discussion 8 HCE Part 2, Regularization (Budget Fact Sheet) (solutions) (recording)
Lab 8 Model Selection, Regularization, and Cross-Validation (due Mar 15)
Proj 1B Housing II (due Mar 18, originally Mar 17; see Ed post)
Week 9
- Mar 14
Weekly Check 9 (due Mar 28)
- Mar 15
Lecture 16 Probability I: Random Variables
- Mar 17
- Mar 18
Discussion 9 Cross-Validation + Probability I (solutions) (recording)
Lab 9 Probability and Modeling (due Mar 29)
Homework 6 Probability and Estimators (due Mar 31)
Spring Break
- Mar 22
Spring Break
- Mar 24
Spring Break
Week 10
- Mar 28
Weekly Check 10 (due Apr 4)
- Mar 29
Lecture 18 SQL I
- Mar 31
Lecture 19 SQL II and PCA
- Apr 1
Discussion 10 Probability II + SQL I (solutions) (recording)
Lab 10 SQL (due Apr 6, originally Apr 5)
Homework 7 SQL and PCA (due Apr 14)
- Apr 2
Grad Project Released
Week 11
- Apr 4
Weekly Check 11 (due Apr 11)
- Apr 5
Lecture 20 PCA II
- Apr 7
Midterm 2 (7-8:30 pm) (No Lecture)
- Apr 8
Discussion 11 SQL II + PCA (solutions) (Cancelled, No Live Discussion)
Lab 11 PCA (due Apr 12)
Week 12
- Apr 11
Weekly Check 12 (due Apr 18)
- Apr 12
Lecture 21 Classification and Logistic Regression I
- Apr 14
Lecture 22 Logistic Regression II
- Apr 15
Discussion 12 Logistic Regression I (solutions) (recording)
Lab 12 Logistic Regression (due Apr 19)
Proj 2A Spam & Ham I (due Apr 22, originally Apr 21)
Week 13
- Apr 18
Weekly Check 13 (due Apr 25)
- Apr 19
Lecture 23 Decision Trees
- Apr 21
Lecture 24 Clustering
- Apr 22
Discussion 13 Decision Trees and Random Forests (solutions) (recording)
Lab 13 Decision Trees & Random Forests (due Apr 26)
Proj 2B Spam & Ham II (due Apr 28)
Week 14
- Apr 26
- Apr 27
Grad Project First Draft for Grad Project Due
- Apr 28
Lecture 26 Guest Speaker: Matei Zaharia - Parallel Data Analytics; Conclusion
Weekly Check 14 (due May 5, originally May 2)
- Apr 29
Discussion 14 Clustering and Final Review (solutions)
Lab 14 Clustering (Optional, no due date)
Homework 8 Taxis (Optional, no due date)
Week 15
- May 3
RRR Week
- May 5
RRR Week
Week 16
- May 9
Grad Project Final Draft of Grad Project Due
- May 13
Final Exam (7-10 pm)