Principles and Techniques of Data Science
UC Berkeley, Summer 2020
- All announcements are on Piazza. Make sure you are enrolled and active there.
- The Syllabus contains a detailed explanation of how each course component will work this summer, given that the course is being taught entirely online.
- The scheduling of all weekly events is in the Calendar.
- Zoom links for live events: @11 on Piazza.
Week 1
- Jun 22
Discussion 1 Prerequisite Review (video) (solutions)
Homework 1 Prerequisites (due Jun. 24)
Survey 1 Week 1 Survey (due Jun. 24)
- Jun 23
Lecture 2 Data Sampling and Probability
Lab 1 Prerequisite Coding (due Jun. 23)
- Jun 24
Lecture 3 Random Variables
Discussion 2 Random Variables (video) (solutions)
- Jun 25
Lecture 4 SQL
Homework 2 Trump Sampling (due Jun. 28)
Lab 2 SQL (due Jun. 25)
- Jun 26
Week 2
- Jun 29
Lecture 5 Pandas I
Project 1 Food Safety (due Jul. 6)
Survey 2 Week 2 Survey (due Jul. 1)
- Jun 30
Lecture 6 Pandas II
Lab 3 Pandas I (due Jun. 30)
- Jul 1
Lecture 7 Data Cleaning and EDA
- Jul 2
Lecture 8 Regular Expressions
Lab 4 Data Cleaning and EDA (due Jul. 2)
- Jul 3
N/A (Holiday)
Week 3
- Jul 6
Lecture 9 Visualization I
Homework 3 Bike Sharing (due Jul. 12)
Survey 3 Week 3 Survey (due Jul. 8)
- Jul 7
Lecture 10 Visualization II
Lab 5 Transformations and KDEs (due Jul. 7)
- Jul 8
Lecture 11 Modeling
Discussion 6 Visualizations and Transformations (video) (notebook) (solutions)
- Jul 9
Exam Midterm 1 (7-8:30PM)
Lab 6 Modeling and Loss Functions (due Jul. 12)
- Jul 10
Live Session 3 Ask Me Anything
Homework 4 Trump Tweets (due Jul. 15)
Week 4
- Jul 13
Lecture 12 Simple Linear Regression
Discussion 7 Correlation (video) (solutions)
Survey 4 Week 4 Survey + Midterm 2 Alt. Request (due Jul. 15)
- Jul 14
Lecture 13 Ordinary Least Squares
Lab 7 Regression (due Jul. 14)
- Jul 15
Lecture 14 Feature Engineering
Discussion 8 Least Squares (video) (solutions)
- Jul 16
Lecture 15 Bias and Variance
Homework 5 Regression (due Jul. 19)
Lab 8 Feature Engineering (due Jul. 16)
- Jul 17
Week 5
- Jul 20
Lecture 16 Regularization & Cross-Validation
Discussion 9 Bias-Variance & Cross-Validation (video) (solutions)
Homework 6 Predicting Housing Prices (due Jul. 22)
Survey 5 Week 5 Survey (due Jul. 22)
- Jul 21
Lecture 17 Gradient Descent
Lab 9 Cross-Validation (due Jul. 21)
- Jul 22
Lecture 18 Logistic Regression I
Discussion 10 Gradient Descent & Logistic Regression (video) (solutions)
- Jul 23
Lecture 19 Logistic Regression II, Classification
Homework 7 Gradient Descent & Logistic Regression (due Jul. 29)
Lab 10 Logistic Regression (due Jul. 23)
- Jul 24
Live Session 5 Midterm 2 Review Session (12-2PM)
Week 6
- Jul 27
Exam Midterm 2 (7-8:30PM)
Discussion 11 Cross-Entropy Loss and Classification (video) (solutions)
Survey 6 Week 6 Survey (due Jul. 31)
- Jul 28
Lecture 20 Decision Trees
Lab 11 Decision Trees & Random Forests (due Jul. 30)
Homework 9 Predicting Taxi Ride Durations (due Aug. 9)
- Jul 29
Lecture 21 Inference for Modeling
Discussion 12 Decision Trees & Random Forests (video) (solutions)
Project 2 Spam/Ham Classification (due Aug. 5)
- Jul 30
Lecture 22 Dimensionality Reduction
Lab 12 Bootstrap (due Jul. 31)
- Jul 31
Live Session 6 Midterm 2 Recap (video)
Week 7
- Aug 3
Lecture 23 Principal Component Analysis
Survey 7 Week 7 Survey (due Aug. 5)
- Aug 4
Lecture 24 Clustering
Lab 13 PCA (due Aug. 4)
- Aug 5
Lecture 25 Big Data
Discussion 14 Clustering & Big Data (video) (solutions)
Homework 8 PCA (due Aug. 9)
- Aug 6
Lab 14 Clustering (due Aug. 6)
- Aug 7
Live Session 7 Decision Trees and PCA (video) (notes) (code) (code HTML)
Week 8
- Aug 10
Discussion Topical Review
Survey 8 Week 8 Survey (due Aug. 11)
- Aug 11
Discussion Topical Review
- Aug 12
Exam Final Part 1 (7-8:30PM)
- Aug 13
Exam Final Part 2 (7-8:30PM)
- Aug 14
Guest Lecture Careers in Data Science Panel (video)