Principles and Techniques of Data Science

UC Berkeley, Spring 2023

Narges Norouzi

norouzi@berkeley.edu

  • Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
  • The Syllabus contains a detailed explanation of how each course component will work this spring. Please take the time to read it.
  • Note: The schedule of lectures and assignments is subject to change.

Schedule

Week 1

Jan 17

Lecture 1 Introduction

Note 1

Lecture Participation 1

Survey Pre-Semester Survey (due Jan 20)

Jan 19

Lecture 2 Pandas I

Note 2

Lecture Participation 2

Jan 20

Lab 1 Prerequisite Refresher (due Jan 24)

Homework 1A Plotting and the Permutation Test (due Jan 26)

Homework 1B Prerequisite Math (due Jan 26)

Week 2

Jan 24

Lecture 3 Pandas II

Note 3

Discussion 1 Prerequisites

Solution

Lecture Participation 3

Jan 26

Lecture 4 Pandas III; EDA and Data Cleaning, Part 1

Note 4

Lecture Participation 4

Jan 27

Lab 2 Pandas (due Jan 31)

Homework 2 Food Safety (due Feb 2)

Week 3

Jan 31

Lecture 5 Data Cleaning and EDA, Part 2

Note 5

Discussion 2 Pandas worksheet, worksheet notebook, groupwork notebook

Sheet Sol, Group Sol

Lecture Participation 5

Feb 2

Lecture 6 Text Wrangling and Regex

Note 6

Lecture Participation 6

Feb 3

Exam prep 1 Pandas

Sol, Notebook

Lab 3 Data Cleaning, EDA, Regex (due Feb 7)

Homework 3 Tweets (due Feb 9)

Week 4

Feb 7

Lecture 7 Visualization I

Note 7

Discussion 3 EDA and Regex worksheet, worksheet notebook

Sol, Notebook Sol

Lecture Participation 7

Feb 9

Lecture 8 Visualization II

Note 8

Lecture Participation 8

Feb 10

Exam prep 2 Data Cleaning, EDA and Regex

Solution, Notebook

Lin Alg Review 1 Linear Algebra Review #1

Lab 4 Visualization, Transformations and KDEs (due Feb 14)

Homework 4 Bike Sharing (due Feb 16)

Week 5

Feb 14

Lecture 9 Sampling

Note 9

Discussion 4 Visualization and Transformation worksheet, worksheet notebook

Solution

Lecture Participation 9

Feb 16

Lecture 10 Intro to Modeling, Simple Linear Regression

Note 10

Lecture Participation 10

Feb 17

Exam prep 3 Visualizations

Solution

Lin Alg Review 2 Linear Algebra Review #2

Lab 5 Modeling, Loss Functions, and Summary Statistics (due Feb 21)

Homework 5A Modeling (due Feb 23)

Homework 5B Modeling Handwritten (LaTeX template) (due Feb 23)

Week 6

Feb 21

Lecture 11 Constant Model, Loss, and Transformations

Note 11

Discussion 5 Sampling and SLR worksheet, notebook

Sol, Notebook Sol

Lecture Participation 11

Feb 23

Lecture 12 Ordinary Least Squares (Multiple Linear Regression)

Note 12

Lecture Participation 12

Feb 24

Exam prep 4 Sampling and SLR

Solution

Lin Alg Review 3 Linear Algebra Review #3 (Linear Regression)

Lab 6 OLS (due Feb 28)

Homework 6 Regression (due Mar 2)

Week 7

Feb 28

Lecture 13 sklearn / Gradient Descent I

Note 13

Discussion 6 Models and OLS worksheet, notebook

Sol, Notebook Sol

Lecture Participation 13

Mar 2

Lecture 14 Gradient Descent II / Feature Engineering

Note 14

Lecture Participation 14

Mar 3

Exam prep 5 Ordinary Least Squares

Solution

Lab 7 Gradient Descent / sklearn (due Mar 7)

Week 8

Mar 7

Lecture 15 Cross-Validation / Regularization

Note 15

Discussion 7 Gradient Descent, Feature Engineering

Solution

Lecture Participation 15

Mar 9

Lecture 16 Regularization + Random Variables

Note 16

Lecture Participation 16

Midterm Exam (7-9 PM)

Mar 10

Lab 8 Model Selection (due Mar 14)

Project A1 Housing (due Mar 16)

Week 9

Mar 14

Lecture 17 Estimators, Bias, and Variance

Note 17

Discussion 8 Cross-Validation, Regularization and Random Variables

Solution

Lecture Participation 17

Mar 16

Lecture 18 Case Study (HCE): CCAO

Note 18

Lecture Participation 18

Mar 17

Exam prep 6 Cross-Validation and Random Variables

Solutions

Lab 9 Probability (due Mar 21)

Project A2 Housing II (due Mar 23)

Week 10

Mar 21

Lecture 19 Case Study: Climate & Physical Data

Note 19

Discussion 9 Housing II and Probability I worksheet, factsheet

Solutions

Lecture Participation 19

Mar 23

Lecture 20 Bias, Variance, and Inference

Note 20

Lecture Participation 20

Mar 24

Exam prep 7 Bias-Variance Tradeoff

Solution

Lab 10 Climate & Inference (due Apr 4)

Homework 7 Probability (due Apr 6)

Spring Break

Mar 28

Spring Break

Mar 30

Spring Break

Week 11

Apr 4

Lecture 21 SQL I

Note 21

Discussion 10 Bias and Variance

Solution

Lecture Participation 21

Apr 6

Lecture 22 SQL II / Data Serialization

Note 22

Lecture Participation 22

Apr 7

Exam prep 8 SQL

Solution

Lab 11 SQL (due Apr 11)

Homework 8 SQL (due Apr 13)

Week 12

Apr 11

Lecture 23 Classification and Logistic Regression I

Note 23

Discussion 11 SQL worksheet, notebook

Solution

Lecture Participation 23

Apr 13

Lecture 24 Logistic Regression II

Note 24

Lecture Participation 24

Apr 14

Exam prep 9 Logistic Regression

Solution

Lab 12 Logistic Regression (due Apr 18)

Project B1 Spam & Ham I (due Apr 20)

Week 13

Apr 18

Lecture 25 PCA I

Note 25

Discussion 12 Logistic Regression

Solution

Lecture Participation 25

Apr 20

Lecture 26 PCA II

Note 26

Lecture Participation 26

Apr 21

Exam prep 10 Logistic Regression II

Solution

Lab 13 PCA (due Apr 25)

Project B2 Spam & Ham II (due Apr 27)

Week 14

Apr 25

Lecture 27 Clustering

Discussion 13 PCA worksheet, notebook

Solution

Lecture Participation 27

Apr 27

Exam prep 11 PCA and Clustering

Solution

Lecture 28 Guest Lecture and Conclusion

Lecture Participation 28

Apr 28

Lab 14 Clustering (due May 2)

Week 16

May 1

RRR

May 2

RRR

May 3

RRR

May 4

RRR

May 5

RRR

Week 17

May 11

Final Exam (8-11 AM)