Principles and Techniques of Data Science

UC Berkeley, Spring 2023

Narges Norouzi

norouzi@berkeley.edu

  • Frequently Asked Questions: Before posting on the class Ed, please read the class FAQ page.
  • The Syllabus contains a detailed explanation of how each course component will work this Spring. Please take the time to read it.
  • Note: The schedule of lectures and assignments is subject to change.

Schedule

Week 1

Jan 17

Lecture 1 Introduction

Note 1

Lecture Participation 1

Survey Pre-Semester Survey (due Jan 20)

Jan 19

Lecture 2 Pandas I

Note 2

Lecture Participation 2

Jan 20

Lab 1 Prerequisite Refresher (due Jan 24)

Homework 1A Plotting and the Permutation Test (due Jan 26)

Homework 1B Prerequisite Math (due Jan 26)

Week 2

Jan 24

Lecture 3 Pandas II

Note 3

Discussion 1 Prerequisites

Solution

Lecture Participation 3

Jan 26

Lecture 4 Pandas III; EDA and Data Cleaning, Part 1

Note 4

Lecture Participation 4

Jan 27

Lab 2 Pandas (due Jan 31)

Homework 2 Food Safety (due Feb 2)

Week 3

Jan 31

Lecture 5 Data Cleaning and EDA, Part 2

Note 5

Discussion 2 Pandas worksheet, worksheet notebook, groupwork notebook

Worksheet solution, groupwork solution

Lecture Participation 5

Feb 2

Lecture 6 Text Wrangling and Regex

Note 6

Lecture Participation 6

Feb 3

Exam prep 1 Pandas

Solution

Lab 3 Data Cleaning, EDA, Regex (due Feb 7)

Homework 3 Tweets (due Feb 9)

Week 4

Feb 7

Lecture 7 Visualization I

Discussion 3 EDA and Regex worksheet, worksheet notebook

Lecture Participation 7

Feb 9

Lecture 8 Probability & Visualization II

Lecture Participation 8

Feb 10

Exam prep 2 Data Cleaning, EDA and Regex

Lin Alg Review Linear Algebra Review #1

Lab 4 Transformations

Homework 4 Bike Sharing

Week 5

Feb 14

Lecture 9 Sampling and probability II

Discussion 4 Visualization and Transformation

Lecture Participation 9

Feb 16

Lecture 10 Modeling, SLR

Lecture Participation 10

Feb 17

Lab 5 Modeling, Summary Statistics, and Loss Functions

Homework 5 Modeling

Week 6

Feb 21

Lecture 11 Constant model, loss, and transformations

Discussion 5 Probability, Sampling, and SLR

Lecture Participation 11

Feb 23

Lecture 12 OLS (multiple regression)

Lecture Participation 12

Feb 24

Lab 6 OLS

Homework 6 Regression

Week 7

Feb 28

Lecture 13 Gradient descent / sklearn

Discussion 6 Models and OLS

Lecture Participation 13

Mar 2

Lecture 14 Feature Engineering

Lecture Participation 14

Mar 3

Lab 7 Gradient descent / sklearn

Week 8

Mar 7

Lecture 15 Cross-Validation / Regularization

Discussion 7 Gradient Descent, Feature Engineering, Housing I

Lecture Participation 15

Mar 9

Lecture 16 Probability I

Lecture Participation 16

Midterm Exam (7-9 PM)

Mar 10

Lab 8 Model Selection

Project 1A Housing

Week 9

Mar 14

Lecture 17 Probability II

Discussion 8 Cross-Validation and Regularization

Lecture Participation 17

Mar 16

Lecture 18 Case study (HCE): CCAO

Lecture Participation 18

Mar 17

Lab 9 Climate

Project 1B Housing

Week 10

Mar 21

Lecture 19 Case Study: Climate & Physical Data

Discussion 9 Housing II and Probability I

Lecture Participation 19

Mar 23

Lecture 20 Causal Inference and Confounding

Lecture Participation 20

Mar 24

Lab 10 Probability & Inference

Homework 7 Probability

Spring Break

Mar 28

Spring Break

Mar 30

Spring Break

Week 11

Apr 4

Lecture 21 SQL I

Discussion 10 Bias and Variance

Lecture Participation 21

Apr 6

Lecture 22 SQL II / Cloud Data

Lecture Participation 22

Apr 7

Lab 11 SQL

Homework 8 SQL

Week 12

Apr 11

Lecture 23 Logistic regression I

Discussion 11 SQL

Lecture Participation 23

Apr 13

Lecture 24 Logistic regression II

Lecture Participation 24

Apr 14

Lab 12 Logistic regression

Project 2A Spam & Ham I

Week 13

Apr 18

Lecture 25 PCA I

Discussion 12 Logistic Regression

Lecture Participation 25

Apr 20

Lecture 26 PCA II

Lecture Participation 26

Apr 21

Lab 13 PCA

Project 2B Spam & Ham II

Week 14

Apr 25

Lecture 27 KMeans Clustering

Discussion 13 PCA

Lecture Participation 27

Apr 27

Lecture 28 Guest + Closing

Lecture Participation 28

Apr 28

Lab 14 Clustering

Week 16

May 1

RRR

Lab 15 Decision Trees (Optional, no due date)

May 2

RRR

May 3

RRR

May 4

RRR

May 5

RRR

Week 17

May 11

Final Exam (8-11 AM)