Principles and Techniques of Data Science

Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. This intermediate-level class bridges Data8 and upper-division computer science and statistics courses, as well as methods courses in other fields. In this class, we explore key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making. Through a strong emphasis on data-centric computing, quantitative critical thinking, and exploratory data analysis, this class covers key principles and techniques of data science. These include languages for transforming, querying, and analyzing data; algorithms for machine learning methods including regression, classification, and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.

This class is listed as STAT C100 and as COMPSCI C100.

Important Information:

Lab, Section, and Office Hours Schedules

For official holidays see the academic calendar.

Goals

Prerequisites

While we are working to make this class widely accessible, we currently require the following (or equivalent) prerequisites:

  1. Foundations of Data Science: Data8 covers much of the material in DS100 but at an introductory level. Data8 provides basic exposure to Python programming and working with tabular data, as well as visualization, statistics, and machine learning.

  2. Computing: The Structure and Interpretation of Computer Programs (CS61A) or Computational Structures in Data Science (CS88). These courses provide additional background in Python programming (e.g., for loops, lambdas, debugging, and complexity) that will enable DS100 to focus more on the concepts in data science and less on the details of programming in Python.

  3. Math: Linear Algebra (Math 54, EE 16a, or Stat89a). We will need some basic concepts like linear operators, eigenvectors, derivatives, and integrals to enable statistical inference and derive new prediction algorithms. This prerequisite may be taken concurrently with DS100.

Instructors

Sandrine Dudoit

OH: M 1 - 2 (327 Evans)

OH: Th 2 - 3 (327 Evans)

(email)

John DeNero

OH: M 9 - 10 (781 Soda)

OH: W 10 - 11 (781 Soda)

(email)

TAs

Allen Shen

Disc: 2 - 3 (Dwinelle 251)

Disc: 3 - 4 (Dwinelle 255)

Lab: 2 - 3 (SDH 254)

Lab: 3 - 4 (SDH 254)

OH: Th 2 - 4 (411 Soda)

(email)

Aman Dhar

Disc: 1 - 2 (Etch 3105)

Disc: 2 - 3 (Etch 3105)

Lab: 11 - 12 (Evans 458)

Lab: 12 - 1 (Evans 458)

OH: M 10 - 12 (411 Soda)

(email)

Ananth Agarwal

Disc: 12 - 1 (Dwinelle 242)

Lab: 4 - 5 (Evans 458)

OH: M 3 - 4 (411 Soda)

(email)

Ashley Chien

Disc: 4 - 5 (Dwinelle 243)

Lab: 9 - 10 (Evans B6)

OH: W 10 - 11 (411 Soda)

(email)

Dan Crankshaw

Disc: 10 - 11 (Wheeler 200)

Lab: 3 - 4 (Evans B6)

OH: F 11 - 12 (611 Soda)

(email)

Daniel Zhu

Disc: 12 - 1 (Latimer 105)

Lab: 10 - 11 (Cory 105)

OH: M 2 - 3 (411 Soda)

(email)

Hatim Ezbakhe

Disc: 2 - 3 (Dwinelle 243)

Lab: 3 - 4 (Evans 458)

OH: W 2 - 3 (411 Soda)

(email)

Janaki Vivrekar

Disc: 2 - 3 (Wheeler 200)

Lab: 4 - 5 (Evans B6)

OH: M 12 - 1 (411 Soda)

(email)

Jinkyu Kim

Disc: 3 - 4 (Soda 310)

Disc: 5 - 6 (Evans 9)

Lab: 9 - 10 (Evans 458)

Lab: 3 - 4 (Cory 105)

OH: M 9 - 10 (411 Soda)

OH: W 11 - 12 (411 Soda)

(email)

Junseo Park

Disc: 11 - 12 (Etch 3109)

Lab: 2 - 3 (Evans B6)

OH: M 11 - 12 (411 Soda)

(email)

Manana Hakobyan

Disc: 11 - 12 (Cory 289)

Lab: 10 - 11 (Evans B6)

OH: W 12 - 1 (411 Soda)

(email)

Maxwell Murphy

Disc: 10 - 11 (GPB 103)

Lab: 2 - 3 (Cory 105)

OH: M 5 - 6 (651 Soda)

(email)

Neil Shah

Disc: 10 - 11 (Etch 3113)

Lab: 10 - 11 (SDH 254)

(email)

Philippe Boileau

Disc: 1 - 2 (Evans 9)

Disc: 2 - 3 (Latimer 105)

Lab: 11 - 12 (Cory 105)

Lab: 12 - 1 (Cory 105)

OH: M 9 - 10 (411 Soda)

OH: M 4 - 5 (651 Soda)

(email)

Samir Naqvi

Disc: 3 - 4 (Dwinelle 242)

Lab: 11 - 12 (Evans B6)

OH: M 1 - 2 (411 Soda)

(email)

Sasank Chaganty

Disc: 11 - 12 (Etch 3113)

Lab: 10 - 11 (Evans 458)

OH: M 4 - 5 (651 Soda)

(email)

Shrishti (Sona) Jeswani

Disc: 10 - 11 (Etch 3111)

Disc: 11 - 12 (Etch 3111)

Lab: 1 - 2 (Evans 458)

Lab: 2 - 3 (Evans 458)

OH: F 8 - 10 (341A Soda)

(email)

Simon Mo

Disc: 3 - 4 (Latimer 105)

Disc: 4 - 5 (Latimer 105)

Lab: 12 - 1 (SDH 254)

Lab: 1 - 2 (SDH 254)

OH: M 1 - 3 (411 Soda)

(email)

Sumukh Shivakumar

Disc: 2 - 3 (Wheeler 220)

Lab: 1 - 2 (Cory 105)

OH: F 10 - 11 (611 Soda)

(email)

Suraj Rampure

Disc: 12 - 1 (Etch 3119)

Disc: 1 - 2 (Etch 3119)

Lab: 12 - 1 (Evans B6)

Lab: 1 - 2 (Evans B6)

OH: Th 12 - 2 (411 Soda)

(email)

Tony Hsu

Disc: 12 - 1 (Dwinelle 130)

Lab: 11 - 12 (SDH 254)

OH: W 1 - 2 (411 Soda)

(email)