Principles and Techniques of Data Science

Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. This intermediate-level class bridges Data8 and upper-division computer science and statistics courses, as well as methods courses in other fields. In this class, we explore key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making. Through a strong emphasis on data-centric computing, quantitative critical thinking, and exploratory data analysis, this class covers key principles and techniques of data science. These include languages for transforming, querying, and analyzing data; algorithms for machine learning methods including regression, classification, and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.

This class is listed as STAT C100.

Important Information:

Goals

Prerequisites

While we are working to make this class widely accessible, we currently require the following (or equivalent) prerequisites:

  1. Foundations of Data Science: Data8 covers much of the material in DS100 but at an introductory level. Data8 provides basic exposure to Python programming and working with tabular data, as well as visualization, statistics, and machine learning.

  2. Computing: The Structure and Interpretation of Computer Programs (CS61A) or Computational Structures in Data Science (CS88). These courses provide additional background in Python programming (e.g., for loops, lambdas, debugging, and complexity) that will enable DS100 to focus more on the concepts of data science and less on the details of programming in Python.

  3. Math: Linear Algebra (Math 54, EE 16a, or Stat89a): We will need some basic concepts like linear operators, eigenvectors, derivatives, and integrals to enable statistical inference and derive new prediction algorithms. This prerequisite may be satisfied concurrently with DS100.

Instructors

Sam Lau

OH: M 11 - 12 (355 Evans)

OH: M 1 - 2 (355 Evans)

(email)

TAs

Manana Hakobyan (Head TA)

OH: Tu 11 - 12 (Evans 426)

OH: We 11 - 12 (Evans 426)

OH: Th 11 - 12 (Evans 426)

OH: Fr 10 - 11 (Evans 426)

(email)

Stephanie Djajadi

Disc: MW 12 - 1 (Evans 334)

Lab: TuTh 12 - 1 (Evans 458)

OH: M 1 - 2 (Evans 426)

OH: Th 1 - 2 (Evans 426)

OH: F 12 - 2 (Evans 426)

(email)

Raguvir Kunani

Disc: MW 11 - 12 (Evans 334)

Lab: TuTh 11 - 12 (Evans 458)

OH: Tu 3 - 5 (Evans 426)

OH: We 2 - 4 (Evans 426)

(email)

Leo Li

Disc: MW 11 - 12 (Evans 340)

Lab: TuTh 11 - 12 (Cory 105)

OH: Fri 2 - 6 (Evans 426)

(email)

Ishaan Srivastava

Disc: MW 1 - 2 (Evans 334)

Lab: TuTh 1 - 2 (Evans 458)

OH: Tu 2 - 4 (Evans 426)

OH: Th 2 - 4 (Evans 426)

(email)