Data 100: Principles and Techniques of Data Science

Summer 2025 Frequently Asked Questions

Course Description

In UC Berkeley's Data 100, students will explore the data science lifecycle, including question formulation, data collection and cleaning, exploratory data analysis and visualization, statistical inference and prediction, and decision-making. This class will focus on quantitative critical thinking and the key principles and techniques needed to carry out this cycle. These include languages for transforming, querying, and analyzing data; algorithms for machine learning methods including regression, classification, and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.

Offerings

  1. Summer 2025
  2. Spring 2025
  3. Fall 2024
  4. Summer 2024
  5. Spring 2024
  6. Fall 2023
  7. Summer 2023
  8. Spring 2023
  9. Fall 2022
  10. Summer 2022
  11. Spring 2022
  12. Fall 2021
  13. Summer 2021
  14. Spring 2021
  15. Fall 2020
  16. Summer 2020
  17. Spring 2020
  18. Fall 2019
  19. Summer 2019
  20. Spring 2019
  21. Fall 2018
  22. Spring 2018
  23. Fall 2017
  24. Spring 2017

Goals

Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. This intermediate-level class bridges Data 8 and upper-division computer science and statistics courses as well as methods courses in other fields.

  • Prepare students for advanced Berkeley courses in data management (CS 186), machine learning (CS 189), and statistics (Stat 154) by providing the necessary foundation and context

  • Enable students to start careers as data scientists by providing experience working with real-world data, tools, and techniques

  • Empower students to apply computational and inferential thinking to tackle real-world problems

Prerequisites

While we are working to make this class widely accessible, we currently require the following (or equivalent) prerequisites:

  1. Foundations of Data Science: Data 8 covers much of the material in Data 100 but at an introductory level. Data 8 provides basic exposure to Python programming and working with tabular data, as well as visualization, statistics, and machine learning. Alternate course: Introduction to Probability and Statistics: Stat 20

  2. Computing: The Structure and Interpretation of Computer Programs: CS 61A or Computational Structures in Data Science: CS 88. These courses provide additional background in Python programming (e.g., for loops, lambdas, debugging, and complexity) that will enable Data 100 to focus more on the concepts in Data Science and less on the details of programming in Python. Alternate course: Introduction to Computer Programming and Numerical Methods: ENGIN 7

  3. Linear Algebra: Math 54, Math 56, Math 110, EE 16A, or Physics 89. We will need some basic concepts like linear operators and derivatives to enable statistical inference and derive new prediction algorithms. This may be satisfied concurrently with Data 100.