Here is a collection of resources that will help you learn more about the concepts and skills covered in the class. Learning by reading is a key part of being a well-rounded data scientist. We will not assign mandatory reading but instead encourage you to look at these and other materials. If you find something helpful, post it on EdStem, and consider contributing it to the course website.

Optional Supplementary Textbook

Alongside each lecture are optional readings from the Data 100 textbook, Principles and Techniques of Data Science. Textbook readings are purely supplementary; they may contain material that is out of scope and may not be comprehensive. The textbook is actively in development during Spring 2022, so some readings may become out of date or be reordered as the semester progresses. If you see a reading on our schedule that no longer exists, don’t hesitate to send a pull request to our course GitHub (see below).

Exam Resources

Semester | Midterm (1) | Midterm 2 | Final
Spring 2022 | Exam (Solutions) | Exam (Solutions) | Exam (Solutions)
Fall 2021 | Exam (Solutions) | |
Summer 2021 | Exam (Solutions) [Video] | | Exam (Solutions)
Spring 2021 | Exam (Solutions) | | Exam (Solutions)
Fall 2020 | Exam (Solutions) | | Exam (Solutions)
Summer 2020 | Exam (Solutions) | Exam (Solutions) | Exam (Solutions)
Spring 2020 | Checkpoint (Solutions) | | N/A
Fall 2019 | Exam (Solutions) | Exam (Solutions) | Exam (Solutions)
Summer 2019 | Exam (Solutions) [Video] | | Exam (Solutions)
Spring 2019 | Exam (Solutions) [Video] | Exam (Solutions) [Video] | Exam (Solutions)
Fall 2018 | Exam (Solutions) | | Exam (Solutions)
Spring 2018 | Exam (Solutions) | | Exam (Solutions) [Video]
Fall 2017 | Exam (Solutions) [Video] | | Exam (Solutions)
Spring 2017 | Exam (Solutions) | | Exam (Solutions)

Spring 2022 Final Reference Sheet

Spring 2022 Midterm 2 Reference Sheet

Spring 2022 Midterm 1 Reference Sheet

Spring 2020 Checkpoint Reference Sheet

Fall 2019 Midterm 1 Reference Sheet

Spring 2019 Midterm 1 Reference Sheet

Course Website

We will post all lecture materials on the course syllabus. They will also be listed in the following publicly visible GitHub repo.

You can send us changes to the course website by forking the course website GitHub repository and sending a pull request. You will then become part of the history of Data 100 at Berkeley.

Local Setup

NOTE: This section is out of date and no longer supported by the course staff.

Click here to read our guide on how to set up our development environment locally (as an alternative to using DataHub). Please note that any autograder tests will only work on DataHub.

Coding and Probability Resources


SQL Resources

  • We’ve assembled some SQL Review Slides to help you brush up on SQL.
  • We’ve also compiled a list of SQL practice problems, which can be found here, along with their solutions.
  • This SQL Cheat Sheet is an awesome resource that was created by Luke Harrison, a former Data 100 student.

Probability Practice

  • We’ve compiled a few practice probability problems that we believe may help in understanding the ideas covered in the course. They can be found here, along with their solutions.
  • We’d also like to point you to the textbook for Stat 88, an introductory probability course geared towards data science students at Berkeley.

Regex Practice

  • We’ve organized some regular expression (regex) problems in a notebook format to help you get extra practice. They can be found here, along with their solutions.
  • The official Python 3 regex guide is a good reference: link
  • DS100 Reference Sheet

LaTeX Tips

Other Web References

As a data scientist, you will often need to search for information on various libraries and tools. In this class we will be using several key Python libraries. Here are their documentation pages:


Because data science is a relatively new and rapidly evolving discipline, there is no single ideal textbook for this subject. Instead, we plan to use readings from a collection of books, all of which are free. We have also listed a few optional books that provide additional context for those who are interested.

Data Science Education

Interested in bringing the Data Science major or curriculum to your academic institution? Please fill out this form if you would like support from Berkeley in offering some variant of our Data Science courses at your institution (or just to let us know that you’re interested). Information about the courses appears on the Data 8 and Data 100 websites. Please note that this form is only for instructors. If you are only interested in learning Python or data science, please look at our Data 8 or Data 100 websites mentioned above.