Graduate Project

Introduction

The graduate project is offered only to students enrolled in Data 200 or Data 200S. Other students are welcome to explore the questions and datasets in the project for personal learning, but their work will not be graded or counted towards their final grades.

The purpose of the project is to give students experience in both open-ended data science analysis and research in general.

Teamwork

You must work in groups of two or three students. To give everyone experience collaborating on a data science project, individual projects are not allowed. Everyone in the same group will receive the same grade, except in exceptional circumstances.

Milestones and Grading Breakdown

Milestones | Deadline (11:59 PM Pacific) | Event | Deliverables | Submission Link | Grading Weight
Milestone 1 | March 4 | Group Formation + Research Proposal | Project Proposal | Pensive | 5%
Milestone 2 | March 18 | EDA | EDA Write-Up + Notebook | Pensive | 10%
Milestone 3 | April 1 | Mandatory Check-In | Progress Report + Meeting Booking | Pensive | 10%
Milestone 4 | April 15 | Project Report First Draft | Final Report Draft Write-Up | Pensive | 20%
Milestone 5 | April 22 | External Peer Review | External Peer Review | Pensive | 7%
Final Submission | May 6 | Final Project Report | Final Project Report + Presentation Video | Project Report Pensive; CV Predictions Pensive | 42%
Weekly Internal Peer Reviews | Due along with Milestones 2, 3, 4, and Final Submission | Internal Peer Review | Internal Peer Review | Pensive (Please refer to corresponding link each week) | 6%

For each milestone listed above, detailed expectations can be found in the “Milestone” section under the Computer Vision project page. Please refer to these sections for specific requirements and guidelines.

Late Policy

  • No Extensions for Milestones: Milestones must be submitted on time; no extensions are permitted and late submissions are not accepted, because the milestones are crucial for the peer review process.
  • Final Report and Presentation Video: Late submissions incur a 10% penalty per day, up to a maximum of two days. Lateness is rounded up to the next full day (e.g., 2 minutes late counts as 1 day late).
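As an illustration of the arithmetic in the late policy for the final report, here is a minimal Python sketch. It rests on several assumptions that the policy text does not spell out: that the 10% is subtracted from full credit for each day late, that submissions more than two days late are not accepted, and that timestamps are already expressed in Pacific time. The function name late_multiplier and the year used in the example are hypothetical.

```python
import math
from datetime import datetime, timedelta

DAILY_PENALTY = 0.10  # 10% per day late, as stated in the late policy
MAX_LATE_DAYS = 2     # assumption: nothing is accepted beyond two days late


def late_multiplier(deadline, submitted):
    """Return the fraction of credit kept, or None if outside the late window.

    Both arguments are naive datetimes assumed to be in Pacific time.
    """
    if submitted <= deadline:
        return 1.0
    # Any partial day counts as a full day, so round up.
    days_late = math.ceil((submitted - deadline) / timedelta(days=1))
    if days_late > MAX_LATE_DAYS:
        return None
    return 1.0 - DAILY_PENALTY * days_late


# Example: final submission deadline of May 6 at 11:59 PM (year assumed).
deadline = datetime(2026, 5, 6, 23, 59)
two_minutes_late = deadline + timedelta(minutes=2)
print(late_multiplier(deadline, two_minutes_late))
```

Under these assumptions, a report submitted two minutes after the deadline is treated the same as one submitted 23 hours after it: both count as one day late.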

Accessing Datasets

All of the provided datasets can be found in the DataHub directory _shared/bcourses-1551658-readonly. You can access the data directly from DataHub. If you wish to work on the project locally, you can also download the files containing the datasets for each topic by right-clicking on the file in JupyterLab and selecting “Copy Download Link”. If you choose to train more complex models, DataHub might not have enough hardware resources or memory, in which case you can use Google Colab or your local machine. If you would like to use Google Colab, feel free to check out this link to get started.
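To see which files are available before loading anything, a short sketch like the following can be run in a DataHub notebook. It assumes the shared directory sits directly under your home directory, which may not match your DataHub layout; adjust SHARED_DIR if it does not. The helper name list_dataset_files is hypothetical.

```python
from pathlib import Path

# Assumption: the shared directory is located under the home directory.
SHARED_DIR = Path.home() / "_shared" / "bcourses-1551658-readonly"


def list_dataset_files(root):
    """Return sorted relative paths of every file under root.

    Returns an empty list if root does not exist or is not a directory.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Print one relative path per line; prints nothing if the directory is missing.
for name in list_dataset_files(SHARED_DIR):
    print(name)
```

The same function works on a local copy of the data: pass the folder where you saved the downloaded files instead of SHARED_DIR.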

Computer Vision

You are expected to complete all elements of the Computer Vision project, which can be found here.



Copyright ©2026, Regents of the University of California and respective authors.

