About Data 100
Combining data, computation, and inferential thinking, data science is redefining how people and organizations solve challenging problems and understand their world. This intermediate level class bridges between Data 8 and upper division computer science and statistics courses as well as methods courses in other fields. In this class, we explore key areas of data science including question formulation, data collection and cleaning, visualization, statistical inference, predictive modeling, and decision making. Through a strong emphasis on data centric computing, quantitative critical thinking, and exploratory data analysis, this class covers key principles and techniques of data science. These include languages for transforming, querying and analyzing data; algorithms for machine learning methods including regression, classification and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing.
- Prepare students for advanced Berkeley courses in data management, machine learning, and statistics by providing the necessary foundation and context
- Enable students to start careers as data scientists by providing experience working with real-world data, tools, and techniques
- Empower students to apply computational and inferential thinking to address real-world problems
While we are working to make this class widely accessible, we currently require the following (or equivalent) prerequisites. Unlike past semesters, prerequisites will be enforced in this class. It is your responsibility to know the material in the prerequisites.
- Foundations of Data Science: Data 8 covers much of the material in Data 100 but at an introductory level. Data 8 provides basic exposure to Python programming and working with tabular data as well as visualization, statistics, and machine learning.
- Computing: The Structure and Interpretation of Computer Programs (CS 61A) or Computational Structures in Data Science (CS 88). These courses provide additional background in Python programming (e.g., for loops, lambdas, debugging, and complexity) that will enable Data 100 to focus more on the concepts in Data Science and less on the details of programming in Python.
- Math: Linear Algebra (Math 54, EE 16A, or Stat 89A). We will need some basic concepts like linear operators, eigenvectors, derivatives, and integrals to enable statistical inference and derive new prediction algorithms. This prerequisite may be satisfied concurrently with Data 100.
Students taking Data C100/C200 come from a wide range of backgrounds. We hope to foster an inclusive and safe learning environment based on curiosity rather than competition. All members of the course community—the instructor, students, and GSIs—are expected to treat each other with courtesy and respect. Some of the responsibility for that lies with the staff, but a lot of it ultimately rests with you, the students.
Be Aware of Your Actions
Sometimes, the little things add up to an unwelcoming culture for some students. For example, you and a friend may think you are sharing in a private joke about other races, genders, or cultures, but this can have adverse effects on classmates who overhear it. There is a great deal of research on something called “stereotype threat,” which finds that simply reminding someone that they belong to a particular culture or share a particular identity (on whatever dimension) can interfere with their course performance.
Stereotype threat works both ways: you can assume that a student will struggle based on who they appear to be, or you can assume that a student is doing great based on who they appear to be. Both are potentially harmful.
Bear in mind that diversity has many facets, some of which are not visible. Your classmates may have medical conditions (physical or mental), personal situations (financial, family, etc.), or interests that aren’t common to most students in the course. Another aspect of professionalism is avoiding comments that (likely unintentionally) put down colleagues for situations they cannot control. Bragging in open space that an assignment is easy or “crazy,” for example, can send subtle cues that discourage classmates who are dealing with issues that you can’t see. Please take care, so we can create a class in which all students feel supported and respected.
Beyond the slips that many of us make unintentionally are a host of behaviors that the course staff, department, and university do not tolerate. These are generally classified under the term harassment; sexual harassment is a specific form that is governed by federal laws known as Title IX.
UC Berkeley’s Title IX website provides many resources for understanding the terms, procedures, and policies around harassment. Make sure you are aware enough of these issues to avoid crossing a line in your interactions with other students. For example, repeatedly asking another student out on a date after they have said no can cross this line.
Your reaction to this topic might be to laugh it off, or to make or think snide remarks about “political correctness” or jokes about consent or other things. You might think people just need to grow a thicker skin or learn to take a joke. This isn’t your decision to make. Research shows the consequences (emotional as well as physical) on people who experience harassment. When your behavior forces another student to focus on something other than their education, you have crossed a line. You have no right to take someone else’s education away from them.
This issue is very important to Data 100’s course staff. Therefore, if we cannot appeal to your decency and collegiality, let us at least appeal to your self-interest. Do not mess around on this matter. It will not go well for you.
Issues with Course Staff
Professionalism and respect for diversity are not just matters between students; they also apply to how the course staff treat the students. The staff of this course will treat you in a way that respects our differences. However, despite our best efforts, we might slip up, hopefully inadvertently. If you are concerned about classroom environment issues created by the staff or overall class dynamic, please feel free to talk to us about it. The instructors and the head GSIs in particular welcome any comments or concerns regarding conduct of the course and the staff.
We are committed to creating a learning environment welcoming of all students that supports a diversity of thoughts, perspectives and experiences and respects your identities and backgrounds (including race, ethnicity, nationality, gender identity, socioeconomic class, sexual orientation, language, religion, ability, and more). To help accomplish this:
- If your name and/or pronouns differ from those that appear in your official records, please let us know.
- If you feel like your performance in the class is being affected by your experiences outside of class (e.g., family matters, current events), please don’t hesitate to come and talk with us. We want to be resources for you.
- We (like many people) are still in the process of learning about diverse perspectives and identities. If something was said in class (by anyone) that made you feel uncomfortable, please talk to us about it.
- While the course staff understands that improving diversity, equity, and inclusion (DEI) is not enough to overcome systemic issues in academia such as racism, queerphobia, and other forms of discrimination and hatred, we also recognize the importance of DEI work.
- The Data Science Department has some resources available at https://data.berkeley.edu/about/diversity-equity-and-inclusion.
- There’s also a great set of resources available at https://eecs.berkeley.edu/resources/students/grievances.
- If there are other resources you think we should list here, let us know!
We will take all complaints about unprofessional or discriminatory behavior seriously.
This fall, Data 100 will be run in a hybrid format. This section details exactly how each component of the course will operate. Here is a high-level view of a typical week in the course:
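| Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|
| Office Hours | Office Hours | Office Hours | Office Hours | Office Hours |
| | Lecture released | | Lecture released | |
| | Lab Section | | | Discussion Section |
| | | | Homework due | Homework released |
| | Lab due | | | Lab released |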
Note that these deadlines are subject to change.
- To see when any live events are scheduled, check the Calendar.
- To see when lectures, discussions, and assignments are released (and due), check the Home Page.
Note: In-person meetings are fully dependent on public health guidelines. We are prepared to hold all course activities online should circumstances demand.
- There are 2 lectures per week.
- Regular lectures will be pre-recorded, in a format that is optimized for online learning (short 5-10 minute videos with optional conceptual problems in between). Lecture videos will be released on the mornings of Tuesday and Thursday.
- Many of these will be from previous semesters, but some will be recorded this fall by the instructors.
- Lecture videos will be posted on YouTube. Each “lecture” will be an HTML page linked on the course website, containing videos and links to slides and code.
- There are “Quick Check” conceptual questions in between each lecture video, linked on the lecture webpage. See below for more details.
- Each lecture will also have a Piazza thread for students to ask questions.
We will have some guest speakers this term, on topics including Human Context and Ethics of Data Science and applications to Climate Change. These lectures will be held live on Zoom, and we strongly encourage you to attend them!
This course has discussion sections on Fridays, lasting for one hour. The goal of these sessions is to work through problems, hone your skills, and flesh out your understanding as part of a team. The problems that you solve and present as part of discussion are important in understanding this material.
To encourage attendance and participation in live discussion, we will offer the option of having discussion contribute to your grade. Specifically, points you earn from attending/participating in discussion can reduce the weighting of homeworks on your overall course grade. See the course policies for more details.
- You must be assigned to a discussion section, even if you don’t intend on attending the section.
- In a typical week, we will release the discussion worksheet on Friday morning.
- We will be holding live discussion sections on Fridays. You will sign up for a section, but attendance will not be required.
- Unlike past semesters, live discussions will not provide physical handouts.
- Attendance points will only be given for the section you are assigned to, but you may attend any of the online sections. However, due to room restrictions, you may not attend an in-person section that you were not assigned to.
- We will release discussion recordings or walkthroughs the week after the discussion.
- These will be videos from past semesters, so they may not be up-to-date with the current content. Unfortunately, we do not have the capacity to record walkthroughs this semester.
Quick Checks, as mentioned above, are short conceptual questions embedded into each lecture, in the form of Google Forms. Quick Checks are not graded. These are meant for you to check your understanding of the concepts that were just introduced.
Homeworks are week-long assignments that are designed to help students develop an in-depth understanding of both the theoretical and practical aspects of ideas presented in lecture.
- In a typical week, the homework is released on Friday and is due the following Thursday at 11:59PM.
- One or two homeworks will be on-paper written assignments; the rest will be Jupyter notebooks.
- Homeworks have both visible and hidden autograder tests. The visible tests are mainly sanity checks, e.g. a probability is <= 1, and are visible to students while they do the assignment. The hidden tests generally check for correctness, and are invisible to students while they are doing the assignment.
- The primary forms of support students will have for homeworks and projects are the office hours we’ll host and Piazza.
- Homeworks must be completed individually.
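To illustrate the difference between visible and hidden tests, here is a minimal sketch in Python. It does not use the course's actual autograder; the function names and the coin-flip question are hypothetical, chosen only to show how an answer can pass a visible sanity check and still fail a hidden correctness check.

```python
def prob_at_least_one_head(n_flips):
    """Hypothetical student answer: chance of at least one head in n_flips fair flips."""
    return 1 - 0.5 ** n_flips


def visible_test(answer):
    """Sanity check only: a probability must lie between 0 and 1."""
    return 0 <= answer <= 1


def hidden_test(answer, n_flips):
    """Correctness check: compare against the expected value (assumed reference solution)."""
    expected = 1 - 0.5 ** n_flips
    return abs(answer - expected) < 1e-9


# A wrong answer can still look plausible to the visible test.
wrong_answer = 0.5
print(visible_test(wrong_answer))      # True: 0.5 is a valid probability
print(hidden_test(wrong_answer, 3))    # False: the correct value for 3 flips is 0.875

# A correct answer passes both.
right_answer = prob_at_least_one_head(3)
print(visible_test(right_answer), hidden_test(right_answer, 3))  # True True
```

The takeaway is that passing every visible test does not guarantee full credit, so it is worth checking your reasoning independently before submitting.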
Labs are shorter programming assignments designed to give students familiarity with new ideas.
- In a typical week, the lab is released on Friday and is due the following Tuesday at 11:59PM.
- All lab autograder tests are visible.
- To help with lab, we will host live lab sections on Tuesday. You will sign up for a section, but attendance will not be required.
- Students can get help with labs in lab section, office hours, and on Piazza. However, the best place to get help is lab section.
- The office hours are listed on the Calendar, and will be held both virtually and in-person.
- Students can come to office hours for any questions on course assignments or material.
- In-person office hours will be held in various locations specified in the Calendar. To adhere to public health guidelines, we ask that students leave the OH room after their questions have been answered.
- Virtual OH can be accessed via oh.ds100.org, where students add themselves to the “queue” and specify the assignment they need help on.
- The instructors will also be hosting office hours. These will be reflected on the Calendar.
There will be one midterm exam, on November 1st (7-9PM PDT).
The exam will be primarily virtual and Zoom-proctored, following campus guidelines. We will offer the option to be proctored in person, but you will still complete the exam online (just proctored in person instead of over Zoom). In-person spots will be given on a case-by-case basis, only to those who need them (form coming soon).
- Two time options will be offered to cover various timezones. No further alternates will be offered.
- DSP students will be offered on-campus exam taking as per their accommodations.
In lieu of a final exam, we will have a final project. The details of this project will be released on TBD, and it will be due TBD (during RRR or finals week).
- You will create 3-person groups, where all the group members are in your discussion section.
- We will offer a few (2-3) dataset options, as well as suggestions for the project.
Students will be allowed to submit regrade requests for the autograded and written portions of assignments in cases in which the rubric was incorrectly applied or the autograder scored their submission incorrectly. Regrades for the written portions of assignments will be handled through Gradescope, and autograder regrades via a Google Form.
Always check that the autograder executes correctly! Gradescope will show you the output of the public tests, and you should see the same results as you did on DataHub. If you see a discrepancy, ensure that you have exported the assignment correctly and, if there is still an issue, post on Piazza as soon as possible.
Regrade requests will not be considered in cases in which:
- a student uploads the incorrect file to the autograder
- the autograder fails to execute and the student does not notify the course staff before the assignment deadline
- a student fails to save their notebook before exporting and uploads an old version to the autograder
- a situation arises in which the course staff cannot ensure that the student’s work was done before the assignment deadline
- a student submits without following the steps outlined in @13
Data C100 Grading Scheme
|Homeworks||35%||13 + 1 optional, with 2 drops|
|Labs||10%||14, with 3 drops|
Data C200 Grading Scheme
|Homeworks||35%||13 + 1 optional, with 2 drops|
For C100/C200 optional: You may shift 5% of your HW grade to discussion attendance via this Google form. There will be 3 drops.
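As a rough illustration of how drops and the optional 5% shift could play out, the sketch below assumes that every assignment in a category is weighted equally, that drops remove your lowest scores, and that scores are fractions between 0 and 1. None of these assumptions is stated in this syllabus, and the function names and sample scores are hypothetical; the course staff's actual computation may differ.

```python
def average_with_drops(scores, n_drops):
    """Average the scores after discarding the n_drops lowest ones (assumed drop rule)."""
    if n_drops >= len(scores):
        raise ValueError("need more scores than drops")
    kept = sorted(scores)[n_drops:]
    return sum(kept) / len(kept)


def homework_points(hw_scores, discussion_scores=None):
    """Percentage points earned toward the course grade from the homework category.

    Without the shift, homework is worth 35 points with 2 drops.
    With the shift, homework is worth 30 points and discussion attendance
    is worth 5 points with 3 drops.
    """
    hw_avg = average_with_drops(hw_scores, 2)
    if discussion_scores is None:
        return 35 * hw_avg
    disc_avg = average_with_drops(discussion_scores, 3)
    return 30 * hw_avg + 5 * disc_avg


# Hypothetical student: 13 homeworks, two of them weak.
hw = [1.0] * 11 + [0.0, 0.5]
print(homework_points(hw))  # 35.0, since both weak scores are dropped

# Same student opting into the shift, having missed 3 of 13 discussions.
attendance = [1.0] * 10 + [0.0] * 3
print(homework_points(hw, attendance))  # 35.0, since the 3 absences are dropped
```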
All assignments are due at 11:59 pm on the due date specified on the syllabus. Gradescope is where all assignments are submitted. Homeworks and labs will not be accepted late. Gradescope may allow you to make late submissions, but you will later be given a 0.
Extensions are only provided to students with DSP accommodations, or in the case of exceptional circumstances. If you have DSP accommodations, you should expect to receive an email from us. Otherwise, email [firstname.lastname@example.org](mailto:email@example.com) to request an extension. If you make a request close to the deadline, we cannot guarantee that you will receive a response before the deadline. Additionally, simply submitting a request does not guarantee you will receive an extension. Even if your work is incomplete, please submit before the deadline so you can receive credit for the work you did complete.
Note that extension requests will not be granted in cases where a student’s local (DataHub) tests are not passing. It is the student’s responsibility to solve such problems in advance of the deadline.
- Projects are marked down by 10% per day, up to two days. After two days, project submissions will not be accepted.
- Submission times are rounded up to the next day. That is, 2 minutes late = 1 day late.
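The late policy for projects can be sketched as follows. This sketch assumes the 10% is deducted from the full score for each late day (rather than compounding) and that a submission exactly at the deadline counts as on time; the function name and the example deadline are hypothetical.

```python
import math
from datetime import datetime, timedelta

SECONDS_PER_DAY = 24 * 60 * 60


def project_percent_kept(deadline, submitted):
    """Return the percent of credit kept, or None if the submission is not accepted."""
    if submitted <= deadline:
        return 100
    seconds_late = (submitted - deadline).total_seconds()
    # Lateness is rounded up to the next whole day: 2 minutes late counts as 1 day.
    days_late = math.ceil(seconds_late / SECONDS_PER_DAY)
    if days_late > 2:
        return None  # more than two days late: not accepted
    return 100 - 10 * days_late


deadline = datetime(2021, 12, 10, 23, 59)  # hypothetical deadline, for illustration only
print(project_percent_kept(deadline, deadline + timedelta(minutes=2)))          # 90
print(project_percent_kept(deadline, deadline + timedelta(days=1, minutes=2)))  # 80
print(project_percent_kept(deadline, deadline + timedelta(days=2, minutes=2)))  # None
```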
Collaboration Policy and Academic Dishonesty
We will be following the EECS departmental policy on Academic Honesty, which states that using work or resources that are not your own or not permitted by the course may lead to disciplinary actions, up to and including a failing grade in the course.
Data science is a collaborative activity. While you may talk with others about the homework, we ask that you write your solutions individually in your own words. If you do discuss the assignments with others, please include their names at the top of your notebook. Keep in mind that content from assignments will likely be covered on both the midterm and final.
If we suspect that you have submitted plagiarized work, we will call you in for a meeting. If we then determine that plagiarism has occurred, we reserve the right to give you a negative full score (-100%) or lower on the assignments in question, along with reporting your offense to the Center for Student Conduct.
Rather than copying someone else’s work, ask for help. You are not alone in this course! The entire staff is here to help you succeed. If you invest the time to learn the material and complete the assignments, you won’t need to copy any answers. (taken from 61A)
We also ask that you do not post your assignment solutions publicly.
Cheating on exams is a serious offense. We have methods of detecting cheating on exams – so don’t do it! Students caught cheating on any exam will fail this course.
We want you to succeed!
If you are feeling overwhelmed, visit our office hours and talk with us. We know college can be stressful – and especially so during the COVID-19 pandemic – and we want to help you succeed.
Important Note: We are committed to being a resource to you, but it is important to note that all members of the teaching staff for this course are responsible employees, meaning that we must disclose any incidents of sexual harassment or violence to campus authorities. If you would like to speak to a confidential advocate, please consider reaching out to the Berkeley PATH to Care Center.
Basic Needs Center
The Basic Needs Center (lower level of MLK Student Union, Suite 72) provides support with all the essential resources needed to not only survive, but thrive here at UC Berkeley. Their mission is to support you and work together towards justice and belonging for all. They define Basic Needs as the essential resources that impact your health, belonging, persistence, and overall well being. It is an ecosystem that includes: nutritious food, stable housing, hygiene, transportation, healthcare, mental wellness, financial sustainability, sleep, and emergency dependent services. They refuse to accept hunger, homelessness, and all other basic needs injustices as part of our university.
Berkeley International Office (BIO)
The Berkeley International Office (2299 Piedmont Avenue, 510-642-2818) provides advising and support for international students and scholars at UC Berkeley.
Center for Access to Engineering Excellence (CAEE)
The Center for Access to Engineering Excellence (Bechtel Engineering Center 227) is an inclusive center that offers study spaces, nutritious snacks, and tutoring in >50 courses for Berkeley engineers and other majors across campus. The Center also offers a wide range of professional development, leadership, and wellness programs, and loans iclickers, laptops, and professional attire for interviews.
Counseling and Psychological Services (CAPS)
The staff of the UHS Counseling and Psychological Services (Tang Center, 2222 Bancroft Way; 510-642-9494; for after-hours support, please call the 24/7 line at 855-817-5667) provides confidential, brief counseling and crisis intervention to students with personal, academic and career stress. Services are provided by a multicultural group of professional counselors including psychologists, social workers, and advanced level trainees. All undergraduate and graduate students are eligible for CAPS services, regardless of insurance coverage.
To improve access for engineering students, a licensed psychologist from the Tang Center also holds walk-in appointments for confidential counseling in Bechtel Engineering Center 241 (check here for schedule).
COVID-19 Resources and Support
You can find UC Berkeley’s COVID-19 resources and support here.
Disabled Students’ Program (DSP)
The Disabled Students’ Program (260 César Chávez Student Center #4250; 510-642-0518) serves students with disabilities of all kinds, including mobility impairments; blind or low vision; deaf or hard of hearing; chronic illnesses (chronic pain, repetitive strain injuries, brain injuries, AIDS/HIV, cancer, etc.); psychological disabilities (bipolar disorder, severe anxiety or depression, etc.); Attention Deficit Disorder/Attention Deficit Hyperactivity Disorder; and Learning Disabilities. Services are individually designed and based on the specific needs of each student as identified by DSP’s Specialists. The Program’s official website includes information on DSP staff, UCB’s disabilities policy, application procedures, campus access guides for most university buildings, and portals for students and faculty.
Educational Opportunity Program (EOP)
The Educational Opportunity Program (Cesar Chavez Student Center 119; 510-642-7224) at Cal has provided first generation and low income college students with the guidance and resources necessary to succeed at the best public university in the world. EOP’s individualized academic counseling, support services, and extensive campus referral network help students develop the unique gifts and talents they each bring to the university while empowering them to achieve.
Gender Equity Resource Center (GenEq)
The Gender Equity Resource Center, fondly referred to as GenEq, is a UC Berkeley campus community center committed to fostering an inclusive Cal experience for all. GenEq is the campus location where students, faculty, staff and Alumni connect for resources, services, education and leadership programs related to gender and sexuality. The programs and services of the Gender Equity Resource Center are focused into four key areas: women; lesbian, gay, bisexual, and transgender (LGBT); sexual and dating violence; and hate crimes and bias driven incidents. GenEq strives to provide a space for respectful dialogue about sexuality and gender; illuminate the interrelationship of sexism, homophobia and gender bias and violence; create a campus free of violence and hate; provide leadership opportunities; advocate on behalf of survivors of sexual, hate, dating and gender violence; foster a community of women and LGBT leaders; and be a portal to campus and community resources on LGBT, Women, and the many intersections of identity (e.g., race, class, ability, etc.).
Multicultural Education Program
The Multicultural Education Program (MEP) is one of six initiatives funded by the Evelyn and Walter Haas, Jr. Fund to work towards institutional change and to create a positive campus climate for diversity. The MEP is a five-year initiative to establish a sustainable infrastructure for activities, like educational consultation and diversity workshops, that both address specific topics and cater to group needs across the campus.
Ombudsperson for Students
The Ombudsperson for Students (Sproul Hall 102; 510-642-5754) provides a confidential service for students involved in a University-related problem (academic or administrative), acting as a neutral complaint resolver and not as an advocate for any of the parties involved in a dispute. The Ombudsperson can provide information on policies and procedures affecting students, facilitate students’ contact with services able to assist in resolving the problem, and assist students in complaints concerning improper application of University policies or procedures. All matters referred to this office are held in strict confidence. The only exceptions, at the sole discretion of the Ombudsperson, are cases where there appears to be imminent threat of serious harm.
PATH to Care Center
The PATH to Care Center (510-642-1998) offers Confidential Care Advocates providing affirming, empowering, and confidential support for survivors and those who have experienced gendered violence, including sexual harassment, dating and intimate partner violence, sexual assault, stalking, and sexual exploitation. Advocates bring a non-judgmental, caring approach to exploring all options, rights, and resources.
Care Line (PATH to Care Center)
The Care Line (510-643-2005) is a 24/7, confidential, free, campus-based resource for urgent support around sexual assault, sexual harassment, interpersonal violence, stalking, and invasion of sexual privacy. The Care Line will connect you with a confidential advocate for trauma-informed crisis support including time-sensitive information, securing urgent safety resources, and accompaniment to medical care or reporting.
Student Advocate’s Office
The Student Advocate’s Office (SAO) is an executive, non-partisan office of the ASUC. We offer free, confidential casework services and resources to any student(s) navigating issues with the University, including academic, conduct, financial aid, and grievance concerns. All support is centered around students and aims for an equity-based approach.
Social Services
Social Services provides confidential services and counseling to help students manage problems that can emerge from illness, such as financial, academic, legal, and family concerns. They specialize in helping students with pregnancy resources and referrals; alcohol/drug problems related to one’s own or a family member’s use; sexual assault/rape; relationship or other violence; and support for health concerns (new diagnoses or ongoing conditions). Social Services staff will assess a student’s immediate needs, work with the student to develop a plan to meet those needs, facilitate arrangements with academic departments, advocate for the student with other campus offices and community agencies, and coordinate services within UHS.
Student Learning Center (SLC)
As the primary academic support service for undergraduates at UC Berkeley, the Student Learning Center (510-642-7332) assists students in transitioning to Cal, navigating the academic terrain, creating networks of resources, and achieving academic, personal, and professional goals. Through various services including tutoring, study groups, workshops, and courses, SLC supports undergraduate students in Biological and Physical Sciences, Business Administration, Computer Science, Economics, Mathematics, Social Sciences, Statistics, Study Strategies, and Writing.
Student Technology Equity Program (STEP)
The Student Technology Equity Program connects students in need with laptops, Wi-Fi hotspots, and other required technology.
UC Berkeley Food Pantry
The UC Berkeley Food Pantry (#68 Martin Luther King Student Union) aims to reduce food insecurity among students and staff at UC Berkeley, especially the lack of nutritious food. Students and staff can visit the pantry as many times as they need and take as much as they need while being mindful that it is a shared resource. The pantry operates on a self-assessed need basis; there are no eligibility requirements. The pantry is not for students and staff who need supplemental snacking food, but rather, core food support.
Undocumented Students Program (USP)
The Undocumented Students Program (119 Cesar Chavez Center; 642-7224) practices a holistic, multicultural and solution-focused approach that delivers individualized service for each student. The academic counseling, legal support, financial aid resources and extensive campus referral network provided by USP help students develop the unique gifts and talents they each bring to the university, while empowering a sense of belonging. The program’s mission is to support the advancement of undocumented students within higher education and promote pathways for engaged scholarship.
Course Culture and Resources inspired and adapted with permission from Dr. Sarah Chasins’ Fall 2021 CS 164 Syllabus.