import numpy as np
import pandas as pd
df = pd.read_csv("enrollments.csv")
df.head()
# Keep only the columns used in the exercises below.
df = df[["Term", "Subject", "Number", "Title", "Enrollment Cnt", "Instructor"]]
df.head()
Try to find all Spring offerings of this course. Note that this dataset contains only Spring offerings, so there is no need to filter by semester. The official "Number" for this class is "C100".
df[df["Number"] == "C100"]
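To see what the filter above does in isolation, here is a minimal sketch on a toy frame. The rows are made up for illustration and are not taken from enrollments.csv; the names `toy`, `mask`, and `c100` are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows; the real data comes from enrollments.csv.
toy = pd.DataFrame({
    "Term": ["Spring 2016", "Spring 2017", "Spring 2017"],
    "Subject": ["Computer Science", "Computer Science", "English"],
    "Number": ["C100", "C100", "R1A"],
    "Enrollment Cnt": [120, 150, 17],
})

# The comparison yields a boolean Series, one True/False per row.
mask = toy["Number"] == "C100"

# Indexing the frame with that Series keeps only the rows marked True.
c100 = toy[mask]
print(c100)
```

If more than one subject happened to use the same course number, the mask could be narrowed by combining conditions, for example `(toy["Number"] == "C100") & (toy["Subject"] == "Computer Science")`. Whether that is needed depends on the real data.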
Create a series where each row corresponds to one subject (e.g. English), and each value is the average enrollment of courses in that subject. For example, your series might have a row saying that the average number of students in a Computer Science class is 88.
enrollment_grouped_by_subject = df["Enrollment Cnt"].groupby(df["Subject"])
enrollment_grouped_by_subject.mean()
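The grouping above can be checked by hand on a tiny example. The sketch below uses made-up numbers (chosen so the Computer Science average comes out to 88) and also shows an equivalent spelling that groups the frame first and then selects the column. The names `toy`, `by_series`, and `by_frame` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, not taken from enrollments.csv.
toy = pd.DataFrame({
    "Subject": ["Computer Science", "Computer Science", "English", "English"],
    "Enrollment Cnt": [100, 76, 20, 30],
})

# Same pattern as above: group one column by the values of another.
by_series = toy["Enrollment Cnt"].groupby(toy["Subject"]).mean()

# Equivalent spelling: group the whole frame, then pick the column.
by_frame = toy.groupby("Subject")["Enrollment Cnt"].mean()

print(by_series)
print(by_series["Computer Science"])  # 88.0, i.e. (100 + 76) / 2
```

Both forms return a Series indexed by subject. Which one to use is mostly a matter of taste.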
Create a multi-indexed series where each row corresponds to one subject (e.g. English) offered during one semester (e.g. Spring 2017), and each value is the maximum enrollment among courses in that subject during that semester. For example, you might have a row saying that the maximum number of students in a Computer Science course during Spring 2012 was 575.
enrollment_grouped_by_subject_and_term = df["Enrollment Cnt"].groupby([df["Subject"], df["Term"]])
enrollment_grouped_by_subject_and_term.max()
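Once the result has a two-level index, it helps to know how to read values back out of it. The sketch below is one way to do that on made-up data. The names `toy`, `max_by_subject_term`, `cs_2016`, and `wide` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, not taken from enrollments.csv.
toy = pd.DataFrame({
    "Term": ["Spring 2016", "Spring 2016", "Spring 2017", "Spring 2017"],
    "Subject": ["Computer Science", "Computer Science", "Computer Science", "English"],
    "Enrollment Cnt": [300, 575, 410, 25],
})

# Grouping by a list of two keys produces a Series with a (Subject, Term) MultiIndex.
max_by_subject_term = toy["Enrollment Cnt"].groupby([toy["Subject"], toy["Term"]]).max()

# A (subject, term) tuple selects a single value from the two-level index.
cs_2016 = max_by_subject_term.loc[("Computer Science", "Spring 2016")]
print(cs_2016)  # 575

# unstack() pivots the inner level (Term) into columns.
# Subject/term pairs with no rows show up as NaN.
wide = max_by_subject_term.unstack()
print(wide)
```

The wide form is often easier to scan by eye, while the multi-indexed Series is usually more convenient for further grouping or filtering.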
Try to compute the size of the largest class ever taught by each instructor. This challenge is stated more vaguely on purpose. You'll have to decide what the data structure looks like. Your result should be sorted in decreasing order of class size.
enrollment_grouped_by_instructor = df["Enrollment Cnt"].groupby(df["Instructor"])
enrollment_grouped_by_instructor.max().sort_values(ascending=False)
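A Series of maxima, as above, is one reasonable answer. If you also want to know which course produced each instructor's largest class, one possible alternative is to look up the full rows with `idxmax`. This is only a sketch on made-up data; the instructor labels and the names `toy`, `largest`, and `rows` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, not taken from enrollments.csv.
toy = pd.DataFrame({
    "Title": ["Data Science", "Algorithms", "Poetry", "Databases"],
    "Instructor": ["Instructor A", "Instructor B", "Instructor C", "Instructor A"],
    "Enrollment Cnt": [500, 350, 20, 200],
})

# Largest class per instructor, biggest first (same idea as the cell above).
largest = toy.groupby("Instructor")["Enrollment Cnt"].max().sort_values(ascending=False)
print(largest)

# idxmax gives the row label of each instructor's largest class;
# .loc then retrieves those full rows, including the course title.
rows = toy.loc[toy.groupby("Instructor")["Enrollment Cnt"].idxmax()]
rows = rows.sort_values("Enrollment Cnt", ascending=False)
print(rows[["Instructor", "Title", "Enrollment Cnt"]])
```

Note that `idxmax` keeps only the first row when an instructor has several classes tied for the largest size.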