In [1]:
import numpy as np
import pandas as pd
In [2]:
df = pd.read_csv("enrollments.csv")
df.head()
Out[2]:
Term Subject Number CCN Title Type Section# Instructor Meeting Times Rejected Student Cnt Enrollment Limit Waitlist Limit Cnt Enrollment Cnt % Class Filled % Class Filled.1 Waitlist Cnt Include Sb Week Desc
0 2018 Spring Computer Science 9A 35350 Programmers' Matlab Self-paced 1 Carol Marshall; Paul Hilfinger 12:00a-12:00a 23.0 10 100.0 8 80.00000 80.00000 0 Week 04
1 2018 Spring Computer Science 9C 35351 Programmers' C Self-paced 1 Carol Marshall; Paul Hilfinger 12:00a-12:00a 7.0 11 100.0 11 100.00000 100.00000 0 Week 04
2 2018 Spring Computer Science 9E 35352 Productive Unix Use Self-paced 1 Carol Marshall; Paul Hilfinger 12:00a-12:00a 25.0 36 100.0 36 100.00000 100.00000 0 Week 04
3 2018 Spring Computer Science 9F 35353 Programmers' C++ Self-paced 1 Carol Marshall; Paul Hilfinger 12:00a-12:00a 22.0 38 100.0 33 86.84211 86.84211 0 Week 04
4 2018 Spring Computer Science 9G 35354 Programmers' Java Self-paced 1 Carol Marshall; Paul Hilfinger 12:00a-12:00a 8.0 12 100.0 11 91.66667 91.66667 0 Week 04
In [3]:
# Keep only the columns needed for the challenges below.
df = df[["Term", "Subject", "Number", "Title", "Enrollment Cnt", "Instructor"]]
df.head()
Out[3]:
Term Subject Number Title Enrollment Cnt Instructor
0 2018 Spring Computer Science 9A Programmers' Matlab 8 Carol Marshall; Paul Hilfinger
1 2018 Spring Computer Science 9C Programmers' C 11 Carol Marshall; Paul Hilfinger
2 2018 Spring Computer Science 9E Productive Unix Use 36 Carol Marshall; Paul Hilfinger
3 2018 Spring Computer Science 9F Programmers' C++ 33 Carol Marshall; Paul Hilfinger
4 2018 Spring Computer Science 9G Programmers' Java 11 Carol Marshall; Paul Hilfinger

Challenge One

Find all Spring offerings of this course. The dataset contains only Spring offerings, so there is no need to filter by semester. The official "Number" for this class is "C100".

In [4]:
df[df["Number"] == "C100"]
Out[4]:
Term Subject Number Title Enrollment Cnt Instructor
20 2018 Spring Computer Science C100 Princ&Tech Data Sci 610 Fernando Perez; Joseph Gonzalez
387 2017 Spring Computer Science C100 Princ&Tech Data Sci 67 Bin Yu; Deborah Nolan; Joseph Gonzalez; Joseph...
670 2017 Spring Statistics C100 Princ&Tech Data Sci 28 Bin Yu; Deborah Nolan; Joseph Gonzalez; Joseph...
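
The filter above relies on boolean masking: comparing a column to a value yields a True/False Series, and indexing the DataFrame with it keeps the matching rows. The sketch below illustrates this on a small made-up table (the `toy` frame and its values are assumptions for illustration, not rows from enrollments.csv). Because C100 is cross-listed under more than one subject in the output above, a second condition on "Subject" is shown as one way to narrow the result.

```python
import pandas as pd

# Toy data standing in for enrollments.csv (values are made up for illustration).
toy = pd.DataFrame({
    "Term": ["2018 Spring", "2017 Spring", "2017 Spring", "2018 Spring"],
    "Subject": ["Computer Science", "Computer Science", "Statistics", "Computer Science"],
    "Number": ["C100", "C100", "C100", "9A"],
    "Enrollment Cnt": [610, 67, 28, 8],
})

# Comparing a column to a scalar yields a boolean Series aligned with the rows.
is_c100 = toy["Number"] == "C100"

# Indexing with that mask keeps only the rows where it is True.
c100_rows = toy[is_c100]

# Masks combine with & (and) and | (or); each condition needs its own parentheses.
cs_c100_rows = toy[is_c100 & (toy["Subject"] == "Computer Science")]

print(c100_rows)
print(cs_c100_rows)
```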

Challenge Two

Create a series where each row corresponds to one subject (e.g. English), and each value corresponds to the average number of students for courses in that subject. For example, your series might have a row saying that the average number of students in a Computer Science class is 88.

In [5]:
enrollment_grouped_by_subject = df["Enrollment Cnt"].groupby(df["Subject"])
In [6]:
enrollment_grouped_by_subject.mean()
Out[6]:
Subject
Computer Science                 88.073620
Electrical Eng & Computer Sci    21.333333
Electrical Engineering           36.339426
English                          24.386286
Philosophy                       41.516949
Statistics                       60.301724
Name: Enrollment Cnt, dtype: float64
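
The solution above groups one Series by the values of another. A common alternative spelling groups the DataFrame first and then selects the column. The sketch below, on a made-up `toy` frame (an assumption for illustration), suggests the two forms give the same per-subject means.

```python
import pandas as pd

# Toy data (made up for illustration).
toy = pd.DataFrame({
    "Subject": ["English", "English", "Statistics", "Statistics", "Statistics"],
    "Enrollment Cnt": [20, 30, 10, 50, 60],
})

# Form used in the solution: group one Series by the values of another.
by_series = toy["Enrollment Cnt"].groupby(toy["Subject"]).mean()

# Alternative form: group the DataFrame, then select the column to aggregate.
by_frame = toy.groupby("Subject")["Enrollment Cnt"].mean()

print(by_frame)
```

Which form to use is largely a matter of taste; the second reads more naturally when several columns are aggregated from the same grouping.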

Challenge Three

Create a multi-indexed series where each row corresponds to one subject (e.g. English) offered during one semester (e.g. Spring 2017), and each value corresponds to the maximum number of students for courses in that subject during that semester. For example, you might have a row saying that the maximum number of students in a computer science course during Spring 2012 was 575.

In [7]:
enrollment_grouped_by_subject_and_term = df["Enrollment Cnt"].groupby([df["Subject"], df["Term"]])
In [8]:
enrollment_grouped_by_subject_and_term.max()
Out[8]:
Subject                        Term       
Computer Science               2012 Spring     575
                               2013 Spring     529
                               2014 Spring     944
                               2015 Spring     981
                               2016 Spring    1156
                               2017 Spring    1328
                               2018 Spring    1379
Electrical Eng & Computer Sci  2016 Spring      18
                               2017 Spring      30
                               2018 Spring     237
Electrical Engineering         2012 Spring     204
                               2013 Spring     238
                               2014 Spring     384
                               2015 Spring     383
                               2016 Spring     698
                               2017 Spring     775
                               2018 Spring     825
English                        2012 Spring     156
                               2013 Spring     160
                               2014 Spring     109
                               2015 Spring     120
                               2016 Spring     103
                               2017 Spring     141
                               2018 Spring     106
Philosophy                     2012 Spring     160
                               2013 Spring     187
                               2014 Spring     172
                               2015 Spring     197
                               2016 Spring     200
                               2017 Spring     242
                               2018 Spring     236
Statistics                     2012 Spring     316
                               2013 Spring     285
                               2014 Spring     299
                               2015 Spring     444
                               2016 Spring     450
                               2017 Spring     393
                               2018 Spring    1043
Name: Enrollment Cnt, dtype: int64
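
Grouping by two keys produces a Series with a two-level index (Subject, Term). The sketch below shows a few ways one might read values back out of such a Series; the `toy` frame and its numbers are assumptions for illustration only.

```python
import pandas as pd

# Toy data (made up for illustration).
toy = pd.DataFrame({
    "Subject": ["English", "English", "English", "Statistics", "Statistics"],
    "Term": ["2017 Spring", "2017 Spring", "2018 Spring", "2017 Spring", "2018 Spring"],
    "Enrollment Cnt": [20, 35, 30, 50, 60],
})

# Group by both keys; the result is indexed by (Subject, Term).
max_by_subject_term = toy.groupby(["Subject", "Term"])["Enrollment Cnt"].max()

# A tuple selects the value for one (Subject, Term) pair.
english_2017 = max_by_subject_term.loc[("English", "2017 Spring")]

# A single outer label returns the sub-series for that subject, indexed by Term.
english_all = max_by_subject_term.loc["English"]

# unstack() pivots the inner level (Term) into columns, giving a Subject-by-Term table.
wide = max_by_subject_term.unstack()

print(english_2017)
print(english_all)
print(wide)
```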

Challenge Four

Compute the size of the largest class ever taught by each instructor. This challenge is deliberately vague: you will have to decide what the resulting data structure looks like. Sort your result in decreasing order of class size.

In [9]:
enrollment_grouped_by_instructor = df["Enrollment Cnt"].groupby(df["Instructor"])
In [10]:
enrollment_grouped_by_instructor.max().sort_values(ascending=False)
Out[10]:
Instructor
Joshua Hug                                                                        1379
Joshua Hug; Joshua Hug                                                            1328
John DeNero                                                                       1210
Anindita Adhikari; Scott Lee                                                      1043
John Denero                                                                        981
Paul Hilfinger                                                                     965
Jonathan Shewchuk                                                                  944
Grace Zhang; Hannah Li; Laura Waller; Vladimir Stojanovic                          825
Babak Ayazifar; Satish Rao                                                         812
Babak Ayazifar; Vladimir Stojanovic                                                775
Satish Rao                                                                         766
Jean Walrand; Satish Rao                                                           707
Babak Ayazifar; Elad Alon; Reia Cho                                                698
Anca Dragan; Mohamad Baroudi; Sergey Levine                                        673
Anca Dragan; Daniel Klein; Pieter Abbeel                                           645
Arda Sahiner; Jaijeet Roychowdhury; Karina Chang; Michel Maharbiz; Nikunj Jain     631
Alessandro Chiesa; Umesh Vazirani                                                  629
Fernando Perez; Joseph Gonzalez                                                    610
Raluca Popa                                                                        602
Prasad Raghavendra; Sanjam Garg                                                    600
Pieter Abbeel                                                                      589
Anant Sahai                                                                        570
Christos Papadimitriou; Prasad Raghavendra                                         544
Umesh Vazirani                                                                     525
Gerald Friedland; Nicholas Weaver; Rebecca Herman; Stephan Liu                     515
Dylan Dreyer; John Wawrzynek; Nicholas Weaver; Peijie Li                           514
David Wagner; Raluca Popa                                                          507
Joseph Gonzalez; Joseph Hellerstein                                                496
Amir Kamil                                                                         479
Todd Green                                                                         472
                                                                                  ... 
Katherine O'Brien O'Keeffe                                                           6
Jasjeet Sekhon                                                                       6
Lisa Wu                                                                              6
Adam Wolisz; John Wawrzynek                                                          6
John Wawrzynek; Nicholas Weaver; Taehwan Kim                                         5
Joshua Cohen; Kinch Hoekstra                                                         5
David Goldschmidt; Philip Stark                                                      5
Margaret Kolb                                                                        5
Omur Ozel                                                                            4
Cameron Baradar; John Canny; Sam Kirschner                                           4
Joshua Cohen; Nicholas Kolodny                                                       4
Anthony Long                                                                         4
Gautam Premnath                                                                      4
Alan Hubbard; Mark Van Der Laan                                                      4
Ryan Perry                                                                           4
Moriel Vandsburger                                                                   4
Kannan Ramchandran; Sinho Chewi                                                      3
James Demmel; Ming Gu                                                                3
Christopher Hull                                                                     2
Mimi Koehl; Robert Full                                                              2
Andrew Packard; Francesco Borrelli                                                   2
Christopher Kutz; Joshua Cohen                                                       2
Christopher Kutz                                                                     1
Robert Dudley; Robert Full; Ronald Fearing                                           1
Michael Clancy                                                                       1
Carol Marshall; John DeNero                                                          1
Rajarshi Mukherjee                                                                   1
Robert Full                                                                          1
Katherine Yelick                                                                     1
Lotfi Zadeh                                                                          1
Name: Enrollment Cnt, Length: 745, dtype: int64
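
The solution above treats each distinct "Instructor" string as one group, so co-taught classes (e.g. "Babak Ayazifar; Satish Rao") are counted separately from classes taught by either person alone. If the goal is one row per individual, one possible approach is to split the string and give each instructor their own row before grouping. The sketch below assumes names are separated by "; " as in the output shown, and uses a made-up `toy` frame with hypothetical names. It does not reconcile spelling variants such as "John DeNero" and "John Denero".

```python
import pandas as pd

# Toy data (names and counts are hypothetical).
toy = pd.DataFrame({
    "Instructor": ["Ann Lee", "Ann Lee; Bo Chen", "Bo Chen", "Cy Diaz; Cy Diaz"],
    "Enrollment Cnt": [100, 300, 50, 80],
})

# Split each instructor string into a list, then explode so that every
# (class, instructor) pair gets its own row.
per_person = (
    toy.assign(Instructor=toy["Instructor"].str.split("; "))
       .explode("Instructor")
)

# Largest class per individual, sorted from largest to smallest.
largest = (
    per_person.groupby("Instructor")["Enrollment Cnt"]
    .max()
    .sort_values(ascending=False)
)

print(largest)
```

A repeated name within one class (like "Cy Diaz; Cy Diaz" here) simply yields duplicate rows, which does not affect the maximum.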