In this notebook we review some basic Python functionality for working with JSON data.
Python includes a json library which provides basic JSON functionality:
import json
In the following we create a Python list of dictionaries and then save that list as JSON:
data = [
  {
    "Prof": "Gonzalez",
    "Classes": [
      "CS186", 
      { "Name": "Data100", "Year": [2017,2018] }
    ],
    "Tenured": False
  },
  {
    "Prof": "Nolan",
    "Classes": [
      "Stat133", "Stat153", "Stat198", "Data100"
    ],
    "Tenured": True
  }
]
data
json_str = json.dumps(data, indent=2)
print(json_str)
with open("bla.json", "w") as f:
    json.dump(data, f, indent=2)
from utils import head
head("bla.json", lines=100)
obj = json.loads(json_str)
obj
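The round trip through `json.dumps` and `json.loads` is not always an exact identity, because JSON has fewer types than Python. The following is a minimal sketch of that mapping; the `sample` record, including its `Year` tuple and `Office` field, is made up for illustration and is not part of the data above.

```python
import json

# Hypothetical record: a tuple and a None value are included on purpose
# to show how Python types are translated into JSON types.
sample = [{"Prof": "Gonzalez", "Tenured": False, "Year": (2017, 2018), "Office": None}]

encoded = json.dumps(sample)   # Python objects -> JSON text
decoded = json.loads(encoded)  # JSON text -> Python objects

print(encoded)
print(decoded)
```

In the encoded text, `False` becomes `false` and `None` becomes `null`. The tuple is written as a JSON array, so it comes back as a list, which means `decoded` is not equal to `sample` even though no information was lost.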
with open("bla.json", "r") as f:
    obj = json.load(f)
    
obj
type(obj)
len(obj)
first_obj = obj[0]
first_obj.keys()
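Since the loaded object is just nested Python lists and dictionaries, nested values can be reached with ordinary indexing. The sketch below parses a one-record JSON string modeled on the first entry above; the `Office` key is a hypothetical field used only to show how `dict.get` handles a missing key.

```python
import json

# One-record JSON text modeled on the first entry of the data above.
text = (
    '[{"Prof": "Gonzalez", '
    '"Classes": ["CS186", {"Name": "Data100", "Year": [2017, 2018]}], '
    '"Tenured": false}]'
)

parsed = json.loads(text)
first = parsed[0]                   # list index -> dictionary
second_class = first["Classes"][1]  # dictionary key, then list index
years = second_class["Year"]        # nested dictionary key -> list

# "Office" is a hypothetical key that is not in the data;
# get() returns None instead of raising KeyError.
office = first.get("Office")

print(years, office)
```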
To analyze this data with pandas, we could build a DataFrame by constructing one field at a time:
import pandas as pd
df = pd.DataFrame()
df['Names'] = [p['Prof'] for p in obj]
df['Tenured'] = [p['Tenured'] for p in obj]
df
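Notice that things get tricky with irregular nesting. The Classes field holds lists of different lengths, and one entry is a dictionary rather than a string, so each cell of the Classes column ends up holding an entire list: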
import pandas as pd
df = pd.DataFrame()
df['Names'] = [p['Prof'] for p in obj]
df['Tenured'] = [p['Tenured'] for p in obj]
df['Classes'] = [p['Classes'] for p in obj]
df
pd.DataFrame(obj)
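One possible way to handle the irregular Classes field is to flatten it by hand into one row per professor and class before building the DataFrame. This is only a sketch under the assumption that every class entry is either a plain string or a dictionary with a `Name` key, as in the data above; the names `records`, `class_name`, and `classes_df` are introduced here for illustration, and the `Year` information is dropped.

```python
import pandas as pd

# Same structure as the data defined earlier in the notebook.
records = [
    {
        "Prof": "Gonzalez",
        "Classes": ["CS186", {"Name": "Data100", "Year": [2017, 2018]}],
        "Tenured": False,
    },
    {
        "Prof": "Nolan",
        "Classes": ["Stat133", "Stat153", "Stat198", "Data100"],
        "Tenured": True,
    },
]

def class_name(entry):
    """Return the class name for a plain string or a dict with a "Name" key."""
    return entry["Name"] if isinstance(entry, dict) else entry

# One row per (professor, class) pair.
rows = [
    {"Prof": p["Prof"], "Tenured": p["Tenured"], "Class": class_name(c)}
    for p in records
    for c in p["Classes"]
]

classes_df = pd.DataFrame(rows)
print(classes_df)
```

Each cell now holds a single scalar value, so the usual pandas operations such as filtering and grouping by `Class` or `Prof` apply directly.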