In this notebook we review some basic Python functionality for working with JSON data.
Python includes a json library, which provides basic JSON functionality:
import json
In the following we create a Python list of dictionaries and then save that list as JSON:
data = [
    {
        "Prof": "Gonzalez",
        "Classes": [
            "CS186",
            { "Name": "Data100", "Year": [2017, 2018] }
        ],
        "Tenured": False
    },
    {
        "Prof": "Nolan",
        "Classes": [
            "Stat133", "Stat153", "Stat198", "Data100"
        ],
        "Tenured": True
    }
]
data
json_str = json.dumps(data, indent=2)
print(json_str)
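The printed string uses JSON's literals rather than Python's: False becomes false, and None becomes null. A minimal sketch of that mapping, using a small hypothetical dictionary (`sample` and its "Office" field are not part of the notebook's data):

```python
import json

# Hypothetical record, used only to illustrate the value mapping.
sample = {"Tenured": False, "Office": None, "Years": [2017, 2018]}

# json.dumps returns a str; Python literals are rewritten as JSON literals.
encoded = json.dumps(sample)
print(encoded)
```

Without the indent argument the output is a single line, which is more compact but harder to read than the indented form above.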
with open("bla.json", "w") as f:
    json.dump(data, f, indent=2)
from utils import head
head("bla.json", lines=100)
obj = json.loads(json_str)
obj
with open("bla.json", "r") as f:
    obj = json.load(f)
obj
type(obj)
len(obj)
first_obj = obj[0]
first_obj.keys()
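The loaded object is made of ordinary Python lists and dictionaries, so for data like ours a round trip through JSON gives back an equal object. That is not guaranteed in general. The sketch below uses a hypothetical record (not the notebook's data) to show two cases where the round trip changes the value: tuples and non-string keys.

```python
import json

# Hypothetical record with a tuple value and an integer key.
record = {"Prof": "Gonzalez", "Tenured": False, "Years": (2017, 2018), 1: "one"}

restored = json.loads(json.dumps(record))

# JSON has only arrays, so the tuple comes back as a list.
print(type(restored["Years"]))

# JSON object keys are always strings, so the key 1 comes back as "1".
print(list(restored.keys()))
```

If exact types matter, they need to be restored by hand after loading.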
We could build a DataFrame from the loaded records by constructing one column at a time:
import pandas as pd
df = pd.DataFrame()
df['Names'] = [p['Prof'] for p in obj]
df['Tenured'] = [p['Tenured'] for p in obj]
df
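Notice that things get tricky with irregular nesting: the Classes field holds a list whose entries are sometimes strings and sometimes dictionaries, so each cell of that column ends up containing a whole list.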
import pandas as pd
df = pd.DataFrame()
df['Names'] = [p['Prof'] for p in obj]
df['Tenured'] = [p['Tenured'] for p in obj]
df['Classes'] = [p['Classes'] for p in obj]
df
pd.DataFrame(obj)
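One possible way to handle the irregular Classes field is to flatten it by hand into one row per professor and class. This is only a sketch under the assumption that a class entry is either a plain string or a dictionary with a "Name" key, as in the data above; the helper `class_name` and the frame `classes_df` are names introduced here for illustration.

```python
import pandas as pd

# Same shape as the notebook's data: "Classes" mixes strings and a dict.
obj = [
    {"Prof": "Gonzalez",
     "Classes": ["CS186", {"Name": "Data100", "Year": [2017, 2018]}],
     "Tenured": False},
    {"Prof": "Nolan",
     "Classes": ["Stat133", "Stat153", "Stat198", "Data100"],
     "Tenured": True},
]

def class_name(entry):
    """Return the class name whether the entry is a string or a dict."""
    return entry["Name"] if isinstance(entry, dict) else entry

# One output row per (professor, class) pair.
rows = [
    {"Prof": p["Prof"], "Tenured": p["Tenured"], "Class": class_name(c)}
    for p in obj
    for c in p["Classes"]
]

classes_df = pd.DataFrame(rows)
print(classes_df)
```

This drops the "Year" information attached to Data100; keeping it would need an extra column or a separate table.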