Before we look at SQL syntax in detail, let's first get ourselves set up to run SQL queries in Jupyter.
1. Load the sql
Module.
Load %%sql
cell magic.
%load_ext sql
The sql extension is already loaded. To reload it, use: %reload_ext sql
2. Connect to a database.
Here, we connect to the SQLite database basic_examples.db
.
%%sql
sqlite:///data/basic_examples.db
3. Run a simple SQL query.
Note the %%sql
lets Jupyter parse the rest of the lines as a SQL command.
%%sql
SELECT * FROM Dragon;
* sqlite:///data/basic_examples.db Done.
name | year | cute |
---|---|---|
hiccup | 2010 | 10 |
drogon | 2011 | -100 |
dragon 2 | 2019 | 0 |
puff | 2010 | 100 |
smaug | 2011 | None |
Simple query, this time on two different lines.
%%sql
SELECT *
FROM Dragon;
* sqlite:///data/basic_examples.db Done.
name | year | cute |
---|---|---|
hiccup | 2010 | 10 |
drogon | 2011 | -100 |
dragon 2 | 2019 | 0 |
puff | 2010 | 100 |
smaug | 2011 | None |
pd.read_sql
¶It turns out that pandas
has a special-purpose function to parse SQL queries. We can pass in a SQL query as a string to return a pandas
DataFrame. To achieve the same result as we did using cell magic above, we can do the following.
1. Connect to a database
import sqlalchemy
import pandas as pd
engine = sqlalchemy.create_engine("sqlite:///data/basic_examples.db")
connection = engine.connect()
2. Run a simple SQL query
query = """
SELECT *
FROM Dragon;
"""
pd.read_sql(query, engine)
name | year | cute | |
---|---|---|---|
0 | hiccup | 2010 | 10.0 |
1 | drogon | 2011 | -100.0 |
2 | dragon 2 | 2019 | 0.0 |
3 | puff | 2010 | 100.0 |
4 | smaug | 2011 | NaN |
A database contains a collection of SQL tables. Let's connect to our "toy" database basic_examples.db
and explore the tables it stores.
%%sql
sqlite:///data/basic_examples.db
%%sql
SELECT * FROM sqlite_master WHERE type='table'
* sqlite:///data/basic_examples.db Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 7 | CREATE TABLE sqlite_sequence(name,seq) |
table | Dragon | Dragon | 2 | CREATE TABLE Dragon ( name TEXT PRIMARY KEY, year INTEGER CHECK (year >= 2000), cute INTEGER ) |
table | Dish | Dish | 4 | CREATE TABLE Dish ( name TEXT PRIMARY KEY, type TEXT, cost INTEGER CHECK (cost >= 0) ) |
table | Scene | Scene | 6 | CREATE TABLE Scene ( id INTEGER PRIMARY KEY AUTOINCREMENT, biome TEXT NOT NULL, city TEXT NOT NULL, visitors INTEGER CHECK (visitors >= 0), created_at DATETIME DEFAULT (DATETIME('now')) ) |
Every SQL query must contain a SELECT
and FROM
clause.
SELECT
: specify the column(s) to return in the outputFROM
: specify the database table from which to extract data%%sql
SELECT * FROM Dragon;
* sqlite:///data/basic_examples.db Done.
name | year | cute |
---|---|---|
hiccup | 2010 | 10 |
drogon | 2011 | -100 |
dragon 2 | 2019 | 0 |
puff | 2010 | 100 |
smaug | 2011 | None |
%%sql
SELECT cute, year FROM Dragon;
* sqlite:///data/basic_examples.db Done.
cute | year |
---|---|
10 | 2010 |
-100 | 2011 |
0 | 2019 |
100 | 2010 |
None | 2011 |
Aliasing with AS
%%sql
SELECT cute AS cuteness,
year AS birth
FROM Dragon;
* sqlite:///data/basic_examples.db Done.
cuteness | birth |
---|---|
10 | 2010 |
-100 | 2011 |
0 | 2019 |
100 | 2010 |
None | 2011 |
Uniqueness with DISTINCT
%%sql
SELECT DISTINCT year
FROM Dragon;
* sqlite:///data/basic_examples.db Done.
year |
---|
2010 |
2011 |
2019 |
Filtering with WHERE
%%sql
SELECT name, year
FROM Dragon
WHERE cute > 0;
* sqlite:///data/basic_examples.db Done.
name | year |
---|---|
hiccup | 2010 |
puff | 2010 |
%%sql
SELECT name, cute, year
FROM Dragon
WHERE cute > 0 OR year > 2013;
* sqlite:///data/basic_examples.db Done.
name | cute | year |
---|---|---|
hiccup | 10 | 2010 |
dragon 2 | 0 | 2019 |
puff | 100 | 2010 |
%%sql
SELECT name, year
FROM Dragon
WHERE name IN ("puff", "hiccup");
* sqlite:///data/basic_examples.db Done.
name | year |
---|---|
hiccup | 2010 |
puff | 2010 |
%%sql
SELECT name, cute
FROM Dragon
WHERE cute IS NOT NULL;
* sqlite:///data/basic_examples.db Done.
name | cute |
---|---|
hiccup | 10 |
drogon | -100 |
dragon 2 | 0 |
puff | 100 |
Ordering data using ORDER BY
%%sql
SELECT *
FROM Dragon
ORDER BY cute DESC;
* sqlite:///data/basic_examples.db Done.
name | year | cute |
---|---|---|
puff | 2010 | 100 |
hiccup | 2010 | 10 |
dragon 2 | 2019 | 0 |
drogon | 2011 | -100 |
smaug | 2011 | None |
Restricting output with LIMIT
and OFFSET
%%sql
SELECT *
FROM Dragon
LIMIT 2;
* sqlite:///data/basic_examples.db Done.
name | year | cute |
---|---|---|
hiccup | 2010 | 10 |
drogon | 2011 | -100 |
%%sql
SELECT *
FROM Dragon
LIMIT 2
OFFSET 1;
* sqlite:///data/basic_examples.db Done.
name | year | cute |
---|---|---|
drogon | 2011 | -100 |
dragon 2 | 2019 | 0 |
GROUP BY
¶%%sql
SELECT *
FROM Dish;
* sqlite:///data/basic_examples.db Done.
name | type | cost |
---|---|---|
ravioli | entree | 10 |
ramen | entree | 13 |
taco | entree | 7 |
edamame | appetizer | 4 |
fries | appetizer | 4 |
potsticker | appetizer | 4 |
ice cream | dessert | 5 |
A small note: the fact that type
is highlighted in green below is a consequence of Jupyter assuming that we are writing Python code (where type
is a built-in keyword). type
does not have a special meaning in SQL, so the color below does not indicate any special functionality. When we run the cell, Jupyter realizes it should recognize the code as SQL.
%%sql
SELECT type
FROM Dish;
* sqlite:///data/basic_examples.db Done.
type |
---|
entree |
entree |
entree |
appetizer |
appetizer |
appetizer |
dessert |
%%sql
SELECT type
FROM Dish
GROUP BY type;
* sqlite:///data/basic_examples.db Done.
type |
---|
appetizer |
dessert |
entree |
%%sql
SELECT type, SUM(cost)
FROM Dish
GROUP BY type;
* sqlite:///data/basic_examples.db Done.
type | SUM(cost) |
---|---|
appetizer | 12 |
dessert | 5 |
entree | 30 |
%%sql
SELECT type,
SUM(cost),
MIN(cost),
MAX(name)
FROM Dish
GROUP BY type;
* sqlite:///data/basic_examples.db Done.
type | SUM(cost) | MIN(cost) | MAX(name) |
---|---|---|---|
appetizer | 12 | 4 | potsticker |
dessert | 5 | 5 | ice cream |
entree | 30 | 7 | taco |
%%sql
SELECT year, COUNT(cute)
FROM Dragon
GROUP BY year;
* sqlite:///data/basic_examples.db Done.
year | COUNT(cute) |
---|---|
2010 | 2 |
2011 | 1 |
2019 | 1 |
%%sql
SELECT year, COUNT(*)
FROM Dragon
GROUP BY year;
* sqlite:///data/basic_examples.db Done.
year | COUNT(*) |
---|---|
2010 | 2 |
2011 | 2 |
2019 | 1 |