In [1]:
import pandas as pd
In [2]:
import sqlalchemy
engine = sqlalchemy.create_engine("sqlite:///nautical.db")
conn = engine.connect()

For lecture I am also going to use a postgres database.

In [3]:
import sqlalchemy
engine = sqlalchemy.create_engine("postgres://jegonzal:@localhost:5432/data100")
conn = engine.connect()

Create Tables

In [4]:
script = """
DROP TABLE IF EXISTS sailors;
DROP TABLE IF EXISTS boats;
DROP TABLE IF EXISTS reserves;


CREATE TABLE sailors (
   sid INTEGER,   
   name CHAR(20), 
   PRIMARY KEY (sid));

CREATE TABLE boats (
   bid INTEGER,
   name CHAR (20), 
   PRIMARY KEY (bid));

 CREATE TABLE reserves (
   bid INTEGER, 
   sid INTEGER,      
   day DATE);

INSERT INTO sailors VALUES
(22, 'Alice'),
(31, 'Bob'),
(95, 'Nora');

INSERT INTO boats VALUES
(101, 'Titanic'),
(102, 'Beagle'),
(103, 'Bismarck'),
(104, 'Clipper');

INSERT INTO reserves VALUES 
(101, 22,  '10/10/96'),
(103, 95, '11/12/96'),
(41, 38,  '8/11/18');
"""
for line in script.split(";"):
    if len(line.strip()) > 0: conn.execute(line)

View Tables

In [5]:
pd.read_sql("""
    SELECT * FROM sailors
""", conn)
Out[5]:
sid name
0 22 Alice
1 31 Bob
2 95 Nora
In [6]:
boats = pd.read_sql("""
    SELECT * FROM boats
""", conn)
boats
Out[6]:
bid name
0 101 Titanic
1 102 Beagle
2 103 Bismarck
3 104 Clipper
In [7]:
reserves = pd.read_sql("""
    SELECT * FROM reserves
""", conn)
reserves
Out[7]:
bid sid day
0 101 22 1996-10-10
1 103 95 1996-11-12
2 41 38 2018-08-11

Inner Join

The normal join only keeps rows that match the join predicate.

Inner Join

Boats
bid name
0 101 Titanic
1 102 Beagle
2 103 Bismarck
3 104 Clipper
Reserves
bid sid day
0 101 22 1996-10-10
1 103 95 1996-11-12
2 41 38 2018-08-11
In [8]:
pd.read_sql("""
    SELECT * 
    FROM boats b, reserves r
    WHERE b.bid = r.bid
""", conn)
Out[8]:
bid name bid sid day
0 101 Titanic 101 22 1996-10-10
1 103 Bismarck 103 95 1996-11-12
In [9]:
pd.read_sql("""
    SELECT * 
    FROM boats b INNER JOIN reserves r
        ON b.bid = r.bid
""", conn)
Out[9]:
bid name bid sid day
0 101 Titanic 101 22 1996-10-10
1 103 Bismarck 103 95 1996-11-12

Left Join

Same as inner join but if there is a row on the left table that failed to join with a row on the right table it is also included with null values in place of all entries in the right table.

Left Join

Boats
bid name
0 101 Titanic
1 102 Beagle
2 103 Bismarck
3 104 Clipper
Reserves
bid sid day
0 101 22 1996-10-10
1 103 95 1996-11-12
2 41 38 2018-08-11
In [10]:
pd.read_sql("""
    SELECT * 
    FROM boats b LEFT JOIN reserves r
        ON b.bid = r.bid
""", conn, )
Out[10]:
bid name bid sid day
0 101 Titanic 101.0 22.0 1996-10-10
1 103 Bismarck 103.0 95.0 1996-11-12
2 104 Clipper NaN NaN None
3 102 Beagle NaN NaN None

Right Join

Just like left join but keeping values on the right side.

Right Join

SQLite Note Right joins are not support in SQLite but you can use a LEFT JOIN instead.
Boats
bid name
0 101 Titanic
1 102 Beagle
2 103 Bismarck
3 104 Clipper
Reserves
bid sid day
0 101 22 1996-10-10
1 103 95 1996-11-12
2 41 38 2018-08-11
In [11]:
pd.read_sql("""
    SELECT * 
    FROM boats b RIGHT JOIN reserves r
        ON b.bid = r.bid
""", conn)
Out[11]:
bid name bid sid day
0 101.0 Titanic 101 22 1996-10-10
1 103.0 Bismarck 103 95 1996-11-12
2 NaN None 41 38 2018-08-11
In [12]:
pd.read_sql("""
    SELECT * 
    FROM reserves r LEFT JOIN boats b 
        ON b.bid = r.bid
""", conn)
Out[12]:
bid sid day bid name
0 101 22 1996-10-10 101.0 Titanic
1 103 95 1996-11-12 103.0 Bismarck
2 41 38 2018-08-11 NaN None

Full Join

The outer ensures that at least one row from both tables appears in the final result even if there are no matches by adding null values.

Full Join

SQLite Note Full outer joins are not support in sqlite.

Boats
bid name
0 101 Titanic
1 102 Beagle
2 103 Bismarck
3 104 Clipper
Reserves
bid sid day
0 101 22 1996-10-10
1 103 95 1996-11-12
2 41 38 2018-08-11
In [13]:
pd.read_sql("""
    SELECT *
    FROM boats b FULL JOIN reserves r
        ON b.bid = r.bid
""", conn)
Out[13]:
bid name bid sid day
0 101.0 Titanic 101.0 22.0 1996-10-10
1 103.0 Bismarck 103.0 95.0 1996-11-12
2 NaN None 41.0 38.0 2018-08-11
3 104.0 Clipper NaN NaN None
4 102.0 Beagle NaN NaN None