Matplotlib: Beyond the basics

Hopefully after this notebook you will:

  • Know how to polish matplotlib figures to the point where they can go to a journal.
  • Understand matplotlib's internal model enough to:
    • know where to look for knobs to fine-tune
    • better understand the help and examples online
    • use it as a development platform for complex visualization

Resources

  • A detailed tutorial by Nicolas Rougier, similar in style to the ones we saw for NumPy.
  • The fantastic Python Graph Gallery, which provides a large collection of plots with emphasis on statistical visualizations. It uses Seaborn extensively.
  • In this tutorial we'll focus on "raw" matplotlib, but for a wide variety of statistical visualization tasks, using Seaborn makes life much easier. We'll dive into its tutorial later on.

Matplotlib's main APIs: pyplot and object-oriented

Matplotlib can be used in two main ways:

  • via pyplot calls, as a high-level, MATLAB-like library that automatically manages details like figure creation.

  • via its underlying object-oriented structure, which offers full control over all aspects of the figure, at the cost of slightly more verbose calls for the common case.

The pyplot API:

  • Easiest to use.
  • Sufficient for simple and moderately complex plots.
  • Does not offer complete control over all details.

Before we look at our first simple example, we must activate matplotlib support in the notebook:

In [1]:
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np
# a few widely used tools from numpy
from numpy import sin, cos, exp, sqrt, pi, linspace, arange

Change some of matplotlib's plotting defaults for better class presentation and set a nice Seaborn style.

In [2]:
# Note: on matplotlib >= 3.6 this style is named 'seaborn-v0_8-white'
plt.style.use('seaborn-white')
plt.rc('figure', dpi=100, figsize=(7, 5))
plt.rc('font', size=12)

In [3]:
x = linspace(0, 2 * pi)
y = sin(x)
plt.plot(x, y, label='sin(x)')
plt.legend()
plt.title('Harmonic')
plt.xlabel('x')
plt.ylabel('y')

# Add one line to that plot
z = cos(x)
plt.plot(x, z, label='cos(x)')

# Make a second figure with a simple plot
plt.figure()
plt.plot(x, sin(2*x), label='sin(2x)')
plt.legend();

Here is how to create the same two plots, using explicit management of the figure and axis objects:

In [4]:
f, ax = plt.subplots()  # we manually make a figure and axis
ax.plot(x, y, label='sin(x)')  # it's the axis object that plots
ax.legend()
ax.set_title('Harmonic')  # we set the title on the axis
ax.set_xlabel('x')  # same with labels
ax.set_ylabel('y')

# Make a second figure with a simple plot.  We can give this figure and its
# axes their own variable names, and then control each one independently
f1, ax1 = plt.subplots()
ax1.plot(x, sin(2*x), label='sin(2x)')
ax1.legend()

# Since we now have variables for each axis, we can add back to the first
# figure even after making the second
ax.plot(x, z, label='cos(x)');

It's important to understand that these objects exist, even if you mostly use the top-level pyplot calls. Many things can be accomplished in matplotlib with mostly pyplot and a little bit of tweaking of the underlying objects. We'll revisit the object-oriented API later.

Two important pyplot commands to know about, which matplotlib itself uses a lot internally:

gcf()  # get current figure
gca()  # get current axis
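
To make the idea of a "current" figure and axis concrete, here is a minimal sketch of how gcf() and gca() track the objects we create. The variable names (same_fig, same_ax, moved) are ours, chosen only for illustration:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()         # this figure/axes pair is now "current"
same_fig = plt.gcf() is fig      # gcf() hands back the figure just created
same_ax = plt.gca() is ax        # gca() hands back its axes

fig2, ax2 = plt.subplots()       # a newly created figure becomes current
moved = plt.gcf() is fig2

plt.figure(fig.number)           # make the first figure current again
plt.plot([1, 2, 3])              # pyplot draws on the current axes
print(same_fig, same_ax, moved, len(ax.lines), len(ax2.lines))
```

The last plt.plot call lands in the first axes because pyplot always targets whatever gca() returns at that moment.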

Making subplots

The simplest command is:

f, ax = plt.subplots()

which is equivalent to:

f = plt.figure()
ax = f.add_subplot(111)

By passing arguments to subplots, you can easily create a regular plot grid:

In [5]:
x = linspace(0, 2*pi, 400)
y = sin(x**2)

# Just a figure and one subplot
f, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('Simple plot')

# Two subplots, unpack the output array immediately
f, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(x, y)
ax1.set_title('lines')
ax2.scatter(x, y)
ax2.set_title('dots')

# Put a figure-level title
f.suptitle('Two plots')

# Ask matplotlib to auto-adjust whitespace surrounding axes
plt.tight_layout()
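
For larger grids, subplots returns the axes as a NumPy array instead of a tuple we unpack by hand. A short sketch, assuming a 2x2 grid with shared x and y axes (the sine curves are just filler data):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)

# A 2x2 grid: `axs` is a 2-D NumPy array of axes, indexed [row, col]
fig, axs = plt.subplots(2, 2, sharex=True, sharey=True)

# axs.flat walks the grid row by row
for k, ax in enumerate(axs.flat, start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title(f'sin({k}x)')

axs[1, 0].set_xlabel('x')   # address a single panel by its position
fig.tight_layout()
```

Sharing axes keeps the panels directly comparable and lets matplotlib hide the redundant inner tick labels.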

And finally, an arbitrarily complex grid can be made with subplot2grid:

In [6]:
f = plt.figure()
ax1 = plt.subplot2grid((3,3), (0,0), colspan=3)
ax2 = plt.subplot2grid((3,3), (1,0), colspan=2)
ax3 = plt.subplot2grid((3,3), (1, 2), rowspan=2)
ax4 = plt.subplot2grid((3,3), (2, 0))
ax5 = plt.subplot2grid((3,3), (2, 1))

# Let's turn off visibility of all tick labels here
for ax in f.axes:
   for t in ax.get_xticklabels()+ax.get_yticklabels():
       t.set_visible(False)

# And add a figure-level title at the top
f.suptitle('Subplot2grid')

# Plot something at the bottom right
ax3.plot([1, 2, 3]);

Manipulating properties across matplotlib

In matplotlib, most properties of lines (style, color, width, etc.) can be set directly in the plotting call:

In [7]:
plt.plot([1,2,3], linestyle='--', color='r')
Out[7]:
[<matplotlib.lines.Line2D at 0x11979dc18>]

But for finer control you can get a hold of the returned line object (more on these objects later):

In [1]: line, = plt.plot([1,2,3])

These line objects have many properties you can control. Tab-completing in IPython shows the full list of setters:

In [2]: line.set
line.set                     line.set_drawstyle           line.set_mec
line.set_aa                  line.set_figure              line.set_mew
line.set_agg_filter          line.set_fillstyle           line.set_mfc
line.set_alpha               line.set_gid                 line.set_mfcalt
line.set_animated            line.set_label               line.set_ms
line.set_antialiased         line.set_linestyle           line.set_picker
line.set_axes                line.set_linewidth           line.set_pickradius
line.set_c                   line.set_lod                 line.set_rasterized
line.set_clip_box            line.set_ls                  line.set_snap
line.set_clip_on             line.set_lw                  line.set_solid_capstyle
line.set_clip_path           line.set_marker              line.set_solid_joinstyle
line.set_color               line.set_markeredgecolor     line.set_transform
line.set_contains            line.set_markeredgewidth     line.set_url
line.set_dash_capstyle       line.set_markerfacecolor     line.set_visible
line.set_dashes              line.set_markerfacecoloralt  line.set_xdata
line.set_dash_joinstyle      line.set_markersize          line.set_ydata
line.set_data                line.set_markevery           line.set_zorder


The setp call (short for "set property") can also be very useful, especially while working interactively, because it has introspection support, so you can learn about the valid calls as you work:

In [7]: line, = plt.plot([1,2,3])

In [8]: plt.setp(line, 'linestyle')
  linestyle: [ ``'-'`` | ``'--'`` | ``'-.'`` | ``':'`` | ``'None'`` | ``' '`` | ``''`` ] and any drawstyle in combination with a linestyle, e.g. ``'steps--'``.

In [9]: plt.setp(line)
  agg_filter: unknown
  alpha: float (0.0 transparent through 1.0 opaque)         
  animated: [True | False]         
  antialiased or aa: [True | False]
  ...
  ... much more output elided
  ...

In the first form, it shows you the valid values for the 'linestyle' property, and in the second it shows you all the acceptable properties you can set on the line object. This makes it very easy to discover how to customize your figures to get the visual results you need.
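
setp has a reading counterpart, getp, which returns the current value of a property. A minimal sketch, assuming we only care about the line style and width (the values used are arbitrary):

```python
import matplotlib.pyplot as plt

line, = plt.plot([1, 2, 3], linestyle='--', linewidth=3)

style = plt.getp(line, 'linestyle')   # current value of one property
width = plt.getp(line, 'linewidth')
print(style, width)

plt.setp(line, linewidth=1)           # change it with setp ...
new_width = plt.getp(line, 'linewidth')
print(new_width)                      # ... and read the new value back
```

Used together, the two calls give a quick way to inspect an object before and after tweaking it.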

Furthermore, setp can manipulate multiple objects at a time:

In [8]:
x = linspace(0, 2*pi)
y1 = sin(x)
y2 = sin(2*x)
lines = plt.plot(x, y1, x, y2)

# We will set the width and color of all lines in the figure at once:
plt.setp(lines, linewidth=4, color='b')
Out[8]:
[None, None, None, None]
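
The same call works on any list of matplotlib objects, not just lines. A small sketch, assuming we want to restyle all x tick labels at once (the rotation and color are arbitrary choices for illustration):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3])

# setp accepts any list of artists, here the Text objects of the x tick labels
labels = ax.get_xticklabels()
plt.setp(labels, rotation=45, color='gray')
```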

Finally, if you know what properties you want to set on a specific object, a plain set call is typically the simplest form:

In [9]:
line, = plt.plot([1,2,3])
line.set(lw=2, c='red', ls='--')
Out[9]:
[None, None, None]

Understanding what matplotlib returns: lines, axes and figures

Lines

In a simple plot:

In [10]:
plt.plot([1,2,3])
Out[10]:
[<matplotlib.lines.Line2D at 0x119ca9e80>]

The return value of the plot call is a list of lines, which can be manipulated further. If you capture the line object (here there is a single line, so we unpack it with a one-element tuple), you can then modify it directly:

In [11]:
line, = plt.plot([1,2,3])
line.set_color('r')

One line method that is particularly useful to know about is set_data:

In [12]:
# Create a plot and hold the line object
line, = plt.plot([1,2,3], label='my data')
plt.grid()
plt.title('My title')

# ... later, we may want to modify the x/y data but keeping the rest of the
# figure intact, with our new data:
x = linspace(0, 1)
y = x**2

# This can be done by operating on the line object itself
line.set_data(x, y)

# Now we must set the axis limits manually. Note that we can also use xlim
# and ylim to set the x/y limits separately.
plt.axis([0,1,0,1])

# Note, alternatively this can be done with:
ax = plt.gca()  # get currently active axis object
ax.relim()
ax.autoscale_view()
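
To see that relim and autoscale_view really do update the limits after set_data, here is a self-contained sketch using the object-oriented calls. It assumes matplotlib's default autoscale margins; the printed limits will differ with other settings:

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([1, 2, 3])    # initial data: x in [0, 2], y in [1, 3]

x = np.linspace(0, 1)
line.set_data(x, x**2)        # swap the data; the view still shows the old range

ax.relim()                    # recompute the data limits from the artists in ax
ax.autoscale_view()           # apply them to the view, including margins
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
print(xmin, xmax, ymin, ymax)
```

This is handy when the new data range is not known in advance, for example when updating a plot inside a loop.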

The next important component, axes

The axis call above was used to set the x/y limits of the axis, and in previous examples we called .plot directly on axis objects. Axes are the main objects that expose most of the user-facing functionality of matplotlib:

In [16]: fig, ax = plt.subplots()

In [17]: ax.
Display all 299 possibilities? (y or n)
ax.acorr                                 ax.hitlist
ax.add_artist                            ax.hlines
ax.add_callback                          ax.hold
ax.add_collection                        ax.ignore_existing_data_limits
ax.add_line                              ax.images
ax.add_patch                             ax.imshow

... etc.

Many of the commands in plt.<command> are nothing but wrappers around axis calls, with machinery to automatically create a figure and add an axis to it if there wasn't one to begin with. The output of most axis actions that draw something is a collection of lines (or other more complex geometric objects).
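
As a quick check of what different drawing calls return, the following sketch prints the type of each result. The three plot kinds are just examples we picked; other methods return other object types:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

lines = ax.plot([1, 2, 3])                   # a list of Line2D objects
points = ax.scatter([1, 2, 3], [3, 2, 1])    # a single collection object
bars = ax.bar([1, 2, 3], [1, 2, 1])          # a container of rectangle patches

for name, obj in [('plot', lines), ('scatter', points), ('bar', bars)]:
    print(name, '->', type(obj).__name__)
```

Keeping a reference to these return values is what lets us restyle or update the drawn objects later.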

Enclosing it all, the figure

The enclosing object is the figure, which holds all axes:

In [12]: fig, ax = plt.subplots(2, 1)

In [13]: ax.shape
Out[13]: (2,)

In [14]: fig.axes
Out[14]: 
[<matplotlib.axes._subplots.AxesSubplot at 0x117c2c048>,
 <matplotlib.axes._subplots.AxesSubplot at 0x115739f98>]


The basic view of matplotlib is: a figure contains one or more axes, axes draw and return collections of one or more geometric objects (lines, patches, etc).
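
This containment can be explored directly from Python. A minimal sketch that walks from the figure down to the objects each axis has drawn (the plotted data is arbitrary filler):

```python
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot([1, 2, 3], label='a')
ax1.plot([3, 2, 1], label='b')
ax2.bar([1, 2], [3, 4])

# Walk the hierarchy: figure -> axes -> drawn objects
for i, ax in enumerate(fig.axes):
    print(f'axes {i}: {len(ax.lines)} lines, {len(ax.patches)} patches')

# The links also go upwards: a line knows its axes, an axes knows its figure
first_line = ax1.lines[0]
print(first_line.axes is ax1, ax1.figure is fig)
```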

For all the gory details on this topic, see the matplotlib artist tutorial, or the chapter about matplotlib by its original author, John Hunter, and core developer Michael Droettboom, in "The Architecture of Open Source Applications", which contains useful diagrams of this structure.

Arbitrary text and LaTeX support

In matplotlib, text can be added either relative to an individual axis object or to the whole figure.

These commands add text to the Axes:

  • title() - add a title
  • xlabel() - add an axis label to the x-axis
  • ylabel() - add an axis label to the y-axis
  • text() - add text at an arbitrary location
  • annotate() - add an annotation, with optional arrow

And these act on the whole figure:

  • figtext() - add text at an arbitrary location
  • suptitle() - add a title

Any text field can contain LaTeX expressions for mathematics, as long as they are enclosed in $ signs.

This example illustrates all of them:

In [13]:
fig = plt.figure()
fig.suptitle('bold figure suptitle', fontsize=14, fontweight='bold')

ax = fig.add_subplot(111)
fig.subplots_adjust(top=0.85)
ax.set_title('axes title')

ax.set_xlabel('xlabel')
ax.set_ylabel('ylabel')

ax.text(3, 8, 'boxed italics text in data coords', style='italic',
        bbox={'facecolor':'red', 'alpha':0.5, 'pad':10})

ax.text(2, 6, 'an equation: $E=mc^2$', fontsize=15)

ax.text(3, 2, 'unicode: Institut für Festkörperphysik')

ax.text(0.95, 0.01, 'colored text in axes coords',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='green', fontsize=15)


ax.plot([2], [1], 'o')
ax.annotate('annotate', xy=(2, 1), xytext=(3, 4),
            arrowprops=dict(facecolor='black', shrink=0.05))

ax.axis([0, 10, 0, 10]);
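
One practical point when writing LaTeX in labels: most LaTeX commands start with a backslash, which Python would otherwise treat as an escape character, so raw strings are the safer choice. A short sketch, with a damped sine curve chosen purely as filler data:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()

# The r'' prefix keeps backslashes intact; without it, the \t at the start
# of \theta would be read by Python as a tab character
ax.plot(x, np.exp(-x / 3) * np.sin(2 * x), label=r'$e^{-x/3}\sin(2x)$')
ax.set_xlabel(r'$\theta$ (rad)')
ax.set_title(r'Damped oscillation, $\alpha = 1/3$')
ax.legend()
```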