Hopefully after this notebook you will:
pyplot and object-oriented
Matplotlib is a library that can be thought of as having two main ways of being used:
via pyplot calls, as a high-level, MATLAB-like library that automatically manages details like figure creation.
via its internal object-oriented structure, which offers full control over all aspects of the figure, at the cost of slightly more verbose calls for the common case.
The pyplot API:
Before we look at our first simple example, we must activate matplotlib support in the notebook:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
# a few widely used tools from numpy
from numpy import sin, cos, exp, sqrt, pi, linspace, arange
x = linspace(0, 2 * pi)
y = sin(x)
plt.plot(x, y, label='sin(x)')
plt.legend()
plt.title('Harmonic')
plt.xlabel('x')
plt.ylabel('y')
# Add one line to that plot
z = cos(x)
plt.plot(x, z, label='cos(x)')
# Make a second figure with a simple plot
plt.figure()
plt.plot(x, sin(2*x), label='sin(2x)')
plt.legend();
Here is how to create the same two plots, using explicit management of the figure and axis objects:
f, ax = plt.subplots() # we manually make a figure and axis
ax.plot(x, y, label='sin(x)') # the axes object does the plotting
ax.legend()
ax.set_title('Harmonic') # we set the title on the axis
ax.set_xlabel('x') # same with labels
ax.set_ylabel('y')
# Make a second figure with a simple plot. We can name the figure with a
# different variable name as well as its axes, and then control each
f1, ax1 = plt.subplots()
ax1.plot(x, sin(2*x), label='sin(2x)')
ax1.legend()
# Since we now have variables for each axis, we can add back to the first
# figure even after making the second
ax.plot(x, z, label='cos(x)');
It’s important to understand that these objects exist, even if you mostly use the top-level pyplot calls. Many things can be accomplished in matplotlib with mostly pyplot and a little tweaking of the underlying objects. We’ll revisit the object-oriented API later.
Important commands to know about, and which matplotlib uses internally a lot:
gcf() # get current figure
gca() # get current axis
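As a small sketch of how these two calls relate to explicitly created objects (the variable names are illustrative only), the current figure and axes are simply the ones most recently created or activated:

```python
import matplotlib.pyplot as plt

# Create a figure and axes explicitly; they become the "current" ones
fig, ax = plt.subplots()

# pyplot keeps track of the current figure and axes behind the scenes
same_fig = plt.gcf()
same_ax = plt.gca()

# plt.plot draws on the current axes, so the new line appears in ax.lines
plt.plot([1, 2, 3])
n_lines = len(ax.lines)
```

Here same_fig and same_ax are the very objects returned by plt.subplots, which is why pyplot calls and object-oriented calls can be mixed freely.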
The simplest command is:
f, ax = plt.subplots()
which is equivalent to:
f = plt.figure()
ax = f.add_subplot(111)
By passing arguments to subplots, you can easily create a regular plot grid:
x = linspace(0, 2*pi, 400)
y = sin(x**2)
# Just a figure and one subplot
f, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('Simple plot')
# Two subplots, unpack the output array immediately
f, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(x, y)
ax2.scatter(x, y)
# Put a figure-level title
f.suptitle('Two plots');
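For grids with more than one row and column, subplots returns a 2D NumPy array of axes. The following is a minimal sketch; the sharex and sharey options and the choice of curves are illustrative additions, not part of the example above:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 400)

# A 2x2 grid: axs is a NumPy array of axes with shape (2, 2)
fig, axs = plt.subplots(2, 2, sharex=True, sharey=True)

# Index the array as axs[row, col], or loop over every axes with .flat
for k, ax in enumerate(axs.flat, start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title(f'sin({k}x)')

fig.suptitle('Four plots on a shared grid')
```

Sharing the axes keeps the tick ranges consistent across panels, which tends to make small multiples easier to compare.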
And finally, an arbitrarily complex grid can be made with subplot2grid:
f = plt.figure()
ax1 = plt.subplot2grid((3,3), (0,0), colspan=3)
ax2 = plt.subplot2grid((3,3), (1,0), colspan=2)
ax3 = plt.subplot2grid((3,3), (1, 2), rowspan=2)
ax4 = plt.subplot2grid((3,3), (2, 0))
ax5 = plt.subplot2grid((3,3), (2, 1))
# Let's turn off visibility of all tick labels here
for ax in f.axes:
    for t in ax.get_xticklabels() + ax.get_yticklabels():
        t.set_visible(False)
# And add a figure-level title at the top
f.suptitle('Subplot2grid')
# Plot something at the bottom right
ax3.plot([1, 2, 3])
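The same layout can probably also be expressed with a GridSpec and slice notation. This is an alternative sketch rather than the approach used above, and it assumes a Matplotlib version that provides Figure.add_gridspec (3.0 or later):

```python
import matplotlib.pyplot as plt

fig = plt.figure()
gs = fig.add_gridspec(3, 3)  # a 3x3 grid attached to this figure

ax1 = fig.add_subplot(gs[0, :])    # top row, all three columns
ax2 = fig.add_subplot(gs[1, :2])   # middle row, first two columns
ax3 = fig.add_subplot(gs[1:, 2])   # last column, bottom two rows
ax4 = fig.add_subplot(gs[2, 0])    # bottom-left cell
ax5 = fig.add_subplot(gs[2, 1])    # bottom-middle cell

ax3.plot([1, 2, 3])
fig.suptitle('GridSpec')
```

Slices play the role of the colspan and rowspan arguments of subplot2grid.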
In matplotlib, most properties for lines, colors, etc, can be set directly in the call:
plt.plot([1,2,3], linestyle='--', color='r')
But for finer control you can get a hold of the returned line object (more on these objects later):
In [1]: line, = plot([1,2,3])
These line objects have a lot of properties you can control, a full list is seen here by tab-completing in IPython:
In [2]: line.set
line.set line.set_drawstyle line.set_mec
line.set_aa line.set_figure line.set_mew
line.set_agg_filter line.set_fillstyle line.set_mfc
line.set_alpha line.set_gid line.set_mfcalt
line.set_animated line.set_label line.set_ms
line.set_antialiased line.set_linestyle line.set_picker
line.set_axes line.set_linewidth line.set_pickradius
line.set_c line.set_lod line.set_rasterized
line.set_clip_box line.set_ls line.set_snap
line.set_clip_on line.set_lw line.set_solid_capstyle
line.set_clip_path line.set_marker line.set_solid_joinstyle
line.set_color line.set_markeredgecolor line.set_transform
line.set_contains line.set_markeredgewidth line.set_url
line.set_dash_capstyle line.set_markerfacecolor line.set_visible
line.set_dashes line.set_markerfacecoloralt line.set_xdata
line.set_dash_joinstyle line.set_markersize line.set_ydata
line.set_data line.set_markevery line.set_zorder
But the setp call (short for set property) can be very useful, especially while working interactively, because it contains introspection support, so you can learn about the valid calls as you work:
In [7]: line, = plot([1,2,3])
In [8]: setp(line, 'linestyle')
linestyle: [ ``'-'`` | ``'--'`` | ``'-.'`` | ``':'`` | ``'None'`` | ``' '`` | ``''`` ] and any drawstyle in combination with a linestyle, e.g. ``'steps--'``.
In [9]: setp(line)
agg_filter: unknown
alpha: float (0.0 transparent through 1.0 opaque)
animated: [True | False]
antialiased or aa: [True | False]
...
... much more output elided
...
In the first form, it shows you the valid values for the 'linestyle' property, and in the second it shows you all the acceptable properties you can set on the line object. This makes it very easy to discover how to customize your figures to get the visual results you need.
Furthermore, setp can manipulate multiple objects at a time:
x = linspace(0, 2*pi)
y1 = sin(x)
y2 = sin(2*x)
lines = plt.plot(x, y1, x, y2)
# We will set the width and color of all lines in the figure at once:
plt.setp(lines, linewidth=2, color='r')
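To check the effect of a setp call, the matching getp function reads a property back. A minimal sketch, with illustrative variable names:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([1, 2, 3])

# Read the current value of a property
before = plt.getp(line, 'linestyle')

# Change two properties at once, then read them back
plt.setp(line, linestyle='--', linewidth=2)
after = plt.getp(line, 'linestyle')
width = line.get_linewidth()
```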
Finally, if you know what properties you want to set on a specific object, a plain set call is typically the simplest form:
line, = plt.plot([1,2,3])
line.set(lw=2, c='red',ls='--')
plt.plot([1,2,3])
The return value of the plot call is a list of lines, which can be manipulated further. If you capture the line object (in this case it's a single line, so we unpack the one-element list with a trailing comma), you can modify it afterwards:
line, = plt.plot([1,2,3])
line.set_color('r')
One line method that is particularly useful to be aware of is set_data:
# Create a plot and hold the line object
line, = plt.plot([1,2,3], label='my data')
plt.grid()
plt.title('My title')
# ... later, we may want to modify the x/y data but keeping the rest of the
# figure intact, with our new data:
x = linspace(0, 1)
y = x**2
# This can be done by operating on the line object itself
line.set_data(x, y)
# Now we must set the axis limits manually. Note that we can also use xlim
# and ylim to set the x/y limits separately.
plt.axis([0,1,0,1])
# Note, alternatively this can be done with:
ax = plt.gca() # get currently active axis object
ax.relim()
ax.autoscale_view()
# as well as requesting matplotlib to draw
plt.draw()
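A compact sketch of the same idea with explicit axes shows that the view limits only follow the new data once they are recomputed (variable names are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([1, 2, 3])
old_xlim = ax.get_xlim()   # limits fitted to the original data

# Replace the data of the existing line, keeping the rest of the figure
x = np.linspace(0, 1)
line.set_data(x, x**2)

# Recompute the data limits from the artists, then rescale the view
ax.relim()
ax.autoscale_view()
new_xlim = ax.get_xlim()
```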
The axis call above was used to set the x/y limits of the axis. And in previous examples we called .plot directly on axis objects. Axes are the main object that contains a lot of the user-facing functionality of matplotlib:
In [15]: f = plt.figure()
In [16]: ax = f.add_subplot(111)
In [17]: ax.
Display all 299 possibilities? (y or n)
ax.acorr ax.hitlist
ax.add_artist ax.hlines
ax.add_callback ax.hold
ax.add_collection ax.ignore_existing_data_limits
ax.add_line ax.images
ax.add_patch ax.imshow
... etc.
Many of the commands in plt.<command> are nothing but wrappers around axis calls, with machinery to automatically create a figure and add an axis to it if there wasn't one to begin with. The output of most axis actions that draw something is a collection of lines (or other more complex geometric objects).
The enclosing object is the figure, which holds all axes:
In [17]: f = plt.figure()
In [18]: f.add_subplot(211)
Out[18]: <matplotlib.axes.AxesSubplot object at 0x9d0060c>
In [19]: f.axes
Out[19]: [<matplotlib.axes.AxesSubplot object at 0x9d0060c>]
In [20]: f.add_subplot(212)
Out[20]: <matplotlib.axes.AxesSubplot object at 0x9eacf0c>
In [21]: f.axes
Out[21]:
[<matplotlib.axes.AxesSubplot object at 0x9d0060c>,
<matplotlib.axes.AxesSubplot object at 0x9eacf0c>]
The basic view of matplotlib is: a figure contains one or more axes, axes draw and return collections of one or more geometric objects (lines, patches, etc).
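This containment can be checked directly. The sketch below uses illustrative variable names and only inspects the relationships just described:

```python
import matplotlib.pyplot as plt

fig, (ax_top, ax_bottom) = plt.subplots(2, 1)

# Plotting returns a list of line objects that now belong to the axes
lines = ax_top.plot([1, 2, 3], [1, 4, 9])
line = lines[0]

owns_axes = fig.axes == [ax_top, ax_bottom]   # the figure holds both axes
line_in_axes = line in ax_top.lines           # the axes holds the line
back_to_axes = line.axes is ax_top            # the line knows its axes
back_to_fig = ax_top.figure is fig            # the axes knows its figure
```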
For all the gory details on this topic, see the matplotlib artist tutorial.
<img src="http://www.aosabook.org/images/matplotlib/artists_figure.png" width="75%">
<img src="http://www.aosabook.org/images/matplotlib/artists_tree.png" width="75%">
Let's make a simple plot that contains a few commonly used decorations:
f, ax = plt.subplots()
# Three simple polynomials
x = linspace(-1, 1)
y1,y2,y3 = [x**i for i in [1,2,3]]
# Plot each with a label (for a legend)
ax.plot(x, y1, label='linear')
ax.plot(x, y2, label='quadratic')
ax.plot(x, y3, label='cubic')
# Make all lines drawn so far thicker
plt.setp(ax.lines, linewidth=2)
# Add a grid and a legend that doesn't overlap the lines
ax.grid(True)
ax.legend(loc='lower right')
# Add black horizontal and vertical lines through the origin
ax.axhline(0, color='black')
ax.axvline(0, color='black')
# Set main text elements of the plot
ax.set_title('Some polynomials')
ax.set_xlabel('x')
ax.set_ylabel('p(x)');
# example data
x = arange(0.1, 4, 0.5)
y = exp(-x)
# example variable error bar values
yerr = 0.1 + 0.2*sqrt(x)
xerr = 0.1 + yerr
# First illustrate basic pyplot interface, using defaults where possible.
plt.figure()
plt.errorbar(x, y, xerr=0.2, yerr=0.4)
plt.title("Simplest errorbars, 0.2 in x, 0.4 in y")
Now a more elaborate one, using the OO interface to exercise more features.
# same data/errors as before
x = arange(0.1, 4, 0.5)
y = exp(-x)
yerr = 0.1 + 0.2*sqrt(x)
xerr = 0.1 + yerr
fig, axs = plt.subplots(nrows=2, ncols=2)
ax = axs[0,0]
ax.errorbar(x, y, yerr=yerr, fmt='o')
ax.set_title('Vert. symmetric')
# With 4 subplots, reduce the number of axis ticks to avoid crowding.
ax.locator_params(nbins=4)
ax = axs[0,1]
ax.errorbar(x, y, xerr=xerr, fmt='o')
ax.set_title('Hor. symmetric')
ax = axs[1,0]
ax.errorbar(x, y, yerr=[yerr, 2*yerr], xerr=[xerr, 2*xerr], fmt='--o', label='foo')
ax.legend()
ax.set_title('H, V asymmetric')
ax = axs[1,1]
ax.set_yscale('log')
# Here we have to be careful to keep all y values positive:
ylower = np.maximum(1e-2, y - yerr)
yerr_lower = y - ylower
ax.errorbar(x, y, yerr=[yerr_lower, 2*yerr], xerr=xerr,
fmt='o', ecolor='g')
ax.set_title('Mixed sym., log y')
# Fix layout to minimize overlap between titles and marks
# https://matplotlib.org/users/tight_layout_guide.html
plt.tight_layout()
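Asymmetric errors are passed as a pair of sequences, lower first and upper second, and errorbar returns a container holding the drawn artists. A minimal sketch, where the lower and upper error values are hypothetical numbers chosen for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0.1, 4, 0.5)
y = np.exp(-x)
lower = 0.1 * y   # hypothetical lower errors
upper = 0.3 * y   # hypothetical upper errors

fig, ax = plt.subplots()
container = ax.errorbar(x, y, yerr=[lower, upper], fmt='o')

# The container bundles the marker line, the caps, and the bar segments
data_line, caplines, barlinecols = container.lines
```

Keeping a reference to the container makes it possible to restyle the markers and the error bars separately later on.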
A simple log plot
x = linspace(-5, 5)
y = exp(-x**2)
f, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(x, y)
ax2.semilogy(x, y)
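As a brief sketch, semilogy can be thought of as a regular plot followed by switching the y axis to a log scale. Both panels below should end up with the same scale (variable names are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5)
y = np.exp(-x**2)

fig, (ax_a, ax_b) = plt.subplots(1, 2)

# One call that plots and sets the log scale
ax_a.semilogy(x, y)

# The same result in two explicit steps
ax_b.plot(x, y)
ax_b.set_yscale('log')
```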
A more elaborate log plot using 'symlog', that treats a specified range as linear (thus handling values near zero) and symmetrizes negative values:
x = linspace(-50, 50, 100)
y = linspace(0, 100, 100)
# Create the figure and axes
f, (ax1, ax2, ax3) = plt.subplots(3, 1)
# Symlog on the x axis
ax1.plot(x, y)
ax1.set_xscale('symlog')
ax1.set_ylabel('symlogx')
# Grid for both axes
ax1.grid(True)
# Minor grid on too for x
ax1.xaxis.grid(True, which='minor')
# Symlog on the y axis
ax2.plot(y, x)
ax2.set_yscale('symlog')
ax2.set_ylabel('symlogy')
# Symlog on both
ax3.plot(x, sin(x / 3.0))
ax3.set_xscale('symlog')
ax3.set_yscale('symlog')
ax3.grid(True)
ax3.set_ylabel('symlog both')
plt.tight_layout()
N = 5
catMeans = (20, 35, 30, 31, 27)
catStd = (2, 3, 4, 1, 2)
ind = arange(N) # the x locations for the groups
width = 0.35 # the width of the bars
fig, ax = plt.subplots()
rects1 = ax.bar(ind, catMeans, width, color='r', yerr=catStd, label='Cats')
dogMeans = (25, 32, 34, 21, 29)
dogStd = (3, 5, 2, 3, 3)
rects2 = ax.bar(ind+width, dogMeans, width, color='y', yerr=dogStd, label='Dogs')
# add some labels, a title and tick labels
ax.set_ylabel('Scores')
ax.set_title('Scores by group and species')
ax.set_xticks(ind + width/2)  # midpoint of each pair, assuming center-aligned bars (the default since matplotlib 2.0)
ax.set_xticklabels( ('G1', 'G2', 'G3', 'G4', 'G5') )
ax.legend();
The scatter command produces scatter plots with arbitrary markers.
from matplotlib import cm
t = linspace(0.0, 6*pi, 100)
y = exp(-0.1*t)*cos(t)
phase = t % (2*pi)  # parentheses needed: % and * have equal precedence
f, ax = plt.subplots()
ax.scatter(t, y, s=100*abs(y), c=phase, cmap=cm.viridis)
ax.set_ylim(-1,1)
ax.grid()
ax.axhline(0, color='k');
Matplotlib has a built-in command for histograms.
# Some normally-distributed data
mu, sigma = 60, 10
x = np.random.normal(mu, sigma, 10000)
# the histogram of the data
n, bins, patches = plt.hist(x, bins=50, density=True, facecolor='g', alpha=0.75)  # density replaces the removed normed argument
plt.xlabel('Score')
plt.ylabel('Probability')
plt.title('Histogram of Test Scores')
plt.text(75, .032, rf'$\mu={mu},\ \sigma={sigma}$')
plt.grid(True)
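When a histogram is normalized, the bar heights are densities rather than counts, so the bar areas should sum to one. A minimal sketch that checks this, assuming the density keyword and a seeded NumPy generator for reproducibility:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(60, 10, 10000)

fig, ax = plt.subplots()
n, bins, patches = ax.hist(x, bins=50, density=True)

# Each bar's area is its height times its width; together they sum to ~1
area = np.sum(n * np.diff(bins))
```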
In matplotlib, text can be added either relative to an individual axis object or to the whole figure.
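These commands add text to the Axes: set_title, set_xlabel, set_ylabel, text and annotate.
And these act on the whole figure: suptitle and text (called on the figure object).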
And any text field can contain LaTeX expressions for mathematics, as long as they are enclosed in $ signs.
This example illustrates all of them:
fig = plt.figure()
fig.suptitle('bold figure suptitle', fontsize=14, fontweight='bold')
ax = fig.add_subplot(111)
fig.subplots_adjust(top=0.85)
ax.set_title('axes title')
ax.set_xlabel('xlabel')
ax.set_ylabel('ylabel')
ax.text(3, 8, 'boxed italics text in data coords', style='italic',
bbox={'facecolor':'red', 'alpha':0.5, 'pad':10})
ax.text(2, 6, r'an equation: $E=mc^2$', fontsize=15)
ax.text(3, 2, 'unicode: Institut für Festkörperphysik')
ax.text(0.95, 0.01, 'colored text in axes coords',
verticalalignment='bottom', horizontalalignment='right',
transform=ax.transAxes,
color='green', fontsize=15)
ax.plot([2], [1], 'o')
ax.annotate('annotate', xy=(2, 1), xytext=(3, 4),
arrowprops=dict(facecolor='black', shrink=0.05))
ax.axis([0, 10, 0, 10])
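The difference between data coordinates and axes coordinates is easiest to see when the limits change. In this short sketch (labels and positions are illustrative), the first label is tied to a data point and the second stays at a fixed fraction of the axes:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.axis([0, 10, 0, 10])

# Default: the position is interpreted in data coordinates
data_label = ax.text(5, 5, 'data coords (5, 5)')

# With transform=ax.transAxes, (0, 0) is bottom-left and (1, 1) top-right
axes_label = ax.text(0.5, 0.9, 'axes coords (0.5, 0.9)',
                     transform=ax.transAxes,
                     horizontalalignment='center')

# Changing the limits moves data_label on screen, but not axes_label
ax.set_xlim(0, 100)
```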
Some statistically-oriented plots to visualize data distributions: boxplots and violin plots (this paper by Hadley Wickham is a good overview of boxplots).
Note that often Seaborn will have a simpler API for rich statistical plots, atop matplotlib's engine. This shows how to do plots of this type without seaborn:
# Random test data
np.random.seed(123)
all_data = [np.random.normal(0, std, 100) for std in range(1, 4)]
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(11, 5))
# Box plots
bplots = []
for ax, notch in zip(axes[:2], (False, True)):
    b = ax.boxplot(all_data,
                   notch=notch,
                   vert=True, # vertical box alignment
                   patch_artist=True) # fill with color
    bplots.append(b)
axes[0].set_title('box plot')
axes[1].set_title('notched box plot')
# Violin plot
vplot = axes[2].violinplot(all_data,
showmeans=False,
showmedians=True)
axes[2].set_title('violin plot')
# fill with colors
colors = ['pink', 'lightblue', 'lightgreen']
for bplot in bplots:
    for patch, color in zip(bplot['boxes'], colors):
        patch.set_facecolor(color)
# adding horizontal grid lines
for i, ax in enumerate(axes):
    ax.yaxis.grid(True)
    ax.set_xticks([y+1 for y in range(len(all_data))])
    ax.set_xlabel('xlabel')
    if i:
        ax.set_yticklabels([])
# add x-tick labels
plt.setp(axes, xticks=[y+1 for y in range(len(all_data))],
xticklabels=['x1', 'x2', 'x3']);
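The coloring loop above works because boxplot returns a dictionary that maps component names to lists of artists. A small sketch inspecting that return value, using a seeded NumPy generator and illustrative variable names:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(123)
data = [rng.normal(0, std, 100) for std in range(1, 4)]

fig, ax = plt.subplots()
result = ax.boxplot(data, patch_artist=True)

# One box and one median line per dataset, two whiskers per dataset
n_boxes = len(result['boxes'])
n_whiskers = len(result['whiskers'])

# Each median artist is a short horizontal line at the median value
medians = [m.get_ydata()[0] for m in result['medians']]
```

With patch_artist=True the entries in result['boxes'] are patches, which is what makes set_facecolor available on them.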