Lecture 9 – Visualization, Part 1
Presented by Fernando Perez
Content by Fernando Perez, Suraj Rampure, Ani Adhikari, Sam Lau, Yifan Wu
A reminder: the right column of the table below contains Quick Checks. These are not required, but they are suggested as a way to check your understanding.
Formal definition of visualization. The purpose of visualization in the data science lifecycle.
Different ways we can map from data to properties of a visualization.
Defining distributions, and determining whether a given visualization shows a distribution.
Bar plots as a means of displaying the distribution of a qualitative variable, as well as for plotting a quantitative variable across several different categories.
Rug plots. Histograms, where areas are proportions. Reviewing histogram calculations from Data 8. Density curves as smoothed versions of histograms.
Describing distributions of quantitative variables using terms such as modes, skew, tails, and outliers.
Using box plots and violin plots to visualize quantitative distributions. Using overlaid histograms and density curves, and side by side box plots and violin plots, to compare multiple quantitative distributions.
Using scatter plots, hex plots, and contour plots to visualize the relationship between pairs of quantitative variables. Summary of visualization thus far.
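As a rough illustration of the histogram idea listed above (areas are proportions), here is a minimal sketch in plain Python. It is not taken from the lecture materials: the data, the bin edges, and the helper name `density_histogram` are all hypothetical, chosen only to show that a bar's height is its proportion divided by its bin width, so that the bar areas sum to 1 even when the bins have unequal widths.

```python
# Hypothetical data and bin edges, chosen only for illustration.
values = [1, 2, 2, 3, 3, 3, 4, 6, 7, 9]
bin_edges = [0, 2, 4, 10]  # bins: [0, 2), [2, 4), [4, 10] (last bin includes its right edge)


def density_histogram(values, bin_edges):
    """Return (proportions, heights) per bin for a density-scale histogram.

    proportion = (count in bin) / (total count)
    height     = proportion / (bin width), so that area = proportion
    Values outside the bin range are ignored in the counts.
    """
    n = len(values)
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            left, right = bin_edges[i], bin_edges[i + 1]
            is_last_bin = i == n_bins - 1
            # Bins are left-inclusive; only the last bin includes its right edge.
            if left <= v < right or (is_last_bin and v == right):
                counts[i] += 1
                break
    proportions = [c / n for c in counts]
    widths = [bin_edges[i + 1] - bin_edges[i] for i in range(n_bins)]
    heights = [p / w for p, w in zip(proportions, widths)]
    return proportions, heights


proportions, heights = density_histogram(values, bin_edges)
# Area of each bar = height * width, which recovers the proportion in that bin.
areas = [h * (bin_edges[i + 1] - bin_edges[i]) for i, h in enumerate(heights)]

print("proportions:", proportions)
print("heights:", heights)
print("total area:", sum(areas))
```

In this sketch the widest bin holds 40% of the values but has the shortest bar, because its proportion is spread over a width of 6. That is the reason for reading a histogram by area rather than by height.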