Lecture 9 – Visualization, Part 1
Presented by Fernando Perez
Content by Fernando Perez, Suraj Rampure, Ani Adhikari, Sam Lau, Yifan Wu
One of the following Google Forms, chosen at random, will give you an alphanumeric code once you submit it. Enter this code into the “Lecture 9” question in the “Quick Check Codes” assignment on Gradescope to get credit for this Quick Check. You must submit by Monday, September 28th at 11:59PM to get credit.
Formal definition of visualization. The purpose of visualization in the data science lifecycle.
Different ways we can map from data to properties of a visualization.
Defining distributions, and determining whether or not given visualizations contain a distribution.
Bar plots as a means of displaying the distribution of a qualitative variable, as well as for plotting a quantitative variable across several different categories.
Rug plots. Histograms, where areas are proportions. Reviewing histogram calculations from Data 8. Density curves as smoothed versions of histograms.
Describing distributions of quantitative variables using terms such as modes, skew, tails, and outliers.
Using box plots and violin plots to visualize quantitative distributions. Using overlaid histograms and density curves, and side-by-side box plots and violin plots, to compare multiple quantitative distributions.
Using scatter plots, hex plots, and contour plots to visualize the relationship between pairs of quantitative variables. Summary of visualization thus far.
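The histogram item above says that areas are proportions. As a minimal sketch of that calculation, assuming the height of each bar is the proportion of values in the bin divided by the bin width, the plain-Python function below computes bar heights for unequal bins. The function name density_histogram, the sample values, and the bin edges are hypothetical and chosen only for illustration. Making the last bin include its right endpoint is also an assumption.

```python
def density_histogram(values, bin_edges):
    """Return one bar height per bin so that bar area equals the
    proportion of values falling in that bin.

    Assumption: bins are [left, right), except the last bin, which
    also includes its right endpoint.
    """
    n = len(values)
    last = len(bin_edges) - 2  # index of the final bin
    heights = []
    for i, (left, right) in enumerate(zip(bin_edges[:-1], bin_edges[1:])):
        if i == last:
            count = sum(left <= v <= right for v in values)
        else:
            count = sum(left <= v < right for v in values)
        proportion = count / n                      # area of the bar
        heights.append(proportion / (right - left))  # height = area / width
    return heights


# Hypothetical data and unequal-width bins, for illustration only.
values = [1, 2, 2, 3, 5, 6, 8, 9, 9, 10]
bin_edges = [0, 4, 6, 10]

heights = density_histogram(values, bin_edges)

# Recover each bar's area (height * width); the areas are the bin proportions.
areas = [h * (right - left)
         for h, left, right in zip(heights, bin_edges[:-1], bin_edges[1:])]

print(heights)
print(areas, sum(areas))
```

Because height is proportion per unit, a wide bin and a narrow bin holding the same share of the data get different heights but equal areas, and the areas across all bins sum to 1 when every value falls inside the bin range.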