Lecture 2 – Data Sampling and Probability
Presented by Isaac Schmidt, Suraj Rampure
Content by Fernando Perez, Suraj Rampure, Ani Adhikari, and Joseph Gonzalez
A reminder – the right column of the table below contains Quick Checks. These are optional, but we suggest working through them to check your understanding.
Introduction to lecture format.
Data science lifecycle, case study on squirrels.
Censuses and surveys. Issues with the US Census.
Samples. Drawbacks to convenience and quota samples.
A case study in sampling bias (1936 election).
Sources of bias, and a formal definition of sampling frames.
Probability samples, and why we need them.
Introducing binomial and multinomial probability calculations.
Generalizing binomial and trinomial probability calculations.
Using permutations and combinations to derive the binomial coefficient.
Example usages of the binomial coefficient.
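The probability calculations named in the last few items can be sketched in a few lines of Python. This is only a rough illustration, not the lecture's own code: the function names `binomial_coefficient`, `binomial_probability`, and `multinomial_probability` are hypothetical, and the sketch assumes independent draws with fixed category probabilities (that is, sampling with replacement).

```python
from math import comb, factorial


def binomial_coefficient(n, k):
    """Number of ways to choose k items out of n when order does not matter.

    Derived from permutations: there are n! / (n - k)! ordered selections,
    and each unordered selection is counted k! times.
    """
    ordered_selections = factorial(n) // factorial(n - k)
    return ordered_selections // factorial(k)


def binomial_probability(n, k, p):
    """P(exactly k successes in n independent trials, each with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def multinomial_probability(counts, probs):
    """P(observing the given count in each category), assuming independent draws.

    counts[i] is the number of draws landing in category i, and probs[i] is
    the chance of category i on a single draw (assumed to sum to 1).
    """
    n = sum(counts)
    # Number of orderings: n! / (c_1! * c_2! * ... * c_m!)
    orderings = factorial(n)
    for c in counts:
        orderings //= factorial(c)
    # Probability of any one specific ordering.
    one_ordering = 1.0
    for c, p in zip(counts, probs):
        one_ordering *= p**c
    return orderings * one_ordering


if __name__ == "__main__":
    print(binomial_coefficient(5, 2))
    print(binomial_probability(4, 2, 0.5))
    print(multinomial_probability([1, 1, 1], [1 / 3, 1 / 3, 1 / 3]))
```

With two categories, `multinomial_probability([k, n - k], [p, 1 - p])` reduces to `binomial_probability(n, k, p)`, which is one way to see the multinomial calculation as a generalization of the binomial one.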