Lecture 2 – Data Sampling and Probability
Presented by Andrew Bray, Fernando Perez, Suraj Rampure
Content by Fernando Perez, Suraj Rampure, Ani Adhikari, and Joseph Gonzalez
A reminder: the right column of the table below contains Quick Checks. They are not required, but we suggest them to help you check your understanding.
Weekly course schedule. Introduction to the data science life cycle.
Censuses and surveys. Issues with the US Census.
Samples. Drawbacks to convenience and quota samples.
A case study in sampling bias (1936 election).
Sources of bias, and a formal definition of sampling frames.
Probability samples, and why we need them.
Introducing binomial and multinomial probability calculations.
Generalizing binomial and trinomial probability calculations.
Using permutations and combinations to derive the binomial coefficient.
Example uses of the binomial coefficient.
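The probability topics listed above (binomial and multinomial calculations, and the binomial coefficient derived from permutations and combinations) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not code from the lecture; the function names `binomial_coefficient`, `binomial_pmf`, and `multinomial_pmf` are hypothetical and chosen here for readability. It assumes independent draws with fixed category probabilities, i.e. sampling with replacement.

```python
from math import comb, factorial, prod


def binomial_coefficient(n, k):
    """Number of ways to choose k items from n when order does not matter.

    Derivation: there are n! / (n - k)! ordered selections (permutations),
    and each unordered selection is counted k! times among them.
    """
    return factorial(n) // (factorial(n - k) * factorial(k))


def binomial_pmf(n, k, p):
    """P(exactly k successes in n independent trials with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def multinomial_pmf(counts, probs):
    """P(observing the given count in each category) over sum(counts) draws.

    counts[i] is the number of draws landing in category i,
    probs[i] is the chance that a single draw lands in category i.
    """
    n = sum(counts)
    # Number of distinct orderings: n! / (c1! * c2! * ... * cm!)
    ways = factorial(n)
    for c in counts:
        ways //= factorial(c)
    return ways * prod(p**c for p, c in zip(probs, counts))


if __name__ == "__main__":
    # Choosing 2 of 5 items; should agree with math.comb.
    print(binomial_coefficient(5, 2), comb(5, 2))
    # Chance of exactly 2 heads in 3 fair coin flips.
    print(binomial_pmf(3, 2, 0.5))
    # Chance of one draw from each of three categories in 3 draws.
    print(multinomial_pmf([1, 1, 1], [0.5, 0.25, 0.25]))
```

With only two categories, `multinomial_pmf([k, n - k], [p, 1 - p])` reduces to `binomial_pmf(n, k, p)`, which is one way to see the binomial calculation as a special case of the multinomial one.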