This lab is intended to provide practice with, and insight into, applying Chi-squared tests (for goodness of fit as well as association) to real data. Due to time constraints, it will be slightly shorter than previous labs.
Directions (Please read before starting)
The Washington Post maintains a comprehensive database, dating back to 2015, of incidents in which police officers used deadly force on a suspect.
police <- read.csv("https://remiller1450.github.io/data/Police2019.csv")
police_complete <- police[police$race != "", ] ## Filter to exclude individuals with missing race data
These data contain the following variables:
Although an argument can be made that these data are a population (since they contain all incidents from 2015-2019), a useful alternative is to view the data as a representative sample of an underlying random process (since new police-involved deaths continue to occur over time). This alternative view means that we can use the tools of statistical inference (hypothesis tests and confidence intervals) to better understand the uncertainty inherent to the underlying random process that gave rise to the observed data.
Question #1: The US Census Bureau estimates the racial composition of the US as 61.5% Non-Hispanic White, 17.6% Hispanic (of any race), 12.3% Black, 5.3% Asian, 0.7% Native American, and 2.6% other (source). Using the US Census numbers as the basis for a null hypothesis, evaluate whether certain racial groups are disproportionately killed by the police.
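As a reminder of the mechanics, a goodness of fit test in R might be sketched as below. The counts in `observed` are hypothetical numbers made up for illustration (they are not taken from the Washington Post data), and the category names are placeholders; only the null proportions come from the Census figures quoted above.

```r
# Hypothetical counts for illustration only (NOT the lab data)
observed <- c(White = 450, Hispanic = 170, Black = 250,
              Asian = 15, Native = 15, Other = 100)

# Null proportions from the Census figures, in the same order as `observed`
null_p <- c(0.615, 0.176, 0.123, 0.053, 0.007, 0.026)

# Chi-squared goodness of fit test against the null proportions
gof <- chisq.test(x = observed, p = null_p)

gof$expected             # expected counts under the null: n * null_p
gof$statistic            # X^2 test statistic
gof$parameter            # degrees of freedom: number of categories - 1
gof$p.value
round(gof$residuals, 2)  # Pearson residuals: (observed - expected) / sqrt(expected)
```

The order of `null_p` must match the order of the categories in `observed`; with the real data you would build the counts with `table()` and check that ordering before running the test.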
Question #2: The widespread adoption of police body cameras has been an area of debate over the past decade. To explore this issue, perform a Chi-squared test to determine whether there is an association between the presence of a police-worn body camera and the race of the individual who was killed. Perform the test “by hand” so that you can recognize and comment upon the largest contributor to the \(X^2\) test statistic.
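One way to carry out the “by hand” calculation is sketched below on a small hypothetical table. The counts and the group labels A, B, and C are invented for illustration and are not the lab data. Each cell contributes \((O - E)^2 / E\) to \(X^2\), where the expected count \(E\) is (row total × column total) / grand total.

```r
# Hypothetical 2 x 3 table of counts (illustration only, NOT the lab data)
obs <- matrix(c( 40,  25,  15,
                260, 175, 185),
              nrow = 2, byrow = TRUE,
              dimnames = list(body_camera = c("Yes", "No"),
                              group = c("A", "B", "C")))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Contribution of each cell to the test statistic
contrib <- (obs - expected)^2 / expected

X2 <- sum(contrib)                         # test statistic
df <- (nrow(obs) - 1) * (ncol(obs) - 1)    # degrees of freedom
p_value <- pchisq(X2, df = df, lower.tail = FALSE)

round(contrib, 2)
which(contrib == max(contrib), arr.ind = TRUE)  # cell with the largest contribution
```

With the real data, `obs` would come from a two-way `table()` of the body camera and race variables, and `chisq.test(obs)` can be used to check the hand calculation.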
Question #3: Calculate and interpret the odds ratio relating the odds of an individual in the racial/ethnic group with the largest \(X^2\) contribution being killed by an officer wearing a body camera to the odds of a white individual being killed by an officer wearing a body camera.
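The odds ratio calculation can be sketched as follows. The 2 x 2 counts are hypothetical and “GroupX” is a placeholder for whichever group you identified in Question #2.

```r
# Hypothetical 2 x 2 table (illustration only, NOT the lab data)
tab <- matrix(c(30, 170,
                40, 460),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("GroupX", "White"),
                              body_camera = c("Yes", "No")))

# Odds of a body camera being present within each group
odds_groupx <- tab["GroupX", "Yes"] / tab["GroupX", "No"]
odds_white  <- tab["White", "Yes"]  / tab["White", "No"]

# Odds ratio, with White as the reference group
odds_ratio <- odds_groupx / odds_white
odds_ratio
```

An odds ratio above 1 would mean the odds of a body camera being present are higher for the first group than for the reference group; a value below 1 would mean the opposite.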
Question #4: Use the fisher.test function to find a 95% CI estimate for the odds ratio you found in Question #3. What is the importance of this entire confidence interval being above 1?
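A minimal sketch of extracting the interval from `fisher.test` is shown below, again using a hypothetical 2 x 2 table with a placeholder “GroupX” row rather than the lab data. Note that `fisher.test` reports a conditional maximum likelihood estimate of the odds ratio, so its point estimate can differ slightly from the cross-product ratio computed by hand.

```r
# Hypothetical 2 x 2 table (illustration only, NOT the lab data)
tab <- matrix(c(30, 170,
                40, 460),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("GroupX", "White"),
                              body_camera = c("Yes", "No")))

fit <- fisher.test(tab, conf.level = 0.95)

fit$estimate   # conditional MLE of the odds ratio
fit$conf.int   # 95% confidence interval for the odds ratio
fit$p.value

# TRUE when the whole interval lies above 1
fit$conf.int[1] > 1
```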