Directions:

  • My main expectation is that you thoughtfully work through labs collaboratively with your group, discussing the embedded questions and recording your responses in a shared document.
    • At times you might be asked to add screenshots to your write-up. If you are on a Windows PC, an easy way to do this is the “snipping tool”, which you can find using the search bar along the bottom of your screen. If you are on a Mac, you can find instructions on how to take a screenshot at this link.
  • Everyone should upload their own copy of the lab write-up to Canvas.
  • Only a couple of questions on each lab will be graded for accuracy, so your focus should be on learning the material rather than “getting the right answers” as quickly as possible.

\(~\)

Introduction

The purpose of this lab is to practice applying concepts and procedures related to hypothesis testing.

\(~\)

Concept Review

A hypothesis test seeks to falsify a certain null hypothesis using sample data. The major steps are:

  1. State a falsifiable null hypothesis about the population being studied
  2. Come up with a model for the null distribution
  3. Compare the outcome from the real data against the null distribution to find the \(p\)-value
  4. Use the \(p\)-value to make a decision

In many circumstances we can use the Central Limit Theorem to find a Normal model for the null distribution:

  • If \(H_0\) is true, then \(\text{sample outcome} \sim N(\text{null value}, SE)\)

Below is a review of the different \(SE\) formulas from the CLT:

| Summary measure | Population parameter | Sample estimate | \(SE\) |
|---|---|---|---|
| single proportion | \(p\) | \(\hat{p}\) | \(\sqrt{\tfrac{p(1-p)}{n}}\) |
| difference of proportions | \(p_1 - p_2\) | \(\hat{p}_1 - \hat{p}_2\) | \(\sqrt{\tfrac{p_1(1-p_1)}{n_1} + \tfrac{p_2(1-p_2)}{n_2}}\) |
| single mean | \(\mu\) | \(\bar{x}\) | \(\tfrac{\sigma}{\sqrt{n}}\) |
| difference of means | \(\mu_1 - \mu_2\) | \(\bar{x}_1 - \bar{x}_2\) | \(\sqrt{\tfrac{\sigma_1^2}{n_1} + \tfrac{\sigma_2^2}{n_2}}\) |
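As an optional illustration of how the Normal null model works (StatKey is the intended tool for this lab, and the numbers below are made up for demonstration), the following Python sketch combines the single-proportion \(SE\) formula with a two-sided \(p\)-value from \(N(\text{null value}, SE)\):

```python
import math

def se_single_prop(p, n):
    """CLT standard error for a single proportion under H0: p = p0."""
    return math.sqrt(p * (1 - p) / n)

def normal_p_value(p_hat, p0, n, tails=2):
    """p-value from the Normal null model N(p0, SE)."""
    se = se_single_prop(p0, n)
    z = (p_hat - p0) / se
    # P(Z >= |z|) for a standard Normal, via the complementary error function
    one_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return tails * one_tail

# Hypothetical example: H0: p = 0.5, observed p_hat = 0.55, sample size n = 400
print(round(normal_p_value(0.55, 0.5, 400), 4))  # → 0.0455
```

Here \(z = 2\), so the two-sided \(p\)-value is about 0.0455, matching the familiar "95% within 2 SE" rule of thumb.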

Other times we might need to rely on simulation to generate a null distribution, something we can do using StatKey.
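The simulation idea behind StatKey can also be sketched in a few lines of Python (again optional, with hypothetical numbers): repeatedly draw samples assuming \(H_0\) is true, record each simulated sample proportion (each one corresponds to a "dot" in a StatKey plot), and report the fraction of simulations at least as extreme as the observed outcome.

```python
import random

def simulate_null_p_value(p0, n, p_hat, reps=5000, seed=1):
    """Two-sided simulation-based p-value for H0: p = p0."""
    random.seed(seed)
    observed_diff = abs(p_hat - p0)
    extreme = 0
    for _ in range(reps):
        # one simulated sample of size n, generated assuming H0 is true
        successes = sum(random.random() < p0 for _ in range(n))
        if abs(successes / n - p0) >= observed_diff:
            extreme += 1
    return extreme / reps

# Hypothetical example: H0: p = 0.5, observed p_hat = 0.55, n = 400
print(simulate_null_p_value(0.5, 400, 0.55))
```

With enough repetitions, the simulated \(p\)-value should closely agree with the Normal model whenever the CLT conditions hold.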

\(~\)

Application #1 - Police-involved Deaths

The Washington Post manages a comprehensive database of instances where police officers have used deadly force on a suspect, dating back to 2015.

  • This link can be used to download data derived from this database

These data contain the following variables:

  • name - name of the individual killed
  • date - date of the incident
  • year - year (extracted from date)
  • manner_of_death - cause of death
  • armed - weapon the killed individual was carrying (if applicable)
  • age - age of the individual killed
  • gender - gender of the individual killed
  • race - racial/ethnic group of the individual killed, using the US Census designations, “A” = Asian, “B” = Black, “H” = Hispanic (of any race), “N” = Native American, “O” = Other, “W” = Non-Hispanic White
  • city - location of the incident
  • state - state where the incident took place
  • signs_of_mental_illness - whether any past signs of mental illness were present
  • threat_level - whether the suspect was attacking the involved officers
  • flee - how the suspect fled (if applicable)
  • body_camera - whether any of the involved officers were wearing a body camera

Question #1: According to the US Census, the current racial composition of the US is 61.5% Non-Hispanic White, 17.6% Hispanic (of any race), 12.3% Black, 5.3% Asian, 0.7% Native American, and 2.6% other (source). Based upon this information, perform a hypothesis test to evaluate whether the percentage of Hispanic individuals among those killed by the police is statistically different from what would be expected according to the US Census. In doing so, please organize your response by answering the following parts (A - E):

  • Part A: State the null hypothesis using proper statistical notation.
  • Part B: Determine the sample statistic, or the outcome observed in these data that is to be used as evidence against the null hypothesis.
  • Part C: Use StatKey to simulate the null distribution. Briefly explain what each dot represents, and include a screenshot depicting the \(p\)-value.
  • Part D: Use CLT formulas to find a Normal model for the null distribution. Include a screenshot depicting this model and the \(p\)-value.
  • Part E: Provide a brief conclusion that summarizes the findings of this hypothesis test. Be sure to follow the guidelines from our notes.

Question #2: According to a report published by the Bureau of Justice Statistics (BJS) in November 2018, 47% of law enforcement agencies had acquired body cameras, and among these agencies 29 body cameras were available for every 100 officers. This suggests a 13.63% chance that a randomly selected officer will be wearing a body camera at any moment in time. Based upon this information, perform a hypothesis test to determine whether the presence of body cameras during police-involved killings is statistically different from the expected rate of 13.63%. Please organize your response by answering the following parts:

  • Part A: State the null hypothesis using proper statistical notation.
  • Part B: Find the \(p\)-value using either simulation or a Normal model (your choice).
  • Part C: Provide a brief conclusion that summarizes the findings of this hypothesis test. Be sure to follow the guidelines from our notes.
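The 13.63% figure in Question #2 is simply the product of the two BJS rates, which a one-line check confirms:

```python
# 47% of agencies acquired body cameras; within those agencies,
# 29 cameras were available per 100 officers
rate = 0.47 * 0.29
print(round(rate, 4))  # → 0.1363
```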

Question #3: Is it possible that the conclusion you reached in Question #1 was an error? If so, would it be a Type 1 or Type 2 error? Briefly explain.

Question #4: Is it possible that the conclusion you reached in Question #2 was an error? If so, would it be a Type 1 or Type 2 error? Briefly explain.

\(~\)

Application #2 - Ballot Initiatives

A ballot initiative refers to a law, provision, or constitutional change that is voted on directly by the population of a state. These initiatives generally appear on the ballot during regularly scheduled elections held within the state. In most states, adding an initiative to the election ballot requires a formal petition that is signed by a minimum number of registered voters.

According to Ballotpedia, the minimum number of signatures needed to place a measure on the ballot in Ohio is based on the total number of votes cast for the governor in the preceding general election. For example, the current threshold for getting a proposed change to the state constitution onto the ballot is 442,958 signatures (until Nov. 2022 when the next gubernatorial race is held).

For this application we’ll consider an advocacy group that has collected 490,000 signatures on a petition for a proposed change to the Ohio constitution. Before the proposed change is added to the ballot, these signatures must be verified to ensure they actually correspond to registered voters. Because signature verification is a very time-intensive process, it is common to validate only a sample of the signatures and use statistical methods to determine if the signature threshold is met.

As an example of how this works, let’s suppose that government officials take a simple random sample of \(n = 2000\) signatures, and 1826 are verified as belonging to registered voters.
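To keep the numbers in this scenario straight, the two proportions being compared can be computed directly from the values stated above (this just restates the given figures; the hypothesis test itself is left to the questions below):

```python
signatures = 490_000
threshold = 442_958       # Ohio signature requirement (until Nov. 2022)
n, verified = 2000, 1826  # simple random sample results

required_prop = threshold / signatures  # proportion that must be valid
sample_prop = verified / n              # proportion valid in the sample
print(round(required_prop, 4), round(sample_prop, 4))  # → 0.904 0.913
```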

Question #5: What proportion of the group’s 490,000 signatures must be valid for the change to make it onto the ballot? What proportion of signatures were valid in the random sample of \(n = 2000\)? Does the fact that the proportion is higher in the sample provide sufficient evidence that the change should be on the ballot? Briefly explain.

Question #6: Are you confident that this sample of \(n = 2000\) signatures accurately represents the entire population of 490,000 signatures? Briefly explain.

Question #7: Perform a hypothesis test evaluating whether the sample of \(n = 2000\) signatures provides sufficient evidence to put the proposed change onto the ballot. Be sure to: 1) clearly state your null and alternative hypotheses, 2) use an appropriate null model to calculate a \(p\)-value, 3) provide a 1-sentence conclusion that puts the results of your test into context.

Question #8: What would a Type 1 error mean for the hypotheses you evaluated in Question #7? Is it possible your decision was a Type 1 error?

Question #9: What would a Type 2 error mean for the hypotheses you evaluated in Question #7? Is it possible your decision was a Type 2 error?

Question #10: For this application, which do you think is worse, a Type 1 or Type 2 error? What could be done to reduce the likelihood of making a Type 2 error?

\(~\)

Application #3 - The Pepsi Challenge

In the 1980s, Pepsi launched what they called the “Pepsi Challenge”, where they had individuals taste unlabeled cups of both Pepsi and Coca-Cola and report which they preferred. Wikipedia describes the challenge’s methodology as follows:

At malls, shopping centers, and other public locations, a Pepsi representative sets up a table with two cups: one containing Pepsi and one with Coca-Cola. Shoppers are encouraged to taste both colas, and then select which drink they prefer. Then the representative reveals the two bottles so the taster can see whether they preferred Coke or Pepsi. The results of the test leaned toward a consensus that Pepsi was preferred by more Americans.

If you’re curious, a television ad from 1983 that demonstrates the challenge is embedded below:

\(~\)

Question #11: Briefly critique the study of the Pepsi Challenge. That is, identify and discuss (using proper statistical terms when applicable) what Pepsi did to address confounding variables and biases as possible explanations for the choices made by study participants.

Question #12: In both words and statistical notation, state the null hypothesis that would be of interest in this application.

Question #13: In a trial of the Pepsi Challenge using \(n = 71\) study participants, 42 chose the cup containing Pepsi. Use this information to come up with a Normal probability model that can be used to evaluate the null hypothesis you stated in Question #12.

Question #14: Find the \(p\)-value using the model you described in Question #13, then provide a brief conclusion using the \(\alpha = 0.01\) threshold for statistical significance.