Directions (read before starting)

  1. Please work together with your assigned partner. Make sure you both fully understand something before moving on.
  2. Record your answers to lab questions separately from the lab’s examples. You and your partner should only turn in responses to lab questions, nothing more and nothing less.
  3. Ask for help, clarification, or even just a check-in if anything seems unclear.


Introduction

This lab focuses on developing a conceptual understanding of hypothesis testing using the randomization approaches implemented by StatKey.

In this lab, when performing a hypothesis test, you should include each of the following steps:

  1. Clearly state the null and alternative hypotheses using both statistical symbols and words.
  2. Find and report the point estimate in the sample data that relates to the null hypothesis, then use StatKey to find and report an estimate of the \(p\)-value.
  3. Use the point estimate and \(p\)-value to assess both statistical significance and practical significance and provide a 1-2 sentence conclusion summarizing your analysis.

Shown below is an example of a good answer for these 3 steps, using the Johns Hopkins premature birth survival rate example. Recall that the claim being tested is Wikipedia’s statement that the survival rate of babies born at 25 weeks gestation is 70%, and that in the observed sample data 31 of 39 babies born at this age survived.

  1. \(H_0: p = 0.7\) vs. \(H_a: p \ne 0.7\) - The null hypothesis is that the survival rate for all babies born at 25 weeks is 70%, and the alternative is that the survival rate is different from 70%.
  2. The point estimate is the sample proportion \(\hat{p} = 31/39 = 0.795\). Using StatKey, we find a 2-sided \(p\)-value of approximately 0.14.
  3. Since this \(p\)-value is moderately high, we do not have the statistical evidence in this sample to refute Wikipedia’s claim. The survival rate in our sample being about 9.5 percentage points above the claimed survival rate could have been due to random chance (sampling variability).
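
For readers who want to see what a randomization test for a single proportion does behind the scenes, here is a minimal Python sketch. It is not StatKey's code. The function name, the number of simulated samples, the seed, and the choice to double the smaller tail for a 2-sided test are all assumptions made for illustration. Because simulated counts are discrete, the estimate is sensitive to whether samples that exactly tie the observed count are included in the tail (they are here), so it may not match a particular StatKey run.

```python
import random


def randomization_p_value(successes, n, p0, reps=10_000, seed=1):
    """Estimate a 2-sided p-value for H0: p = p0 by simulation.

    Each simulated sample draws n outcomes with success probability p0,
    which is what the data would look like if the null were true.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    upper = 0  # simulated counts at least as large as the observed count
    lower = 0  # simulated counts at least as small as the observed count
    for _ in range(reps):
        simulated = sum(rng.random() < p0 for _ in range(n))
        if simulated >= successes:
            upper += 1
        if simulated <= successes:
            lower += 1
    # Assumed 2-sided convention: double the smaller tail, capped at 1.
    return min(1.0, 2 * min(upper, lower) / reps)


# Premature birth example: 31 of 39 survived, null value p0 = 0.7.
p_value = randomization_p_value(successes=31, n=39, p0=0.7)
print(f"sample proportion = {31 / 39:.3f}")
print(f"estimated 2-sided p-value = {p_value:.3f}")
```

Changing `seed` or `reps` shifts the estimate slightly, which is the same run-to-run variation you see when generating new samples in StatKey.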


Lab

In this section you will analyze data from 3 different studies. When asked to perform a hypothesis test please make sure you follow the steps described in the Introduction in order to receive full credit.


Study 1 - Oatbran and LDL

To assess whether oatbran cereal might be effective in reducing LDL cholesterol, researchers randomly assigned 14 adult males with high cholesterol into two groups:

  • The first group followed a diet involving daily consumption of corn flakes cereal for two weeks, then had a one-week washout period, and then followed a diet involving daily consumption of oatbran cereal for two more weeks.
  • The second group followed a similar protocol, but consumed the oatbran cereal during the first two-week period and the corn flakes during the second two-week period.

We’ll analyze each subject’s difference in LDL cholesterol when they were on the oatbran diet relative to when they were on the cornflakes diet. This outcome is recorded as the variable “difference” in the dataset linked below. You should recognize that a positive value of “difference” indicates a reduction in LDL cholesterol when on the oatbran diet.

https://remiller1450.github.io/data/Oatbran.csv

Question #1:

  • Part A: Briefly explain why it is more appropriate to conduct a hypothesis test involving the “difference” column rather than a test comparing the means of the “OatBran” and “CornFlakes” columns. Hint: Think about whether this study involves one sample or two samples (independent groups, such as a “treatment” and a “control” group in a typical experimental setting).
  • Part B: Perform an appropriate hypothesis test on the “difference” column to determine whether the oatbran cereal was effective in lowering LDL cholesterol. Be sure to follow the steps outlined in the lab’s introduction.
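
As a rough illustration of how a randomization distribution for a single mean of differences can be built, the Python sketch below uses one common approach: shift the sample so that its mean equals the null value, then resample with replacement. The function name, the upper-tail alternative (a positive mean difference), and the `toy_differences` values are assumptions for illustration only. The numbers are made up and are not the Oatbran data.

```python
import random


def randomization_p_value_mean(differences, null_mean=0.0, reps=10_000, seed=1):
    """Estimate an upper-tail p-value for H0: mean difference = null_mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(differences)
    observed = sum(differences) / n
    # Shift the data so the sample mean matches the null value, keeping
    # the spread of the original differences.
    shifted = [d - observed + null_mean for d in differences]
    extreme = 0
    for _ in range(reps):
        resample = rng.choices(shifted, k=n)  # sample with replacement
        if sum(resample) / n >= observed:
            extreme += 1
    return extreme / reps


# Made-up differences, NOT the values in Oatbran.csv.
toy_differences = [0.4, -0.1, 0.6, 0.3, 0.0, 0.5, -0.2, 0.7]
toy_p_value = randomization_p_value_mean(toy_differences)
print(f"toy sample mean = {sum(toy_differences) / len(toy_differences):.3f}")
print(f"estimated upper-tail p-value = {toy_p_value:.4f}")
```

To try this on the real data, you would replace `toy_differences` with the values from the “difference” column.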


Study 2 - Drug-impaired Driving

The impacts of recreational drugs, including cannabis, on driving have been an active area of research over the past several decades. Because it would be unethical to place drug-impaired drivers onto real roadways, these studies are typically done using advanced driving simulators like the one below:

https://www.youtube.com/watch?v=BvnXKHMh8qQ

One study conducted using the NADS-1 driving simulator involved 19 volunteer participants who each engaged in 45-minute simulated drives under 6 different dosing conditions:

  • Placebo - Placebo
  • Placebo - Low THC cannabis
  • Placebo - High THC cannabis
  • Alcohol - Placebo
  • Alcohol - Low THC cannabis
  • Alcohol - High THC cannabis

These conditions were randomly assigned across each participant’s 6 visits, and visits were separated by washout periods of at least 10 days. For the alcohol dosing condition, the participant drank either fruit juice mixed with 90% grain alcohol until their breath alcohol concentration (BrAC) was 5%, or a placebo drink (fruit juice with an alcohol-swabbed rim) for a comparable amount of time. The cannabis dosing condition occurred after alcohol dosing, and had the participant inhale placebo, low THC, or high THC cannabis ad libitum (at their own pace) from a vaporizer for 10 minutes.

One of this study’s research questions was whether each participant’s lateral control (lane-keeping ability) worsened after drug use. The researchers measured this using the standard deviation of lateral position (SDLP), which roughly corresponds to the average deviation from the center of the lane. Higher SDLP values therefore indicate more variability in lane position, that is, worse lateral control (more side-to-side movement within the lane).
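
To make the SDLP measure concrete, here is a small Python sketch that computes the standard deviation of a series of lateral positions. The function name and both position traces are hypothetical. The values are made up for illustration and are not simulator output.

```python
from statistics import stdev


def sdlp(lateral_positions_cm):
    """Sample standard deviation of lateral position, in cm."""
    return stdev(lateral_positions_cm)


# Made-up lateral positions (cm from lane center) sampled during a drive.
steady_drive = [0, 2, -1, 1, -2, 0]
weaving_drive = [0, 20, -15, 18, -22, 5]

print(f"steady drive SDLP:  {sdlp(steady_drive):.1f} cm")
print(f"weaving drive SDLP: {sdlp(weaving_drive):.1f} cm")
```

The weaving trace gives the larger SDLP. A constant offset from the lane center does not change the value, since the standard deviation is taken about the driver's own average position.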

The table below displays SDLP (in cm) for each subject under the Placebo-Placebo condition during an interstate segment of the drive. The additional columns of the table display whether a participant’s SDLP increased for the other dosing conditions (relative to the control condition of Placebo-Placebo). A value of “1” indicates higher SDLP compared to their placebo-placebo drive.

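| Participant | Baseline SDLP (cm) | Placebo/Low THC | Placebo/High THC | Alcohol/Placebo | Alcohol/Low THC | Alcohol/High THC |
|---|---|---|---|---|---|---|
| 1 | 14.2 | 0 | 0 | 1 | 1 | 1 |
| 2 | 17.9 | 0 | 1 | 1 | 1 | 0 |
| 3 | 19.9 | 1 | 0 | 1 | 1 | 0 |
| 4 | 11.4 | 0 | 0 | 1 | 1 | 1 |
| 5 | 18.3 | 1 | 1 | 0 | 0 | 1 |
| 6 | 18.5 | 0 | 1 | 1 | 1 | 0 |
| 7 | 15.8 | 1 | 1 | 1 | 1 | 1 |
| 8 | 15.9 | 1 | 1 | 1 | 1 | 0 |
| 9 | 15.8 | 0 | 0 | 1 | 1 | 0 |
| 10 | 15 | 1 | 1 | 0 | 1 | 0 |
| 11 | 16 | 1 | 1 | 1 | 1 | 1 |
| 12 | 14.7 | 0 | 1 | 1 | 1 | 1 |
| 13 | 15.3 | 1 | 1 | 1 | 1 | 1 |
| 14 | 17.4 | 1 | 1 | 0 | 1 | 1 |
| 15 | 19.6 | 0 | 0 | 1 | 1 | 1 |
| 16 | 16.9 | 1 | 1 | 0 | 0 | 1 |
| 17 | 15.9 | 1 | 0 | 1 | 1 | 0 |
| 18 | 14.9 | 1 | 1 | 1 | 0 | 1 |
| 19 | 15.1 | 1 | 1 | 1 | 1 | 1 |
| Total | | 12 | 13 | 15 | 16 | 12 |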

Question #2: If a dosing condition has no impact on driving performance, you’d expect it to be equally likely for an individual to have a higher or a lower SDLP on their dosed drive relative to their placebo drive. Keep this in mind when responding to the following questions.

  • Part A: Perform a hypothesis test evaluating whether the Alcohol/High THC condition led to degraded lateral control relative to the placebo condition. Be sure to follow all of the steps described in the introduction.
  • Part B: Briefly explain why the same randomization distribution can be used for testing all 5 non-placebo dosing conditions in this study.
  • Part C: Use the randomization distribution from Part A to find \(p\)-values for the hypothesis tests evaluating degraded lateral control for all 5 non-placebo dosing conditions. You do not need to provide the full set of steps from the introduction for each condition; you only need to provide the \(p\)-values.
  • Part D: Suppose a statistical significance threshold of \(\alpha = 0.05\) is applied to all of the \(p\)-values in Part C. If these \(p\)-values are independent, and the null hypothesis is correct for each scenario, what is the probability of making at least one Type I error? Hint: Use the formula \(P(\text{at least one Type I error}) = 1 - (1 - \alpha)^k\), where \(k\) is the number of hypothesis tests.
  • Part E: Use the Bonferroni correction to achieve a family-wise Type I error rate of 5% across the entire set of hypothesis tests corresponding to the \(p\)-values used in Parts C/D. What can you conclude after applying this revised significance threshold?
  • Part F: Briefly describe the benefits and downsides of using the Bonferroni correction in an application like this one. That is, why might researchers consider using this correction when reporting their results? Why might they not want to use the correction?
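
The calculations behind Question #2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a required method. The function names are hypothetical. The exact binomial tail is used as a stand-in for the simulated randomization distribution, which it approximates. The inputs (7 of 10 increases, 3 tests) are arbitrary values chosen so that the lab's answers are not given away.

```python
from math import comb


def upper_tail_p_value(successes, n, p0=0.5):
    """P(X >= successes) when X ~ Binomial(n, p0), i.e. under the null."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


def familywise_error_rate(alpha, num_tests):
    """P(at least one Type I error) across independent tests: 1 - (1 - alpha)^k."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / num_tests


# Arbitrary illustration, not the lab's numbers.
print(f"P(X >= 7 of 10 | p0 = 0.5) = {upper_tail_p_value(7, 10):.4f}")
print(f"family-wise error rate, 3 tests = {familywise_error_rate(0.05, 3):.4f}")
print(f"Bonferroni threshold, 3 tests = {bonferroni_threshold(0.05, 3):.4f}")
```

Under the Bonferroni correction, each individual \(p\)-value is compared against the smaller per-test threshold rather than against \(\alpha\) itself.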