This lab is intended to provide practice applying one-sample hypothesis testing methods to real data.
Directions (Please read before starting)
\(~\)
This lab is intended to provide hands-on practice analyzing one-sample data. In doing so, you’ll need to interpret research questions, graphically explore the data, choose and execute appropriate hypothesis tests, and assess study design when reporting your findings.
As review, we’ve covered the following hypothesis tests for one-sample categorical data:
binom.test (the exact binomial test)
prop.test (the one-sample \(Z\)-test)
Generally speaking, the exact binomial test should be preferred. The \(Z\)-test is mostly a historical artifact from when the exact binomial test was too computationally demanding for large samples.
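As a reminder of the syntax, here is a minimal sketch of both tests. The counts (14 successes in 20 trials) and the null proportion of 0.5 are hypothetical values chosen for illustration and do not come from any data set in this lab.

```r
# Hypothetical counts (illustration only): 14 "successes" in 20 trials,
# tested against a null proportion of 0.5
x <- 14
n <- 20

# Exact binomial test
exact <- binom.test(x, n, p = 0.5, alternative = "two.sided")

# One-sample Z-test; prop.test() reports the equivalent chi-squared statistic,
# and correct = FALSE turns off the continuity correction
ztest <- prop.test(x, n, p = 0.5, alternative = "two.sided", correct = FALSE)

exact$p.value
ztest$p.value
```

With a sample this small the two \(p\)-values can differ noticeably, which is one reason to favor the exact test.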
We’ve also covered the following tests for one-sample quantitative data:
t.test (the one-sample \(t\)-test)
wilcox.test (the Wilcoxon signed rank test)
Generally speaking, the \(t\)-test is more powerful and should be preferred over the signed rank test if conditions allow for it.
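The sketch below shows how both tests might be run against the same null value. The sample `y` is simulated and the null value of 10 is arbitrary; neither is related to the data sets used in this lab.

```r
# Hypothetical sample (illustration only, not one of the lab's data sets)
set.seed(1)
y <- rnorm(25, mean = 10.5, sd = 2)

# One-sample t-test of H0: mu = 10
t_res <- t.test(y, mu = 10, alternative = "two.sided")

# Wilcoxon signed rank test using the same null value
w_res <- wilcox.test(y, mu = 10, alternative = "two.sided")

t_res$p.value
w_res$p.value
```

Both functions take the null value through the `mu` argument, so switching between them only requires changing the function name.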
\(~\)
A waiter at a national chain restaurant located in a suburban shopping mall in the early 1990s recorded data on the tables they served during a three-month period in hopes of demonstrating to their boss that they were being tipped significantly less than 20%.
tips = read.csv("https://remiller1450.github.io/data/Tips.csv")
tips$TipPercent = tips$Tip/tips$TotBill ## Create tip percentage (stored as a proportion, so 0.20 means 20%)
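Before choosing a test for a quantitative variable, it usually helps to look at its shape. The sketch below uses a simulated vector `v` as a stand-in (an assumption for illustration, not the tips data); the same calls could be applied to any numeric column.

```r
# Hypothetical right-skewed proportions standing in for a quantitative variable
set.seed(2)
v <- rbeta(100, shape1 = 4, shape2 = 20)

summary(v)   # five-number summary plus the mean
sd(v)        # sample standard deviation

# Histogram: overall shape, skew, and possible outliers
hist(v, main = "Distribution of v", xlab = "v")

# Normal quantile-quantile plot: points far from the line suggest non-Normality
qqnorm(v)
qqline(v)
```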
\(~\)
Question #1: Identify whether the variable of interest is quantitative or categorical and propose a suitable null hypothesis pertaining to that variable (you should be able to form your null hypothesis before looking at the sample data).
Question #2: Use a graph or descriptive statistics to explore the distribution of the variable of interest. Based upon your exploration, which hypothesis test should be used? Briefly justify your choice of test.
Question #3: If necessary, revise your null hypothesis from Question #1, then perform the hypothesis test you identified in Question #2. Be sure to clearly state your hypotheses and report your \(p\)-value along with a conclusion in the context of the research question.
Question #4: What would need to be true regarding the design of this study (the data collection in particular) for the results of the hypothesis test you conducted in Question #3 to be considered compelling evidence by the waiter’s boss?
\(~\)
In an investigation into whether consumption of oatbran cereal might lower LDL cholesterol, researchers randomly assigned 14 adult males with high cholesterol into one of two groups:
The researchers were interested in an individual’s LDL cholesterol (in mmol/L) when they were on the oatbran diet relative to when they were on the cornflakes diet. This outcome is recorded as the variable “difference” such that a positive value of “difference” indicates a reduction in LDL cholesterol when following the oatbran diet.
oats = read.csv("https://remiller1450.github.io/data/Oatbran.csv")
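Because each participant contributes a single within-subject difference, this kind of paired design reduces to one-sample data. The sketch below illustrates the idea with made-up measurements for 6 hypothetical subjects; the vectors `diet_a` and `diet_b` are assumptions for illustration and are not columns of the Oatbran data.

```r
# Hypothetical paired measurements for 6 subjects (not the Oatbran data)
diet_a <- c(4.6, 5.1, 3.9, 4.8, 5.5, 4.2)
diet_b <- c(4.1, 4.9, 3.8, 4.1, 5.0, 4.3)

# Within-subject differences; positive values mean diet_b was lower
d <- diet_a - diet_b

# A one-sample t-test on the differences ...
one_sample <- t.test(d, mu = 0)

# ... matches a paired t-test on the two original vectors
paired <- t.test(diet_a, diet_b, paired = TRUE)

all.equal(one_sample$p.value, paired$p.value)
```

This is why a data set that stores only the differences can still be analyzed with the one-sample methods reviewed earlier.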
\(~\)
Question #5: Identify whether the variable of interest is quantitative or categorical and propose a null hypothesis pertaining to that variable (you should be able to do this before looking at the sample data).
Question #6: Use graphs or descriptive statistics to explore the distribution of the variable of interest. Based upon your exploration, which hypothesis test should be used? Briefly justify your choice of test.
Question #7: If necessary, revise your null hypothesis from Question #5, then perform the test you identified in Question #6. Be sure to clearly state your hypotheses and report your \(p\)-value along with a conclusion in the context of the research question.
Question #8: Are the results of the hypothesis test you performed in Question #7 compelling evidence that oatbran causes a reduction in cholesterol? Briefly explain.
\(~\)
In recent years, many states have moved to legalize recreational marijuana; however, the impact of these policies on traffic safety is unclear. Obviously, it is unethical to conduct potentially dangerous on-road experiments using intoxicated drivers. Fortunately, such studies can be done using advanced driving simulators.
One study conducted using the NADS-1 driving simulator involved 19 volunteer participants who each engaged in 45-minute simulated drives under 6 different dosing conditions:
These conditions were randomly assigned across each participant’s 6 visits, which were separated by washout periods of at least 10 days. In the alcohol dosing condition, the participant drank either fruit juice mixed with 90% grain alcohol until their breath alcohol concentration (BrAC) was 5%, or a placebo drink consisting of fruit juice with an alcohol-swabbed rim for a comparable amount of time. The cannabis dosing condition occurred after alcohol dosing, and had the participant inhale placebo, low THC, or high THC cannabis ad libitum (at their own pace) from a vaporizer for 10 minutes.
One research question that the study investigated was whether a participant’s lateral control (lane-keeping ability) worsened when they were intoxicated. This was assessed using a measure known as standard deviation of lane position (SDLP), which roughly corresponds to the average distance of a participant from their average lane position. Higher SDLP values indicate more variability in lane position and thus correspond to worse lateral control (more side-to-side movement within the lane).
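To make the definition concrete, the sketch below computes SDLP for a short, made-up series of lane positions and compares it with the average absolute distance from the mean position. The values in `lane_pos` are hypothetical and are not taken from the study.

```r
# Hypothetical lane positions (cm from lane center) sampled during a drive
lane_pos <- c(-12, 5, 20, -8, 15, -25, 3, 10, -18, 10)

# SDLP is the sample standard deviation of these positions
sdlp <- sd(lane_pos)

# For comparison: the average absolute distance from the mean lane position
avg_abs_dist <- mean(abs(lane_pos - mean(lane_pos)))

sdlp
avg_abs_dist
```

The two quantities are similar in size but not identical, which is why the description above says SDLP only roughly corresponds to an average distance.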
The table below displays SDLP (in cm) for each subject under the Placebo-Placebo condition during an interstate segment of the drive. The additional columns of the table display whether a participant’s SDLP increased for the other dosing conditions (relative to the control condition of Placebo-Placebo). A value of “1” indicates higher SDLP compared to their Placebo-Placebo drive.
Participant | Baseline SDLP | Placebo/Low THC | Placebo/High THC | Alcohol/Placebo | Alcohol/Low THC | Alcohol/High THC |
---|---|---|---|---|---|---|
1 | 14.2 | 0 | 0 | 1 | 1 | 1 |
2 | 17.9 | 0 | 1 | 1 | 1 | 0 |
3 | 19.9 | 1 | 0 | 1 | 1 | 0 |
4 | 11.4 | 0 | 0 | 1 | 1 | 1 |
5 | 18.3 | 1 | 1 | 0 | 0 | 1 |
6 | 18.5 | 0 | 1 | 1 | 1 | 0 |
7 | 15.8 | 1 | 1 | 1 | 1 | 1 |
8 | 15.9 | 1 | 1 | 1 | 1 | 0 |
9 | 15.8 | 0 | 0 | 1 | 1 | 0 |
10 | 15 | 1 | 1 | 0 | 1 | 0 |
11 | 16 | 1 | 1 | 1 | 1 | 1 |
12 | 14.7 | 0 | 1 | 1 | 1 | 1 |
13 | 15.3 | 1 | 1 | 1 | 1 | 1 |
14 | 17.4 | 1 | 1 | 0 | 1 | 1 |
15 | 19.6 | 0 | 0 | 1 | 1 | 1 |
16 | 16.9 | 1 | 1 | 0 | 0 | 1 |
17 | 15.9 | 1 | 0 | 1 | 1 | 0 |
18 | 14.9 | 1 | 1 | 1 | 0 | 1 |
19 | 15.1 | 1 | 1 | 1 | 1 | 1 |
Total | | 12 | 13 | 15 | 16 | 12 |
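One possible way to bring the table's totals into R is as a named vector, sketched below. The counts are copied from the "Total" row and the sample size of 19 comes from the study description; the object names `n` and `increased` are arbitrary choices.

```r
# Totals copied from the "Total" row of the table (out of 19 participants)
n <- 19
increased <- c("Placebo/Low THC"  = 12,
               "Placebo/High THC" = 13,
               "Alcohol/Placebo"  = 15,
               "Alcohol/Low THC"  = 16,
               "Alcohol/High THC" = 12)

# Sample proportion of participants whose SDLP increased under each condition
round(increased / n, 3)
```

Storing the counts this way makes it easy to reuse them when running the same test on several dosing conditions.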
Question #9: In this question you’ll focus on the Alcohol/High THC dosing condition. If this dosing condition had no impact on driving performance, you’d expect it to be equally likely for an individual to have higher or lower SDLP on the dosed drive relative to the placebo drive. With that in mind, identify the outcome of interest, determine a suitable null hypothesis, and then perform that hypothesis test using the sample data.
Question #10: Repeat the hypothesis test you performed in Question #9 for each of the remaining dosing conditions and record the \(p\)-value of each test. You do not need to write out a separate conclusion for each test, instead you can just copy-paste code and organize the \(p\)-values.
Question #11: Describe (in words) what a Type I error and a Type II error would entail in this application.
Question #12: Suppose a decision threshold of \(\alpha = 0.05\) is applied to all of the hypothesis tests you performed (across Questions #9 and #10). If these tests are independent, and the null hypothesis is correct for every test, what is the probability of making at least one Type I error?
Question #13: Use the Bonferroni correction to achieve a family-wise Type I error rate of 5% across the entire set of hypothesis tests you’ve performed. What do you conclude after applying this revised significance threshold? Note that you do not need to repeat any hypothesis tests, instead you can simply report your updated conclusions (in terms of which dosing conditions you conclude are significantly different from baseline).
Question #14: Briefly describe the benefits and downsides of using the Bonferroni correction in an application like this one. That is, why might researchers consider using this correction when reporting their results? Why might they not want to use the correction?
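For reference, the general calculations behind the family-wise error rate and the Bonferroni correction can be sketched as follows. The number of tests (`m = 10`) and the \(p\)-values in `p_vals` are made up for illustration and deliberately do not match the tests in this lab.

```r
# Hypothetical setup: m = 10 independent tests, each at alpha = 0.05,
# with every null hypothesis true (m is NOT the number of tests in this lab)
m <- 10
alpha <- 0.05

# P(at least one Type I error) = 1 - P(no Type I errors in any test)
fwer <- 1 - (1 - alpha)^m

# Made-up p-values, one per test
p_vals <- c(0.001, 0.004, 0.012, 0.03, 0.049, 0.08, 0.2, 0.35, 0.6, 0.9)

# Bonferroni: compare each p-value to alpha / m ...
reject_threshold <- p_vals < alpha / m

# ... or, equivalently, compare Bonferroni-adjusted p-values to alpha
reject_adjusted <- p.adjust(p_vals, method = "bonferroni") < alpha

fwer
identical(reject_threshold, reject_adjusted)
```

Either form of the correction leads to the same decisions; the adjusted \(p\)-values are simply the originals multiplied by `m` and capped at 1.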