Goals:

The purpose of this lab is to provide practice applying one-way ANOVA to test for a difference in means across multiple groups.

Directions:

  • You are expected to progress through the analyses described in this document as a group, recording your answers in a shared document. It’s completely up to your group how you’d like to organize this: some groups like using a shared Google Doc, while others might designate one person to be the group’s recorder.
  • You are expected to work together; any attempt to “divide and conquer” the lab questions may result in point deductions on your group’s lab score.
  • Labs are graded primarily for completion, and we will get together as a group for the last 10-15 minutes of class to discuss some of the lab questions. This means you should focus on learning the material (while also helping the teammates in your group) rather than treating labs as an assessment (like homework or exams).
  • Please upload your responses to the Lab’s questions on Canvas. The expectation is that everyone uploads their own copy (they can be identical within your group).
  • Use the Snipping Tool on Windows or take a Mac screenshot to add screenshots to your lab write-up as requested.

\(~\)

Study #1 - Diet and Weight Loss

Researchers at the University of Sheffield conducted a randomized experiment comparing three diet regimes. Participants interested in losing weight were randomly assigned to one of three diets, with their weight (in kg) being measured at the start of the study, and then again 6 weeks later. These data are available at the following link:

Data: https://remiller1450.github.io/data/diet.csv

For our analyses we will consider the following variables:

  • Diet - which of the three diets the subject was assigned
  • preWeight - the weight (in kg) of the subject at the start of the study
  • postWeight - the weight (in kg) of the subject after 6 weeks of dieting
  • weightChange - the weight change (pre - post) of the subject during the study period
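
Because weightChange is computed as pre minus post, a positive value means the subject lost weight and a negative value means the subject gained weight. The short R sketch below illustrates this sign convention; the three weights are made up for illustration and are not taken from the study data.

```r
# Hypothetical weights in kg, for illustration only (not from diet.csv)
preWeight  <- c(80.0, 72.5, 65.0)
postWeight <- c(76.0, 72.5, 66.2)

# weightChange = pre - post, so positive values indicate weight lost
weightChange <- preWeight - postWeight
weightChange
```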

\(~\)

Question #1: Use StatKey to create side-by-side boxplots or histograms of the variable “weightChange” for each diet group. Based upon what you see, do the assumptions of ANOVA (equal variances and Normally distributed errors) appear to be satisfied?

Question #2: Use the “ANOVA for Differences in Means” menu of StatKey to find the ANOVA table that summarizes the between group and within group variability in the variable “weightChange”. Then, use the F-value and degrees of freedom from this table to find the \(p\)-value. Based upon this \(p\)-value, what do you conclude regarding the efficacy of these three diets (relative to each other)?

Question #3: The boxplots you created in Question #1 suggest Diet #3 is the most effective and Diet #2 is the least effective. The ANOVA test from Question #2 provides evidence that the group means are not all equal, with the gap between these two groups being the largest. However, other pairs of diets might also differ significantly. For this question, perform a two-sample t-test to determine if Diet #3 is significantly better than Diet #1.

Question #4: Use StatKey to create side-by-side boxplots of the variable “preWeight” for each diet group. Based upon what you see, are the conditions needed to reliably use ANOVA satisfied?

Question #5: Use the “ANOVA for Differences in Means” menu of StatKey to find the ANOVA table that summarizes the between group and within group variability in the variable “preWeight”. Then, use the F-value and degrees of freedom from this table to find the \(p\)-value. Based upon this \(p\)-value, what do you conclude regarding average “preWeight” in each diet group?

Question #6: Considering the design of this study, are the results you found in Question #5 surprising? Briefly explain.

\(~\)

Study #2 - Critical Flicker Frequency

An individual’s critical flicker frequency is the highest frequency at which a flickering light source can be detected. At frequencies above the critical frequency, the light source appears to be continuous even though it is actually flickering. The data below come from a study titled “The effect of iris color on critical flicker frequency” published in the Journal of General Psychology. The study recorded the critical flicker frequency and iris color (the iris is the colored part of the eye) for \(n = 19\) subjects:

In this application you’ll rely upon output from R, a statistical software environment widely used by professional statisticians. Shown below is the statistical output of an ANOVA test performed on these data:

summary(aov(Flicker ~ Color, data = flick))
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Color        2  23.00  11.499   4.802 0.0232 *
## Residuals   16  38.31   2.394                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
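
The columns of this table are linked by simple arithmetic: each mean square is its sum of squares divided by its degrees of freedom, the F value is the ratio of the two mean squares, and Pr(>F) is the upper-tail area of an F-distribution with 2 and 16 degrees of freedom. The R sketch below recomputes these quantities from the rounded values printed in the table. The variable names are illustrative, and small discrepancies from the printed output are expected because the inputs are rounded.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_color <- 23.00   # Sum Sq for Color (between groups)
df_color <- 2       # Df for Color: 3 groups - 1
ss_resid <- 38.31   # Sum Sq for Residuals (within groups)
df_resid <- 16      # Df for Residuals: 19 subjects - 3 groups

# Mean Sq = Sum Sq / Df
ms_color <- ss_color / df_color
ms_resid <- ss_resid / df_resid

# F value = between-group mean square / within-group mean square
f_value <- ms_color / ms_resid

# Pr(>F): upper-tail area of the F-distribution at the observed F value
p_value <- pf(f_value, df_color, df_resid, lower.tail = FALSE)

round(c(ms_color = ms_color, ms_resid = ms_resid,
        F = f_value, p = p_value), 4)
```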

Question #7: Using the boxplots provided above, do the conditions required to use one-way ANOVA appear to be met by these data? Briefly comment upon each condition.

Question #8: Consider the R output provided above. Had we used StatKey to generate a similar ANOVA table, what label would it have used instead of “Color”? What label would it have used instead of “Residuals”?

Question #9: Use StatKey to confirm that the \(F\)-value and degrees of freedom provided in the R output correspond with a \(p\)-value of 0.023. Include a screenshot of the \(F\)-distribution from StatKey in your lab write-up as proof.

Question #10: Briefly interpret the \(p\)-value provided by the R output in the context of critical flicker frequencies.

\(~\)

Tukey’s Honest Significant Difference Testing

A popular follow-up to a significant ANOVA test is a pairwise testing approach known as Tukey’s Honest Significant Differences test. This procedure is similar to a series of two-sample T-tests; however, it adjusts for the fact that multiple hypothesis tests are being performed within the context of a single analysis. Shown below is R output of this test when applied to the ANOVA results from the critical flicker frequencies data:

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Flicker ~ Color, data = flick)
## 
## $Color
##                  diff        lwr       upr     p adj
## Brown-Blue  -2.579167 -4.7354973 -0.422836 0.0183579
## Green-Blue  -1.246667 -3.6643959  1.171063 0.3994319
## Green-Brown  1.332500 -0.9437168  3.608717 0.3124225
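
To see why an adjustment is needed, and how the columns of this table fit together, the R sketch below does two things. First, it computes the chance of at least one false positive across three tests each run at the 0.05 level, under the simplifying assumption that the tests are independent (pairwise comparisons share data, so this is only an illustration). Second, it re-enters the printed Tukey output by hand to check that `diff` is the midpoint of each interval and that an interval excludes zero exactly when `p adj` is below 0.05. The data frame name `tukey` and its column names are illustrative.

```r
# Family-wise error rate for 3 tests at alpha = 0.05,
# ASSUMING independent tests (an illustrative simplification)
alpha   <- 0.05
n_tests <- choose(3, 2)             # 3 pairwise comparisons among 3 groups
fwer    <- 1 - (1 - alpha)^n_tests  # about 0.14, well above 0.05

# Tukey output copied by hand from the table above
tukey <- data.frame(
  pair  = c("Brown-Blue", "Green-Blue", "Green-Brown"),
  diff  = c(-2.579167, -1.246667, 1.332500),
  lwr   = c(-4.7354973, -3.6643959, -0.9437168),
  upr   = c(-0.422836, 1.171063, 3.608717),
  p_adj = c(0.0183579, 0.3994319, 0.3124225)
)

# diff sits at the midpoint of each family-wise 95% interval
tukey$midpoint <- (tukey$lwr + tukey$upr) / 2

# An interval that excludes 0 corresponds to p_adj < 0.05
tukey$excludes_zero <- tukey$lwr > 0 | tukey$upr < 0
tukey$significant   <- tukey$p_adj < alpha

print(fwer)
print(tukey)
```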

Question #11: The mean for the 6 cases in the “Blue” group was 28.167 (standard deviation of 1.528), while the mean for the 5 cases in the “Green” group was 26.92 (standard deviation of 1.843). Use this information to perform a two-sample \(T\)-test. Then, briefly explain why the \(p\)-value for Tukey’s Honest Significant Differences test is higher than the one resulting from this two-sample \(T\)-test.