Before Fall break, we learned about basic probability models, computational methods like bootstrapping and Monte Carlo simulation, and confidence intervals as tools for understanding uncertainty. Today we’ll cover what is arguably the most well-known statistical tool: hypothesis testing.
The basic idea behind hypothesis testing is as follows:
\(~\)
In an article published in the journal Nature, Hamlin, Wynn, and Bloom (2007) explored the capacity of infants to judge pro-social behavior.
In one part of their study, infants were repeatedly shown puppet shows where a “climber” character struggled to reach the top of a hill. There were two variations of the show:
Infants watched each variation of the show several times in an alternating order, giving them the opportunity to learn the behavior of each character (which could be identified across repetitions by its color and shape).
Next, they were given the opportunity to choose a character to play with:
The researchers recorded the choices of 16 infants who participated in the study. They sought to evaluate whether the infants were inclined to select the “helper” character over the “hinderer” character. They found that 14 of the 16 infants selected the “helper” character.
\(~\)
Question #1: The first step in hypothesis testing is setting up a null hypothesis that the researchers hope to falsify using their data. In words, what would this hypothesis be for the study/data described in this section?
Question #2: Briefly describe one way that you could generate/simulate data that are consistent with the null hypothesis you provided in Question #1.
Question #3: On this StatKey simulation app, click on “Edit Data” and enter the observed count and the sample size of this study. Next, verify the null hypothesis is \(p=0.5\) and generate 10 simulated outcomes. What does each dot shown in the app’s main panel represent? Briefly explain.
Question #4: On the same page used in Question #3, click “Reset Plot” then generate 2,000 simulated outcomes. You should use the same data and null hypothesis as Question #3. Now, considering the steps and philosophy of hypothesis testing, briefly explain why this distribution of simulated outcomes is useful.
Question #5: Under the model used in Questions #3 and #4, calculate the probability of an outcome at least as extreme as the one observed in the actual study. This probability is known as the p-value.
Question #6: Based upon the result you found in Question #5, decide whether you believe the results of this study are sufficient to falsify the null hypothesis.
Question #7: Assume you decided in Question #6 that the observed data provided sufficient evidence to falsify the null hypothesis. Does this mean that infants prefer pro-social behavior? Or are there other explanations that you must still rule out?
Question #8: In the actual study, the researchers randomly assigned the color and shape of the “helper” and “hinderer” characters for each infant. That is, for some infants the helper was a red triangle, while for others it was a blue square, or yellow triangle, etc. Why do you think the researchers did this? Briefly explain.
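For readers who want to see the logic of Questions #3 through #5 outside of the app, the following is a minimal Python sketch of the same kind of simulation. It is not StatKey's code. It assumes each infant's choice is an independent coin flip with \(p=0.5\) under the null hypothesis, and it treats "at least as extreme" as 14 or more helper choices (a one-sided tail), which is a choice you may want to revisit. The function names, the seed, and the default arguments are all hypothetical and chosen only for illustration.

```python
import random
from math import comb


def simulate_null_counts(n_infants=16, p_null=0.5, n_sims=2000, seed=0):
    """Simulate the number of 'helper' choices in n_sims hypothetical studies.

    Assumption: under the null hypothesis, each infant independently picks
    the helper with probability p_null.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        sum(rng.random() < p_null for _ in range(n_infants))
        for _ in range(n_sims)
    ]


def simulated_p_value(counts, observed):
    """Proportion of simulated counts at least as large as the observed count."""
    return sum(c >= observed for c in counts) / len(counts)


def exact_upper_tail(n, observed, p_null=0.5):
    """Exact binomial probability of `observed` or more successes out of n."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed, n + 1)
    )


# Each entry of `counts` plays the role of one dot in the app's main panel.
counts = simulate_null_counts()
print("simulated one-sided p-value:", simulated_p_value(counts, observed=14))
print("exact one-sided p-value:", exact_upper_tail(16, 14))
```

The simulated value will vary with the seed and the number of simulations, while the exact binomial calculation does not. Because the null distribution is symmetric when \(p=0.5\), a two-sided version of the exact p-value is twice the one-sided value.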