This lab provides a brief overview of the R functions used to perform the statistical tests commonly covered in an introductory statistics course.

If you are unfamiliar with the concept of hypothesis testing, I encourage you to read through my course notes on the topic.

Directions (Please read before starting)

  1. Please work together with your assigned partner. Make sure you both fully understand each concept before you move on.
  2. Please record your answers and any related code for all embedded lab questions. I encourage you to try out the embedded examples, but you do not need to turn them in.
  3. Please ask for help, clarification, or even just a check-in if anything seems unclear.

\(~\)

One-sample Tests

One-sample tests are used to assess whether a summary statistic observed in the sample data is statistically different from a hypothesized value.

For categorical data, a common null hypothesis is \(H_0: p = p_0\), where \(p_0\) is a hypothesized proportion for the categorical outcome of interest. This hypothesis can be evaluated using a one-sample Z-test:

acs <- read.csv("https://remiller1450.github.io/data/EmployedACS.csv")  ## Random sample of 1287 employed individuals from the American Community Survey

n_male <- sum(acs$Sex == 1)  ## Number of males among respondents
prop.test(x = n_male, n = nrow(acs), p = 0.5, alternative = "two.sided")
## 
##  1-sample proportions test with continuity correction
## 
## data:  n_male out of nrow(acs), null probability 0.5
## X-squared = 3.1826, df = 1, p-value = 0.07443
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.4975476 0.5528050
## sample estimates:
##         p 
## 0.5252525

or an exact binomial test:

binom.test(x = n_male, n = nrow(acs), p = 0.5, alternative = "two.sided")
## 
##  Exact binomial test
## 
## data:  n_male and nrow(acs)
## number of successes = 676, number of trials = 1287, p-value = 0.07439
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.4975508 0.5528386
## sample estimates:
## probability of success 
##              0.5252525

For either test, you must provide the numerator and denominator of the sample proportion (i.e., the count of males and the total sample size), as well as the hypothesized population proportion, “p”.
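
To see what these functions are doing, here is a minimal sketch of the Z-statistic behind the one-sample Z-test, calculated by hand (without the continuity correction that prop.test applies by default, so the p-value will differ slightly from the output above):

p_hat <- n_male / nrow(acs)  ## sample proportion
z <- (p_hat - 0.5) / sqrt(0.5 * (1 - 0.5) / nrow(acs))  ## standardized test statistic under H0: p = 0.5
2 * pnorm(-abs(z))  ## two-sided p-value from the standard normal distribution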

For quantitative data, the one-sample \(t\)-test should be used to assess the hypothesis: \(H_0: \mu = \mu_0\), where \(\mu_0\) is a hypothesized mean.

t.test(x = acs$Income, mu = 40, alternative = "two.sided")
## 
##  One Sample t-test
## 
## data:  acs$Income
## t = 2.9449, df = 1286, p-value = 0.003289
## alternative hypothesis: true mean is not equal to 40
## 95 percent confidence interval:
##  41.50877 47.53074
## sample estimates:
## mean of x 
##  44.51976

In this example, the quantitative variable is given as the “x” argument, and the hypothesized mean is given as “mu”.
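
The “alternative” argument also accepts one-sided alternatives. As a sketch, the same data could be used to test \(H_0: \mu = 40\) against the one-sided alternative \(H_A: \mu > 40\):

t.test(x = acs$Income, mu = 40, alternative = "greater")  ## one-sided version of the test above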

Two-sample Tests

Two-sample tests are used to assess whether a summary statistic differs significantly between two sample groups. A common example is an A/B test, where experimental participants are randomly assigned to one of two conditions (A or B) and an outcome is recorded.

For categorical data, you might use a difference in proportions Z-test:

## First you'll need the numerator and denominator of each sample's proportion
ins_white <- sum(acs$HealthInsurance == 1 & acs$Race == "white")
n_white <- sum(acs$Race == "white")
ins_black <- sum(acs$HealthInsurance == 1 & acs$Race == "black")
n_black <- sum(acs$Race == "black")

## Z-test
prop.test(x = c(ins_white, ins_black), n = c(n_white, n_black))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(ins_white, ins_black) out of c(n_white, n_black)
## X-squared = 0.14491, df = 1, p-value = 0.7034
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.04383785  0.07295584
## sample estimates:
##    prop 1    prop 2 
## 0.9283521 0.9137931

For quantitative data, you should use a two-sample \(t\)-test:

t.test(Income ~ Sex, data = acs)
## 
##  Welch Two Sample t-test
## 
## data:  Income by Sex
## t = -4.921, df = 1231.2, p-value = 9.776e-07
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -20.650616  -8.878211
## sample estimates:
## mean in group 0 mean in group 1 
##        36.76471        51.52913

The syntax provided to t.test uses formula notation. The formula Income ~ Sex indicates that the quantitative outcome, “Income”, should be compared across the two groups defined by the variable “Sex”.
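
The output above is labeled “Welch Two Sample t-test” because t.test does not assume equal variances by default. As a brief sketch of two variations, both using arguments built into t.test:

## Equivalent comparison using two separate vectors instead of a formula
t.test(x = acs$Income[acs$Sex == 0], y = acs$Income[acs$Sex == 1])

## Pooled two-sample t-test, if you are willing to assume equal variances
t.test(Income ~ Sex, data = acs, var.equal = TRUE)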

Chi-Squared Tests

Sometimes it is too simplistic to reduce a nominal categorical variable (many categories) into a binary variable (two categories) in order to use one of the aforementioned statistical tests. In these circumstances, you can consider a Chi-squared Test (either goodness of fit or association) or Fisher’s Exact Test (association):

## Goodness of fit Chi-squared test
## Note: the hypothesized proportions "p" must be listed in the same order as the categories in table(acs$Race)
chisq.test(x = table(acs$Race), p = c(0.05, 0.15, 0.1, 0.7))
## 
##  Chi-squared test for given probabilities
## 
## data:  table(acs$Race)
## X-squared = 54.6, df = 3, p-value = 8.356e-12
## Chi-squared test of association
chisq.test(x = table(acs$Race, acs$HealthInsurance))
## 
##  Pearson's Chi-squared test
## 
## data:  table(acs$Race, acs$HealthInsurance)
## X-squared = 25.378, df = 3, p-value = 1.287e-05
## Fisher's exact test
fisher.test(x = table(acs$Race, acs$HealthInsurance))
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table(acs$Race, acs$HealthInsurance)
## p-value = 0.0001573
## alternative hypothesis: two.sided
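
The Chi-squared approximation can be unreliable when expected cell counts are small (a common rule of thumb is at least 5 per cell), which is one reason to consider Fisher’s Exact Test. A quick sketch for inspecting the expected counts and, if needed, requesting a simulation-based p-value, both built-in features of chisq.test:

## Inspect the expected cell counts behind the test of association
chisq.test(x = table(acs$Race, acs$HealthInsurance))$expected

## Simulation-based p-value, an option when expected counts are small
chisq.test(x = table(acs$Race, acs$HealthInsurance), simulate.p.value = TRUE)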

ANOVA

Similarly, some studies will naturally require you to compare a quantitative outcome across more than two groups, in which case you should use one-way ANOVA:

anova_mod <- aov(Income ~ Race, data = acs)
summary(anova_mod)
##               Df  Sum Sq Mean Sq F value   Pr(>F)    
## Race           3   56523   18841   6.291 0.000309 ***
## Residuals   1283 3842204    2995                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Post-hoc Testing

Recall that a statistically significant ANOVA result should be followed up with some form of post-hoc testing, for example Tukey’s Honest Significant Differences:

TukeyHSD(anova_mod)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Income ~ Race, data = acs)
## 
## $Race
##                   diff        lwr        upr     p adj
## black-asian -26.750588 -46.403386 -7.0977905 0.0026929
## other-asian -30.045124 -50.285682 -9.8045650 0.0008099
## white-asian -15.697207 -31.049147 -0.3452658 0.0428292
## other-black  -3.294535 -22.402516 15.8134456 0.9708621
## white-black  11.053382  -2.771118 24.8778819 0.1680528
## white-other  14.347917  -0.300105 28.9959390 0.0573859
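
Tukey’s HSD is not the only option. One common alternative, sketched below, is pairwise \(t\)-tests with a multiple-comparison adjustment via the built-in pairwise.t.test function (the Bonferroni adjustment is just one of several methods it supports):

## Pairwise t-tests with Bonferroni-adjusted p-values
pairwise.t.test(x = acs$Income, g = acs$Race, p.adjust.method = "bonferroni")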

Correlation

The final combination of variables yet to be considered in this tutorial is two quantitative variables.

In this situation, the correlation coefficient can form the basis of a hypothesis test of \(H_0: \rho = 0\), or no correlation between the variables being studied:

cor.test(x = acs$HoursWk, y = acs$Income, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  acs$HoursWk and acs$Income
## t = 12.893, df = 1285, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2891509 0.3859513
## sample estimates:
##       cor 
## 0.3384462
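
Pearson’s correlation measures linear association. If the relationship appears monotone but non-linear, or outliers are a concern, a rank-based alternative is available through the “method” argument (a sketch; with tied values R will approximate the p-value and may print a warning):

## Rank-based (Spearman) correlation test
cor.test(x = acs$HoursWk, y = acs$Income, method = "spearman")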

Closing Remarks

This tutorial is intended to provide a quick reference to several functions used to perform common statistical tests.

It does not cover:

  • the assumptions of the aforementioned hypothesis tests; you should always check that your data are appropriate for the statistical test you are using.
  • the interpretation of hypothesis testing results, effect sizes, or confidence intervals; all of these topics are critical to an accurate assessment of the information your data provide.

\(~\)

Practice

For each scenario, write out a reasonable null hypothesis and evaluate it using the proper statistical test.

Question #1: The “infant heart” data set documents the results of an experiment investigating two developmental indices, PDI and MDI, recorded after random assignment to one of two surgical approaches: low-flow bypass and circulatory arrest.

ih <- read.csv("https://remiller1450.github.io/data/InfantHeart.csv")

Part A: Use a statistical test to determine if there is sufficient statistical evidence to conclude that one of the two surgeries yields significantly greater PDI outcomes (indicating better physical development).

Part B: Use a statistical test to determine if there is sufficient statistical evidence to conclude that an infant’s PDI and MDI scores are related to each other.

Part C: Use a statistical test to determine if there is sufficient statistical evidence that a larger share of male infants were assigned to the low-flow bypass group than to the circulatory arrest group.

\(~\)

Question #2: The “commute tracker” data set is a sample of daily commutes tracked by a GPS app for a worker in the Greater Toronto Area.

ct <- read.csv("https://remiller1450.github.io/data/CommuteTracker.csv")

Part A: Use a statistical test to determine if there is sufficient statistical evidence to conclude that the commuter is more likely to take Hwy 407 on certain days of the week.

Part B: Use a statistical test to determine if there is sufficient statistical evidence to conclude that average value of “MaxSpeed” differs by month. If it does, conduct an appropriate follow-up test.

Part C: Use a statistical test to determine if there is sufficient statistical evidence to conclude that the commuter is more likely to not record a commute whose destination is “Home” (as opposed to one whose destination is “GSK”, the commuter’s place of employment).