This lab introduces R and R Studio as well as a few procedures we’ll use in future class sessions.

Directions (read before starting)

  1. Please work together with your assigned partner(s). Make sure you all fully understand each topic or example before moving on.
  2. Record your answers to lab questions separately from the lab’s examples. Everyone should have their own copy of your group’s responses, and each individual should turn in this copy, even if it’s identical to that of your lab partners.
  3. Ask for help, clarification, or even just a check-in if anything seems unclear.

\(~\)

Onboarding

In our first lab you wrote your code in an R Script; however, R Studio supports several other file types, including R Markdown, a framework that allows for R code, its output, and markdown text to seamlessly coexist in the same document.

If you recently installed R Studio it should come with R Markdown already available. You can check this by navigating:

File -> New File -> R Markdown

If you do not see “R Markdown” displayed in this menu you’ll need to install the rmarkdown package:

# install.packages("rmarkdown")
library(rmarkdown)
## Warning: package 'rmarkdown' was built under R version 4.3.3

\(~\)

Components of an .Rmd file

At the top of an R Markdown document is the header:

  • The header is initiated by \(\text{---}\) and closed by \(\text{---}\)
  • Here you can provide title text, authors, and other information that will appear at the top of the document created by your R Markdown file

After the end of the header you’ll see a code chunk:

  • Code chunks are initiated by \(\text{```\{r\}}\) and closed by \(\text{```}\)
  • The first code chunk in most documents is used to set up options for the remainder of the document. In fact, the text “setup” that you see in \(\text{```\{r setup\}}\) is giving this chunk the name “setup”. You should keep this chunk as it appears and use other code chunks to add your own code.
  • You can run the code present in any code chunk using the green arrow in its upper right corner. The grey triangle and green rectangle icon will run all preceding code chunks in sequential order, up to but not including the current one.

After the setup code chunk you’ll see a section header:

  • Sections are created using varying numbers of the \(\#\) character, with the number of \(\#\) determining the size of the header (fewer \(\#\) create a larger header)

Following the section header you’ll see ordinary text:

  • Ordinary text will use markdown conventions, so the text \(\$\text{H_0: \\mu = 0}\$\) will appear as \(H_0: \mu = 0\) in your document.

\(~\)

Knitting

The purpose of R Markdown is to seamlessly blend R code, output, and written text. This is accomplished by “knitting” your file into a completed report. You can knit a file using the “Knit” button (blue yarn ball icon), or (on Windows) by pressing Ctrl+Shift+K.

A few things to know about knitting:

  1. When you knit an .Rmd file it begins with an empty environment, so the file might not knit if you’ve been testing your code out of order, or if your code depends upon things that you’ve since deleted while working.
  2. Commands like install.packages() and View() cannot be used in the environment where the document is knit. You should comment-out or remove these commands before knitting to prevent errors.

\(~\)

Lab

At this point you should begin working independently with your assigned partner(s) using a pair programming framework. Remember that you should read the lab’s content, not just the questions, and you should all agree with an answer before moving on.

Tests for One-sample Categorical Data

When analyzing one-sample categorical data we compare the observed sample proportion, \(\hat{p}\), with values that would be expected under a null hypothesis of the form \(H_0: p = \_\_\).

The following sections will cover the one-sample \(Z\)-test, an approach based upon a Normal probability model that works well for large samples, and the exact binomial test, a more computationally expensive approach that calculates the exact probability of each possible value of the observed proportion (rather than approximating these probabilities with a smooth curve like the \(Z\)-test does).

\(~\)

One-sample \(Z\)-test

Recall from our previous lecture that the one-sample \(Z\)-test involves two steps:

  1. Calculate \(Z=\frac{\hat{p} - p}{\sqrt{p(1-p)/n}}\) using \(p\) from the null hypothesis and \(\hat{p}\) from the sample
  2. Compare \(Z\) to the N(0,1) distribution to calculate the \(p\)-value
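These two steps can be sketched directly in R. The sketch below uses the helper-hinderer counts (14 of 16 infants choosing the helper toy) that are analyzed with prop.test() in the next chunk:

```r
## Step 1: Z-score transformation using the null value p = 0.5
p_hat = 14/16
p0    = 0.5
n     = 16
Z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
Z   # 3

## Step 2: compare Z to N(0,1) for the one-sided (greater) p-value
pnorm(Z, lower.tail = FALSE)   # about 0.00135
```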

The prop.test() function in R can be used to find the \(p\)-value produced by the one-sample Z-test. The code below uses prop.test() to replicate the one-sample \(Z\)-test on the infant toy choice data from our previous lecture:

prop.test(x = 14, n = 16, p = 0.5, alternative = "greater", correct = FALSE)
## 
##  1-sample proportions test without continuity correction
## 
## data:  14 out of 16, null probability 0.5
## X-squared = 9, df = 1, p-value = 0.00135
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.6837869 1.0000000
## sample estimates:
##     p 
## 0.875

A few details to unpack:

  • We provide the numerator of the sample proportion as the x argument and the denominator (the sample size for one-sample data) as the n argument.
  • The null hypothesis is specified by the p argument.
  • By default the alternative argument specifies a two-sided test, but we could set it to "less" or "greater" for a one-sided test
  • By default prop.test() applies the Yates’ continuity correction, but we can turn this off using correct = FALSE
    • Note you typically won’t set correct = FALSE in a real data analysis, but we are only doing it in this lab to see that the \(p\)-values from prop.test() exactly match the ones we calculate ourselves
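One further connection worth noting: prop.test() reports a statistic called X-squared rather than \(Z\), but without the continuity correction the two satisfy \(X^2 = Z^2\), which is why the \(p\)-values agree. A quick check:

```r
## Without the continuity correction, prop.test()'s X-squared equals Z^2
res = prop.test(x = 14, n = 16, p = 0.5, alternative = "greater", correct = FALSE)
sqrt(res$statistic)   # recovers Z = 3
res$p.value           # same as pnorm(3, lower.tail = FALSE)
```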

Question #1: In Lab 3 you worked with the CMU ICU admissions data set, a random sample of \(n=200\) patients. These data are found below:

https://remiller1450.github.io/data/ICUAdmissions.csv

  • Part A: In Lab 3 you tested the hypothesis \(H_0: p=0.1467\), where \(p\) is the readmission rate, using the sample proportion \(\hat{p} = 30/200=0.15\). Apply the \(Z\)-score transformation to the sample proportion as the first step of the one-sample \(Z\)-test. Report the \(Z\)-value and show how you calculated it.
  • Part B: Use the \(Z\)-value you calculated in Part A and the Normal Probability curve menu of StatKey to find the two-sided \(p\)-value for the one-sample \(Z\)-test. Report your \(p\)-value as part of a one-sentence conclusion.
  • Part C: Use prop.test() to perform the one-sample \(Z\)-test and confirm that the \(p\)-value you get matches the one you found using StatKey.

\(~\)

The Exact Binomial Test

The Normal probability model that the one-sample Z-test relies upon is only reasonable when at least 10 observations in the sample belong to each category involved in the proportions. Or, put differently, when \(n\cdot p \geq 10\) and \(n\cdot (1-p) \geq 10\).
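For example, checking this condition for the helper-hinderer study (where \(n = 16\) and the null value is \(p = 0.5\)):

```r
## Large-sample check for n = 16, p = 0.5 (both products should be at least 10)
n = 16; p = 0.5
n * p         # 8, so the condition fails
n * (1 - p)   # 8, so this condition fails as well
```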

The primary issue with the Normal model in these small-sample situations is that the null distribution contains a small number of discrete possibilities that cannot be reliably approximated by a continuous curve. The exact binomial test overcomes this by using the binomial probability distribution to calculate the probability of each discrete outcome present in the null distribution.
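Before turning to the built-in function, the exact one-sided \(p\)-value for the helper-hinderer data can be sketched by summing binomial probabilities, here \(P(X \geq 14)\) when \(X \sim \text{Binomial}(16, 0.5)\):

```r
## Exact tail probability: P(X >= 14) for X ~ Binomial(16, 0.5)
sum(dbinom(14:16, size = 16, prob = 0.5))              # about 0.00209
## Equivalent calculation using the upper-tail CDF
pbinom(13, size = 16, prob = 0.5, lower.tail = FALSE)  # about 0.00209
```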

The example below uses binom.test() to perform the exact binomial test on the data from our helper-hinderer example:

binom.test(x = 14, n = 16, p = 0.5, alternative = "greater")
## 
##  Exact binomial test
## 
## data:  14 and 16
## number of successes = 14, number of trials = 16, p-value = 0.00209
## alternative hypothesis: true probability of success is greater than 0.5
## 95 percent confidence interval:
##  0.6561748 1.0000000
## sample estimates:
## probability of success 
##                  0.875

You should note that this function uses the same arguments/syntax as prop.test(), but the \(p\)-value we get is slightly different.

You should also notice that we only observed 2 choices of the “hinderer” toy in the study, so the large sample condition of the \(Z\)-test is not met, leading us to prefer the exact binomial test for an analysis of these data.

Question #2: People who receive a liver transplant have an 89% chance of surviving at least one year. A medical consultant seeking to attract patients advertises that 59 of 62 patients she has worked with have survived at least one year, a death rate of less than half the national average (4.8% vs. 11%).

  • Part A: Consider the null hypothesis \(H_0: p=0.89\), where \(p\) is the one-year survival rate of liver transplant patients. Do the data provided by this medical consultant provide sufficient statistical evidence that their clients have a survival rate that differs from the national average? Address this question using a one-sample \(Z\)-test performed using prop.test(). Provide the R code for the test along with a one-sentence conclusion.
  • Part B: Are the large sample conditions of the one-sample \(Z\)-test met for this scenario? Briefly explain.
  • Part C: Perform an exact binomial test to evaluate the hypothesis \(H_0: p=0.89\). How does the \(p\)-value of this test compare to the one you found in Part A?
  • Part D: Now consider the null hypothesis \(H_0: p=0.11\), where \(p\) is the proportion of patients who will die within one year of liver transplant surgery. Use prop.test() to confirm that you arrive at the same \(p\)-value that you found in Part A, and also use binom.test() to confirm you arrive at the same \(p\)-value you found in Part C. Is it surprising that these \(p\)-values are identical? Briefly explain.

\(~\)

Tests for One-sample Quantitative Data

When analyzing one-sample quantitative data we typically compare the observed sample mean, \(\overline{x}\), with what could have been expected under a null hypothesis of the form \(H_0: \mu = \_\_\).

The following sections will cover the one-sample \(T\)-test, an approach that relies upon Student’s \(t\)-distribution as a probability model, and the Wilcoxon Signed-Rank test, an approach that doesn’t assume any particular probability model and thus can be used in scenarios where the conditions for the \(T\)-test are not met.

\(~\)

One-sample \(T\)-test

The one-sample \(T\)-test is performed in almost exactly the same manner as the one-sample \(Z\)-test:

  1. Calculate \(T=\frac{\overline{x} - \mu}{s/\sqrt{n}}\) using \(\mu\) from the null hypothesis and \(\overline{x}\) and \(s\) from the sample
  2. Compare \(T\) to the \(t\)-distribution with \(n-1\) degrees of freedom to calculate the \(p\)-value

The t.test() function is used to perform the one-sample \(t\)-test. The R code below performs this test on data from an experiment that compared changes in LDL cholesterol experienced by subjects when they ate an oat bran cereal as part of their breakfast relative to when they ate a corn flakes cereal. The variable Difference recorded each subject’s change in LDL (mmol/L).

## Load data
oat_diet = read.csv("https://remiller1450.github.io/data/Oatbran.csv")

## Perform T-test
t.test(x = oat_diet$Difference, mu = 0, alternative = "two.sided")
## 
##  One Sample t-test
## 
## data:  oat_diet$Difference
## t = 3.3444, df = 13, p-value = 0.005278
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.1284606 0.5972537
## sample estimates:
## mean of x 
## 0.3628571
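As a check, the \(t\) statistic and \(p\)-value above can be reproduced from the two steps listed earlier (re-loading the data so the chunk stands on its own):

```r
## Re-load the oat bran data (skip if oat_diet is already in your environment)
oat_diet = read.csv("https://remiller1450.github.io/data/Oatbran.csv")
x = oat_diet$Difference

## Step 1: T-statistic using the null value mu = 0
Tval = (mean(x) - 0) / (sd(x) / sqrt(length(x)))
Tval   # should match t = 3.3444

## Step 2: two-sided p-value from the t-distribution with n - 1 df
2 * pt(abs(Tval), df = length(x) - 1, lower.tail = FALSE)   # should match 0.005278
```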

Question #3: For this question you will use the CMU ICU admissions data set used previously in this lab and in Lab 3.

https://remiller1450.github.io/data/ICUAdmissions.csv

  • Part A: A systolic blood pressure of 120 mm/Hg or less is considered healthy. In Lab 3 you tested the hypotheses \(H_0: \mu = 120\) and \(H_a: \mu > 120\) using a randomization test in StatKey. In this question you will test the same hypothesis using a one-sample \(T\)-test. To begin, calculate the \(T\)-statistic from the sample mean as the first step of the test. Report the \(T\)-value and show how you calculated it.
  • Part B: Use the \(T\)-value you calculated in Part A and the \(t\)-distribution menu of StatKey to find the one-sided \(p\)-value for the one-sample \(T\)-test. Report your \(p\)-value as part of a one-sentence conclusion.
  • Part C: Use t.test() to perform the one-sample \(T\)-test on these data and confirm that the \(p\)-value you get matches the one you found using StatKey in Part B.
  • Part D: The \(T\)-test assumes data come from a Normally distributed population, which can be assessed using a histogram of the sample, or that the sample size is relatively large (at least \(n=30\) being a common rule of thumb). Are these conditions met for this data set? Briefly explain.

\(~\)

Wilcoxon Signed-Rank Test

The Wilcoxon Signed-Rank Test is a non-parametric analog to the one-sample \(T\)-test for a single mean. It is often used when the sample size is small and it is unreasonable to assume the data came from a Normally distributed population.

Consider the oat bran study from the previous section where we performed a \(T\)-test on the variable Difference. Rather than using the numerical values themselves as the basis of the test, the Wilcoxon Signed-Rank Test first ranks each data-point based upon its absolute value, then the data-points are grouped according to their sign (positive or negative).

The test then compares the sum of each data-point’s rank multiplied by its sign against a null distribution to produce a \(p\)-value. This comparison amounts to a test of whether the median of the population is a specified value. We will not cover the precise steps of this test in detail, but you should recognize that if the null hypothesis is \(H_0: m = 0\) and the data are symmetric then the expected test statistic is zero.

The code below performs the Wilcoxon Signed-Rank Test on the oat bran data set:

## Signed Rank Test
wilcox.test(x = oat_diet$Difference, mu = 0)
## 
##  Wilcoxon signed rank exact test
## 
## data:  oat_diet$Difference
## V = 93, p-value = 0.008545
## alternative hypothesis: true location is not equal to 0
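The statistic \(V = 93\) reported above is the sum of the ranks belonging to the positive differences, which can be reproduced by hand (assuming no zero or tied differences, which would require adjustments):

```r
## Re-load the oat bran data (skip if oat_diet is already in your environment)
oat_diet = read.csv("https://remiller1450.github.io/data/Oatbran.csv")
x = oat_diet$Difference

## Rank the data-points by absolute value, then sum the ranks of the positives
r = rank(abs(x))
V = sum(r[x > 0])
V   # should match V = 93 from wilcox.test()
```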

Question #4: A veterinary anatomist measured the nerve cell density at two different locations in the intestine, site 1 - the mid-region of the jejunum and site 2 - the mesenteric region of the jejunum. The nerve cell density (thousands of cells per mm\(^3\)) was measured in each location, with the difference recorded as the variable diff, which is the focus of this analysis.

https://remiller1450.github.io/data/horse_nerves.csv

  • Part A: Create a histogram displaying the distribution of the variable diff using 15 bins. Based upon this histogram and the sample size, explain whether you believe the one-sample \(T\)-test is appropriate for these data.
  • Part B: Perform a Wilcoxon Signed-Rank Test to assess whether the center (median) of the differences in nerve cell density across these two locations is zero. Report the two-sided \(p\)-value and a one-sentence conclusion.
  • Part C: Now perform a one-sample \(T\)-test to evaluate the same hypotheses used in Part B. Considering the \(p\)-value of this test, why was the assumption check you performed in Part A important? Briefly explain.

\(~\)

Practice (required)

In this section you will practice the decision making skills involved in deciding upon the proper statistical analysis for a given research question. This includes:

  1. Identifying whether the data/research question pertain to one-sample or two-sample data with a categorical or quantitative outcome.
  2. Producing appropriate data visualizations and descriptive statistics for the identified type of data.
  3. Performing an appropriate hypothesis test for the identified type of data.

Question #5: On Homework 1 you worked with the “TSA claims” data set, a random sample of \(n=5000\) claims made by travelers against the Transportation Security Administration (TSA) between 2003 and 2008, the first five years that the agency existed. In this question you will investigate whether the average paid claim amount is less than $200.

https://remiller1450.github.io/data/tsa_small.csv

  • Part A: Create an appropriate data visualization for the data involved in the research question described above.
  • Part B: Provide appropriate descriptive statistics that summarize the key aspects of the data visualization you created in Part A.
  • Part C: Perform an appropriate hypothesis test addressing the research question described above. In addition to the R used to perform the test, provide a one-sentence conclusion and a one-sentence justification for the choice of test, being mindful of the assumptions involved.

\(~\)

Question #6: Homework 2 introduces the “ACS Employment” data set, which is a random sample of 1287 employed individuals collected as part of the American Community Survey (ACS) conducted by the US Census Bureau. In this question you will investigate a claim by Forbes that 89% of US adults have health insurance.

https://remiller1450.github.io/data/EmployedACS.csv

  • Part A: Create an appropriate data visualization for the data involved in the research question described above.
  • Part B: Provide appropriate descriptive statistics that summarize the key aspects of the data visualization you created in Part A.
  • Part C: Perform an appropriate hypothesis test addressing the research question described above. In addition to the R used to perform the test, provide a one-sentence conclusion and a one-sentence justification for the choice of test, being mindful of the assumptions involved.