MATH-156 - Lab #7

Directions:

My main expectation is that you thoughtfully work through labs collaboratively with your group, discussing the embedded questions and recording your responses in a shared document.
- At times you might be asked to add screenshots to your write-up. If you are on a Windows PC, an easy way to do this is the “snipping tool”, which you can find using the search bar along the bottom of your screen. If you are on a Mac, you can find instructions on how to take a screenshot at this link.
Everyone should upload their own copy of the lab write-up to Canvas.
Only a couple of questions on each lab will be graded for accuracy, so your focus should be on learning the material rather than “getting the right answers” as quickly as possible.

\(~\)

Introduction

The purpose of this lab is to practice applying concepts and procedures related to hypothesis testing, specifically \(Z\) and \(T\) tests performed on a single sample.

\(~\)

Concept Review

A hypothesis test seeks to falsify a certain null hypothesis using sample data.

We’ve now learned how to do this using the \(Z\)-test (categorical outcomes) and \(T\)-test (quantitative outcomes). Both tests follow the same steps:

State a falsifiable null hypothesis about the population being studied
Find a \(Z\) or \(T\) value describing how many standard errors the observed outcome is above/below the null hypothesis
Compare \(Z\) or \(T\) value against the appropriate probability model to find the \(p\)-value
Use the \(p\)-value to make a decision
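The four steps above can be sketched in Python for a single proportion. The numbers here (60 successes in 100 trials, null value 0.5) are made up for illustration and do not come from either study in this lab. The function name `one_proportion_z_test` is hypothetical, and the sketch assumes a two-sided test with the \(SE\) evaluated at the null value.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p_null):
    """Two-sided Z-test for a single proportion (illustrative sketch)."""
    p_hat = successes / n
    # Step 2: SE evaluated at the null value (assumption: the null is taken as true)
    se = sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / se
    # Step 3: compare Z against the standard normal model
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Step 1: hypothetical null hypothesis H0: p = 0.5
z, p_value = one_proportion_z_test(successes=60, n=100, p_null=0.5)
# Step 4: a small p-value counts as evidence against the null hypothesis
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
```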

Below is a review of the different \(SE\) formulas from the CLT:

Summary measure	Population parameter	Sample estimate	\(SE\)
single proportion	\(p\)	\(\hat{p}\)	\(\sqrt{\tfrac{p(1-p)}{n}}\)
difference of proportions	\(p_1-p_2\)	\(\hat{p}_1-\hat{p}_2\)	\(\sqrt{\tfrac{p_1(1-p_1)}{n_1} + \tfrac{p_2(1-p_2)}{n_2}}\)
single mean	\(\mu\)	\(\bar{x}\)	\(\tfrac{\sigma}{\sqrt{n}}\)
difference of means	\(\mu_1 - \mu_2\)	\(\bar{x}_1 - \bar{x}_2\)	\(\sqrt{\tfrac{\sigma_1^2}{n_1} + \tfrac{\sigma_2^2}{n_2}}\)
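For reference, the four \(SE\) formulas in the table translate directly into short Python functions. This is only a sketch: the function names are hypothetical, and the inputs in the example calls are arbitrary values chosen to show the calls, not data from this lab.

```python
from math import sqrt


def se_single_proportion(p, n):
    """SE of a single sample proportion."""
    return sqrt(p * (1 - p) / n)


def se_diff_proportions(p1, n1, p2, n2):
    """SE of a difference between two sample proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def se_single_mean(sigma, n):
    """SE of a single sample mean."""
    return sigma / sqrt(n)


def se_diff_means(sigma1, n1, sigma2, n2):
    """SE of a difference between two sample means."""
    return sqrt(sigma1**2 / n1 + sigma2**2 / n2)


# Arbitrary inputs, chosen only to demonstrate the calls
print(se_single_proportion(p=0.5, n=100))
print(se_single_mean(sigma=10, n=25))
```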

The \(Z\)-test and \(T\)-test each rely on test statistics of the form:

\[\text{test statistic} = \frac{\text{observed} - \text{null}}{SE}\]

The test statistic is compared against an appropriate probability model to find the \(p\)-value and reach a conclusion.
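As a sketch of how this form plays out for a single mean, the Python below computes a \(T\) statistic using the sample standard deviation in place of \(\sigma\), then finds a two-sided \(p\)-value from a t-distribution with \(n - 1\) degrees of freedom. The data values are invented, the function names are hypothetical, and the tail area is approximated by numerically integrating the t density so that only the standard library is needed. In practice, statistical software would normally handle that step.

```python
from math import exp, lgamma, pi, sqrt
from statistics import mean, stdev


def t_density(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_coef) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_t_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) with Simpson's rule on [0, |t_stat|]."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t_stat|
    return max(0.0, 1 - 2 * area)


def one_sample_t_test(values, mu_null):
    """Return (T statistic, two-sided p-value) for H0: mu = mu_null."""
    n = len(values)
    se = stdev(values) / sqrt(n)  # sample SD stands in for sigma
    t_stat = (mean(values) - mu_null) / se
    return t_stat, two_sided_t_p_value(t_stat, df=n - 1)


# Invented data, not taken from either study in this lab
sample = [4.1, 5.3, 6.0, 4.8, 5.9, 6.4, 5.2, 4.7]
t_stat, p_value = one_sample_t_test(sample, mu_null=5.0)
print(f"T = {t_stat:.3f}, p-value = {p_value:.3f}")
```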

\(~\)

Study #1 - Oatbran and LDL cholesterol

In an investigation of whether oatbran cereal might be effective in reducing LDL cholesterol, researchers randomly assigned 14 adult males with high cholesterol into two groups:

The first group followed a diet involving daily consumption of corn flakes cereal for two weeks, then had a one week washout period, and then engaged in two more weeks of dieting involving daily consumption of oatbran cereal.
The second group followed a similar protocol, but consumed oatbran cereal during the first two-week diet and corn flakes cereal during the second two-week diet.

We’ll analyze each subject’s difference in LDL cholesterol when they were on the oatbran diet relative to when they were on the cornflakes diet. This outcome is recorded as the variable “difference” in the dataset linked below. You should recognize that a positive value of “difference” indicates a reduction in LDL cholesterol on the oatbran diet.
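To make the sign convention concrete, here is a minimal Python sketch. It assumes “difference” is computed as the corn flakes measurement minus the oatbran measurement, which is what the description above implies. The LDL values are invented and the function name is hypothetical.

```python
def ldl_difference(cornflakes_ldl, oatbran_ldl):
    """Assumed definition: corn flakes LDL minus oatbran LDL."""
    return cornflakes_ldl - oatbran_ldl


# Invented measurements for two hypothetical subjects
print(ldl_difference(cornflakes_ldl=4.5, oatbran_ldl=4.0))  # positive: LDL was lower on oatbran
print(ldl_difference(cornflakes_ldl=4.0, oatbran_ldl=4.3))  # negative: LDL was higher on oatbran
```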

Click Here to download the data from this study.

\(~\)

Orientation

Question #1: Briefly describe one population that the researchers can reasonably generalize the results of this study to. Additionally, briefly describe another population that the researchers should avoid generalizing the results of this study to.

Question #2: Is this a randomized experiment or an observational study? With that in mind, how concerned are you about the study’s outcome being influenced by bias or confounding variables? You should respond in 2-3 sentences.

\(~\)

Analysis

Question #3: These researchers wanted to determine whether the oatbran diet was capable of reducing LDL cholesterol levels as measured by the variable “difference”. With that in mind, state the null hypothesis the researchers should evaluate. Be sure to define (in words) any population parameters you include in the null hypothesis (i.e., define the meaning of \(\mu\), \(p\), etc.).

Question #4: Perform a \(Z\) or \(T\) test to evaluate the null hypothesis you proposed in Question #3. Your answer should show how you calculated your test statistic, and it should provide a \(p\)-value alongside a brief conclusion that addresses the context of this application.

Question #5: Notice how this dataset contains columns labeled “OatBran” and “CornFlakes”. For this question, briefly explain why it is more reasonable to analyze the “Difference” column instead of comparing the averages found in the “OatBran” and “CornFlakes” columns.

\(~\)

Study #2 - ICU Admissions

Intensive care units, or ICUs, are primary spaces in hospitals that are reserved for patients in critical condition. The dataset linked below is a random sample of \(n = 200\) ICU patients from a research hospital affiliated with Carnegie Mellon University (CMU).

Link: https://remiller1450.github.io/data/ICUAdmissions.csv

The data dictionary below documents each variable contained within the dataset:

ID - Patient ID number
Status - Patient status: 0=lived or 1=died
Age - Patient’s age (in years)
Sex - 0=male or 1=female
Race - Patient’s race: 1=white, 2=black, or 3=other
Service - Type of service: 0=medical or 1=surgical
Cancer - Is cancer involved? 0=no or 1=yes
Renal - Is chronic renal failure involved? 0=no or 1=yes
Infection - Is infection involved? 0=no or 1=yes
CPR - Patient received CPR prior to admission? 0=no or 1=yes
Systolic - Systolic blood pressure (in mm of Hg)
HeartRate - Pulse rate (beats per minute)
Previous - Previous admission to ICU within 6 months? 0=no or 1=yes
Type - Admission type: 0=elective or 1=emergency
Fracture - Fractured bone involved? 0=no or 1=yes
PO2 - Partial oxygen level from blood gases under 60? 0=no or 1=yes
PH - pH from blood gas under 7.25? 0=no or 1=yes
PCO2 - Partial carbon dioxide level from blood gas over 45? 0=no or 1=yes
Bicarbonate - Bicarbonate from blood gas under 18? 0=no or 1=yes
Creatinine - Creatinine from blood gas over 2.0? 0=no or 1=yes
Consciousness - Level upon arrival: 0=conscious, 1=deep stupor, or 2=coma
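Because most variables are stored as numeric codes, it can help to translate them into labels before summarizing. The Python sketch below does this for `Sex` and `Status` using a few invented rows that follow the coding in the dictionary above. It is not the real dataset, and the helper name `sample_proportion` is hypothetical. The downloaded file could be processed in a similar way.

```python
import csv
import io

# Invented rows that mimic the documented coding; NOT real patient data
sample_csv = """ID,Status,Sex
1,0,0
2,1,1
3,0,1
4,0,0
"""

SEX_LABELS = {"0": "male", "1": "female"}
STATUS_LABELS = {"0": "lived", "1": "died"}


def sample_proportion(rows, column, category):
    """Fraction of rows whose value in `column` equals `category`."""
    return sum(r[column] == category for r in rows) / len(rows)


# Read the rows and replace numeric codes with readable labels
rows = list(csv.DictReader(io.StringIO(sample_csv)))
for row in rows:
    row["Sex"] = SEX_LABELS[row["Sex"]]
    row["Status"] = STATUS_LABELS[row["Status"]]

# Sample proportion (p-hat) of one category in the invented rows
print(sample_proportion(rows, "Sex", "female"))
```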

\(~\)

Orientation

Question #6: Briefly describe one population that the researchers can reasonably generalize the results of this study to. Additionally, briefly describe another population that the researchers should avoid generalizing the results of this study to.

Question #7: Is this a randomized experiment or an observational study? What role, if any, does the design of this study have on the strength of the conclusions you can reach by analyzing it?

\(~\)

Analysis

Question #8: The demographics of the Pittsburgh, PA metropolitan area (where CMU is located) are 85% non-Hispanic white according to the most recent census. Based upon this information, use a \(Z\) or \(T\) test to evaluate whether these data provide evidence that racial minorities are overrepresented among ICU patients at this CMU-affiliated hospital. Your answer should clearly state the null hypothesis, and show how you calculated your test statistic. It should then provide a \(p\)-value alongside a brief conclusion that addresses the context of this application.

Question #9: According to the American Heart Association, a healthy systolic blood pressure is 120 mm Hg. Based upon this information, use a \(Z\) or \(T\) test to evaluate whether these data provide evidence that ICU patients are admitted with systolic blood pressures that differ from the healthy level. Your answer should clearly state the null hypothesis, and show how you calculated your test statistic. It should then provide a \(p\)-value alongside a brief conclusion that addresses the context of this application.

Question #10: Perform a \(Z\) or \(T\) test to evaluate whether these data provide evidence of a sex imbalance among ICU patients. Your answer should clearly state the null hypothesis, and show how you calculated your test statistic. It should then provide a \(p\)-value alongside a brief conclusion that addresses the context of this application.