Directions:
- Submit your work via the “Assignments” tab on Canvas.
- For this assignment, you should record your answers and code using R Markdown.
- Please upload HTML, Word, or PDF output created using R Markdown and
make sure it contains your code, output, and written answers. You should
not include extraneous output, such as printing an entire data
frame.
- Homework is an individual assignment. It’s okay to check
your work or collaborate with your classmates, student mentors, and
others, but it is not okay to pass off their work as your own.
- Please clearly acknowledge any help you get from individuals other
than yourself, or resources other than the materials on our course
website (such as external websites and AI).
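As a rough illustration of the advice about extraneous output, the sketch below previews a data frame rather than printing all of it. The data frame `toy` and its columns are hypothetical stand-ins, not part of the assignment data.

```r
# Hypothetical toy data frame standing in for a real data set
toy <- data.frame(id = 1:100, value = seq(0.5, 50, by = 0.5))

# Show only the first few rows instead of printing all 100
preview <- head(toy, 5)
print(preview)

# Report the dimensions (rows, columns) without printing the data itself
print(dim(toy))
```

In an R Markdown document, code like this would sit inside a code chunk so that both the code and its output appear in the knitted file.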
Question #1
For this question you will use the Employed ACS data set described in
our previous homework assignment.
https://remiller1450.github.io/data/EmployedACS.csv
- Part A: Consider the conjecture that married
individuals on average work more hours per week than unmarried
individuals. State the type of hypothesis testing scenario this
represents (e.g., one-sample quantitative or two-sample categorical)
and provide the null hypothesis using proper statistical symbols, such
as \(p\) or \(\mu\). It is okay to use the letters “mu”
to represent the symbol \(\mu\).
- Part B: Create an appropriate data visualization
related to the research question described in Part A.
- Part C: Use R to perform an
appropriate hypothesis test addressing the research question described
in Part A. Provide all code used to prepare the data (if necessary) and
run the test. Report a one-sentence conclusion that includes both the
\(p\)-value and appropriate descriptive
statistics (see Lab 5 for examples).
- Part D: Provide a one-sentence justification for
the hypothesis testing approach you used in Part C. For example, you
might reference the sample size, the shape of the distribution, or the
expected frequencies of certain outcomes.
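If you decide that a question like this one compares a quantitative outcome across two groups, the general workflow might look like the sketch below. It is only an illustration under stated assumptions: the data frame `toy`, its columns `hours` and `married`, and every value in it are invented, and the real data set may use different column names and coding.

```r
# Hypothetical toy data; values and column names are invented for illustration
toy <- data.frame(
  hours   = c(40, 45, 38, 50, 42, 36, 40, 30, 35, 44),
  married = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
)

# Side-by-side boxplots of the quantitative outcome by group
boxplot(hours ~ married, data = toy,
        xlab = "Married (0 = no, 1 = yes)", ylab = "Hours per week")

# Group means as descriptive statistics
group_means <- tapply(toy$hours, toy$married, mean)
print(group_means)

# Welch two-sample t-test. With the formula interface the difference is
# (group 0 mean) - (group 1 mean), so "less" corresponds to group 1 being higher.
res <- t.test(hours ~ married, data = toy, alternative = "less")
print(res$p.value)
```

The direction of `alternative` depends on how R orders the groups, so it is worth checking the group means before interpreting a one-sided p-value.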
Question #2
For this question you will continue using the Employed ACS data
set.
https://remiller1450.github.io/data/EmployedACS.csv
- Part A: According to the World Bank, the average age of
the US labor force is 42.2 years, significantly older than it was in
2000 (39 years). Consider the conjecture that the World Bank is wrong and
the average age of US workers is actually higher than 42.2 years. State
the type of hypothesis testing scenario this represents (e.g., one-sample
quantitative or two-sample categorical) and provide the null
hypothesis using proper statistical symbols, such as \(p\) or \(\mu\). It is okay to use the letters “mu”
to represent the symbol \(\mu\).
- Part B: Create an appropriate data visualization
related to the research question described in Part A.
- Part C: Use R to perform an
appropriate hypothesis test addressing the research question described
in Part A. Provide all code used to prepare the data (if necessary) and
run the test. Report a one-sentence conclusion that includes both the
\(p\)-value and appropriate descriptive
statistics (see Lab 5 for examples).
- Part D: Provide a one-sentence justification for
the hypothesis testing approach you used in Part C. For example, you
might reference the sample size, the shape of the distribution, or the
expected frequencies of certain outcomes.
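If you decide that a question like this one compares a single sample's mean to a stated reference value, the workflow might resemble the sketch below. The vector `age` holds invented values for illustration only; the real data set will have its own column name and many more observations.

```r
# Hypothetical toy ages, invented for illustration only
age <- c(25, 31, 38, 42, 45, 47, 52, 55, 58, 63)

# Histogram to inspect the shape of the distribution
hist(age, main = "Toy ages", xlab = "Age (years)")

# Descriptive statistics to report alongside the p-value
sample_mean <- mean(age)
sample_sd <- sd(age)
print(c(mean = sample_mean, sd = sample_sd))

# One-sample t-test of H0: mu = 42.2 against the one-sided alternative mu > 42.2
res <- t.test(age, mu = 42.2, alternative = "greater")
print(res$p.value)
```

With only ten observations, as in this toy example, the shape of the distribution matters more for justifying a t-test than it would with a large sample.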
Question #3
For this question you will use the ICU hospital admissions data set
that was first described in Lab 3.
https://remiller1450.github.io/data/ICUAdmissions.csv
- Part A: Consider the conjecture that female patients (Sex = 1) are more
likely to be in the ICU due to an infection (Infection = 1) than male
patients (Sex = 0). State the type of hypothesis testing scenario
this represents (e.g., one-sample quantitative or two-sample categorical)
and provide the null hypothesis using proper statistical symbols,
such as \(p\) or \(\mu\). It is okay to use the letters “mu”
to represent the symbol \(\mu\).
- Part B: Create an appropriate data visualization
related to the research question described in Part A.
- Part C: Use R to perform an
appropriate hypothesis test addressing the research question described
in Part A. Provide all code used to prepare the data (if necessary) and
run the test. Report a one-sentence conclusion that includes both the
\(p\)-value and appropriate descriptive
statistics (see Lab 5 for examples).
- Part D: Provide a one-sentence justification for
the hypothesis testing approach you used in Part C. For example, you
might reference the sample size, the shape of the distribution, or the
expected frequencies of certain outcomes.
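If you decide that a question like this one compares a proportion across two groups, one possible workflow is sketched below. The counts in `tab` are invented for illustration and are not taken from the ICU data; the 0/1 labels simply mirror the coding described in Part A.

```r
# Hypothetical 2x2 table of counts, invented for illustration only
tab <- matrix(c(30, 20,
                25, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(sex = c("0", "1"), infection = c("0", "1")))

# Row proportions: share of each sex group with and without the outcome
row_props <- prop.table(tab, margin = 1)
print(row_props)

# Bar chart of the proportion with the outcome in each group
barplot(row_props[, "1"], names.arg = c("Sex = 0", "Sex = 1"),
        ylab = "Proportion with infection")

# One-sided two-proportion test: is the proportion in group 1 greater than in group 0?
res <- prop.test(x = c(tab["1", "1"], tab["0", "1"]),
                 n = c(sum(tab["1", ]), sum(tab["0", ])),
                 alternative = "greater")
print(res$p.value)

# Expected counts under independence, useful when justifying the approach
expected <- chisq.test(tab)$expected
print(expected)
```

When some expected counts are small, an exact method such as `fisher.test()` is a common alternative to the large-sample test shown here.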