Directions:

  • My main expectation is that you thoughtfully work through labs collaboratively with your group, discussing the embedded questions and recording your responses in a shared document.
    • At times you might be asked to add screenshots to your write-up. If you are on a Windows PC, an easy way to do this is the “snipping tool”, which you can find using the search bar along the bottom of your screen. If you are on a Mac, you can find instructions on how to take a screenshot at this link.
  • Everyone should upload their own copy of the lab write-up to Canvas.
  • Only a couple of questions on each lab will be graded for accuracy, so your focus should be on learning the material rather than “getting the right answers” as quickly as possible.


Introduction

This lab is a little different from most in this course. Rather than analyzing real data, you will get further practice with the foundations of probability that will soon enable us to conduct more sophisticated statistical analyses.


Warm-up

Drawing from a deck of cards is a classic example used to teach probability rules. Shown below is a diagram of the 52 cards in a standard deck:

Question #1: Relate the terms: trial, outcome, event, and sample space to a single draw from a standard deck of cards. Provide an example of each.

Question #2: Again, consider a single draw from the deck, and calculate the probability of each of the following events using two approaches: first, by directly counting the outcomes involved; second, by applying probability rules (addition rule, complement rule, multiplication rule). Because there are a lot of outcomes here, don’t feel obligated to break things down all the way to the single-card level when applying probability rules.

  1. Drawing the ace of spades
  2. Drawing a red card (heart or diamond)
  3. Drawing an ace or a red card (heart or diamond)
  4. Drawing a 3 and a red card (heart or diamond)
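If you would like to check your hand calculations, the two approaches can be sketched in Python. This is only an illustration under my own assumptions: the names `DECK`, `prob`, `is_king`, and `is_club` are made up for this example, and the event used ("a king or a club") is deliberately not one of the four events above.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]

# The sample space: all 52 (rank, suit) outcomes of a single draw.
DECK = list(product(RANKS, SUITS))


def prob(event):
    """Probability of an event, found by counting the outcomes that satisfy it."""
    favorable = sum(1 for card in DECK if event(card))
    return Fraction(favorable, len(DECK))


def is_king(card):
    return card[0] == "K"


def is_club(card):
    return card[1] == "clubs"


# Approach 1: directly count the outcomes in "king or club".
p_counted = prob(lambda c: is_king(c) or is_club(c))

# Approach 2: addition rule, P(A or B) = P(A) + P(B) - P(A and B).
p_rule = prob(is_king) + prob(is_club) - prob(lambda c: is_king(c) and is_club(c))

print(p_counted, p_rule)
```

Swapping in different event functions lets you compare both approaches for any event you can describe in terms of rank and suit.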


Sampling from a small population

Suppose a sock drawer contains 12 socks: 4 are blue, 5 are grey, and 3 are black.

Question #3: Consider randomly selecting two socks from this drawer. Is it reasonable to view these selections as independent? Why or why not?

Question #4: Consider randomly selecting two socks from this drawer. Calculate each of the following probabilities:

  1. The first sock is blue, and the second sock is black
  2. Both socks are blue
  3. Neither sock is black
  4. Both socks are the same color
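When sampling without replacement, each draw changes what is left in the drawer, so the multiplication rule uses conditional probabilities. The sketch below shows one way this could be coded; the helper `prob_sequence` is a hypothetical name introduced for this example, and the event it evaluates (grey first, then blue) is intentionally not one of the four listed above.

```python
from fractions import Fraction

# Sock counts from the drawer described in this section.
DRAWER = {"blue": 4, "grey": 5, "black": 3}


def prob_sequence(colors, drawer=DRAWER):
    """P(drawing the given colors in this order, without replacement)."""
    remaining = dict(drawer)  # copy so the original counts are untouched
    total = sum(remaining.values())
    p = Fraction(1)
    for color in colors:
        # Conditional probability of this color given the earlier draws.
        p *= Fraction(remaining[color], total)
        remaining[color] -= 1  # one fewer sock of that color
        total -= 1             # one fewer sock overall
    return p


# Example event: the first sock is grey and the second sock is blue.
p_grey_then_blue = prob_sequence(["grey", "blue"])
print(p_grey_then_blue)
```

Events such as "both socks are the same color" are unions of several ordered sequences, so they can be built by adding the probabilities of the disjoint sequences involved.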


Sampling from a large population

Now suppose we’re sampling from a sock warehouse that contains hundreds of thousands of socks. Of these, 20% are blue, 50% are grey, and 30% are black.

Question #5: Consider randomly selecting two socks from this warehouse. Is it reasonable to view these selections as independent? Why or why not?
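One way to explore this question numerically is to compare an exact without-replacement calculation against the value you would get by treating the draws as independent. The sketch below assumes a warehouse of exactly 200,000 socks, which is my own placeholder (the lab only says "hundreds of thousands"), and it uses grey socks so the blue-sock questions are left to you.

```python
from fractions import Fraction

N = 200_000      # ASSUMED warehouse size; the lab does not give an exact number
n_grey = N // 2  # 50% of the socks are grey, as stated above

# Exact: the second draw depends on the first (sampling without replacement).
p_exact = Fraction(n_grey, N) * Fraction(n_grey - 1, N - 1)

# Approximation: treat the two draws as if they were independent.
p_indep = Fraction(n_grey, N) ** 2

print("exact:      ", float(p_exact))
print("independent:", float(p_indep))
print("difference: ", float(abs(p_exact - p_indep)))
```

Try shrinking `N` to 12 to see how the size of the gap changes for a population as small as the sock drawer.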

Question #6: Consider a sample of five socks, all of which are blue. Do you believe this could have been a random sample from the warehouse? Justify your answer using probability.

Question #7: Consider a sample of five socks, none of which are blue. Do you believe this could have been a random sample from the warehouse? Justify your answer using probability.
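If you treat the five selections as independent, probabilities like these follow from the multiplication and complement rules. Here is a minimal sketch using black socks rather than blue ones; the function names `prob_all` and `prob_none` are hypothetical helpers, and independence is an assumption you should be prepared to justify.

```python
def prob_all(p, k):
    """P(all k independent draws have a trait that occurs with probability p)."""
    return p ** k


def prob_none(p, k):
    """P(none of k independent draws has the trait), using the complement rule."""
    return (1 - p) ** k


P_BLACK = 0.30  # proportion of black socks in the warehouse, as stated above

print("all five black: ", prob_all(P_BLACK, 5))
print("none of 5 black:", prob_none(P_BLACK, 5))
```

A very small probability does not prove a sample was not random, but it does tell you how surprising that sample would be under random sampling.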


Application - Diagnostic Testing

The previous examples illustrate one way that probability is used in statistical inference, or the process of learning about a population using sample data.

Another application of probability is in the realm of diagnostic testing. In a diagnostic test, an individual who gets tested either has the disease or does not have the disease. Then, they either test positive or negative.

Before we begin our application, we need to define two different conditional probabilities:

  • Sensitivity: The probability that someone who has the disease will receive a positive test result
  • Specificity: The probability that someone who does not have the disease will receive a negative test result
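In symbols, these two definitions can be written as conditional probabilities. The notation is my own shorthand and is not required in your write-up: \(D\) is the event that the person has the disease, \(D^c\) is its complement, and \(T^+\) and \(T^-\) are the events of testing positive and negative.

\[
\text{Sensitivity} = P(T^+ \mid D), \qquad \text{Specificity} = P(T^- \mid D^c)
\]

Notice that both probabilities condition on the person's true disease status, not on the test result.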

ELISA and HIV

Enzyme-Linked Immunosorbent Assays, or ELISA, are commonly used to determine if an individual has human immunodeficiency virus (HIV). An ELISA test is often used as a screening test prior to blood donation to prevent transmission of HIV. However, as with most medical diagnostic tests, it is not infallible. Experts estimate that if an individual truly has HIV, they’ll test positive during an ELISA screening 97.7% of the time. If an individual does not have HIV, they’ll test negative 92.6% of the time.

Question #8: Based upon the information above, what is the sensitivity of an ELISA test?

Question #9: Based upon the information above, what is the specificity of an ELISA test?

Question #10: Suppose an individual tests positive during an ELISA screening prior to a blood donation. How likely do you think it is that this person actually has HIV? In your write-up, record whether you think this probability is closest to 0.1, 0.5, or 0.9. For now, there’s no correct answer; I’m only looking for your intuitive judgment and a brief explanation.


Conditional Probability and Contingency Tables

Imagine a hypothetical population of 1,000,000 people and suppose that 0.5% of this population actually has HIV (this is roughly the percentage in the US).

Question #11: How many people in this population have HIV (as a count)? How many do not have HIV (as a count)?

Question #12: Considering all of the members of the population who have HIV, if 97.7% of them would test positive on an ELISA screening, how many positive and negative tests will there be (as counts) in this group?

Question #13: Considering all of the members of the population who do not have HIV, if 92.6% of them would test negative on an ELISA screening, how many positive and negative tests will there be (as counts) in this group?

Question #14: Using your answers to Questions #11 - #13, fill out the contingency table outlined below:

                   | Positive ELISA | Negative ELISA | Total
Has HIV            |                |                |
Doesn’t have HIV   |                |                |
Total              |                |                | 1,000,000
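The same bookkeeping works for any diagnostic test. As a rough sketch, the code below fills in the four inner cells of such a table for a made-up test; the population size, prevalence, sensitivity, and specificity are hypothetical values chosen for illustration (they are NOT the ELISA figures), and `contingency_counts` is a name I introduced for this example.

```python
def contingency_counts(population, prevalence, sensitivity, specificity):
    """Expected count in each cell of a disease-status by test-result table."""
    diseased = population * prevalence
    healthy = population - diseased
    return {
        "true_positive": diseased * sensitivity,         # has disease, tests positive
        "false_negative": diseased * (1 - sensitivity),  # has disease, tests negative
        "false_positive": healthy * (1 - specificity),   # no disease, tests positive
        "true_negative": healthy * specificity,          # no disease, tests negative
    }


# Hypothetical test and population, not the values from this lab.
table = contingency_counts(
    population=10_000, prevalence=0.02, sensitivity=0.90, specificity=0.95
)

for cell, count in table.items():
    print(f"{cell:>15}: {count:,.0f}")
```

Row totals come from adding across disease status, and column totals come from adding across test result.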


Reversing the Conditional Probabilities

Question #15: Using the hypothetical contingency table you created in Question #14, what proportion of those who test positive on an ELISA screening test actually have HIV? Why is this probability smaller than what most people would expect?

Question #16: Using the hypothetical contingency table you created in Question #14, what proportion of people who test negative on an ELISA screening test are actually free of HIV? How does this probability estimate compare with what you’d expect?
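The proportions in Questions #15 and #16 can also be computed directly from prevalence, sensitivity, and specificity using Bayes' theorem, without building a table. The sketch below uses a hypothetical test (2% prevalence, 90% sensitivity, 95% specificity) rather than the ELISA figures; `predictive_values` is a name introduced here, and the two quantities it returns are commonly called the positive and negative predictive values.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (P(disease | positive), P(no disease | negative)) via Bayes' theorem."""
    # Total probability of a positive test: true positives plus false positives.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    ppv = sensitivity * prevalence / p_pos
    npv = specificity * (1 - prevalence) / (1 - p_pos)
    return ppv, npv


# Hypothetical values, not the ELISA figures from this lab.
ppv, npv = predictive_values(prevalence=0.02, sensitivity=0.90, specificity=0.95)

print(f"P(disease | positive test)    = {ppv:.3f}")
print(f"P(no disease | negative test) = {npv:.3f}")
```

Try lowering `prevalence` while holding the other two inputs fixed to see how strongly the first probability depends on how common the disease is.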