Directions
When reading modern scientific research, it might seem that the statistical tests we've been learning about are outdated and rarely used. In reality, they remain in frequent use, notably in a practice known as A/B testing.
A/B testing is a general methodology centered on a statistical comparison of two randomly assigned conditions, “A” and “B”, which can often be viewed as treatment and control groups. For example, a business trying to optimize their webpage design might randomly assign new visitors to receive one of two variants of the page. The business can track the clicks made by the visitors assigned to each page, paying particular attention to certain behaviors (clicking on promotions/ads, adding items to their cart, completing a purchase, etc.).
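For those curious how random assignment might look in code, here is a minimal sketch in Python. The function name and visitor IDs are hypothetical; real platforms use more elaborate assignment and logging infrastructure.

```python
import random

def assign_variant(visitor_ids, seed=42):
    """Randomly assign each visitor to condition 'A' (control) or 'B' (treatment).

    The seed makes the assignment reproducible for illustration.
    """
    rng = random.Random(seed)
    return {v: rng.choice(["A", "B"]) for v in visitor_ids}

# Assign ten hypothetical visitors to the two page variants.
assignments = assign_variant(range(10))
```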
In this lab we will analyze data from an A/B test conducted by an anonymous company in January 2017 to evaluate the efficacy of two landing page designs for their website. The outcome of interest for this analysis is conversion; in this example, conversion refers to a successful sale of the company’s product.
Question 1:
Can a causal relationship be inferred from the A/B testing protocol described in the paragraphs above? What would a causal relationship mean in the context of the application we’ll be investigating in this lab?
Download and load the A/B Testing Data into Minitab. The data set contains the following variables:
In nearly any randomized experiment it is possible that the randomization protocol does not work as intended. We’ve already seen several examples of this, including one where the distribution of a key confounding variable was not balanced across groups (the lab monkey example), and another where there was disproportionate cross-over between the assigned treatments (the Minneapolis domestic abuse study). For these reasons, an important preliminary step in analyzing data from a randomized experiment is to determine whether randomization was successful.
Question 2:
Construct a “histogram with groups” of the variable “timestamp” using the variable “group” as the grouping variable. Based upon this graph, do you believe that this experiment’s randomization procedure was successful in preventing date from being a confounding variable in the relationship between “group” and “converted”? Include your graph and briefly explain your reasoning.
Question 3:
The creators of this experiment intended for an equal number of new visitors to be assigned to view each page design and constructed their randomization scheme accordingly. Using the appropriate statistical test, is there evidence that an imbalanced proportion of the sample was assigned to either page? You may conduct your test in Minitab, but you should include the relevant output and a 1-2 sentence conclusion in your lab write-up.
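Although this question asks for Minitab output, the underlying test (a one-proportion z-test of whether the assignment probability equals 0.5) can be sketched in Python. The counts below are hypothetical placeholders, not the values from the actual data set.

```python
import math

# Hypothetical counts -- substitute the actual group sizes from the data set.
n_treatment = 147_000   # visitors assigned to the treatment page
n_total = 294_500       # total visitors in the experiment
p_hat = n_treatment / n_total

# One-proportion z-test of H0: p = 0.5 (equal allocation to each page).
se = math.sqrt(0.5 * 0.5 / n_total)
z = (p_hat - 0.5) / se

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```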
Question 4:
A final concern in some randomized experiments is adherence. For various reasons (browser compatibility, browsing history, etc.), not every new visitor ended up viewing the version of the page they were randomly assigned to visit. To answer this question, use a two-way frequency table to evaluate adherence. Do you think that lack of adherence is a major concern (ie: source of bias) in this experiment? Briefly explain. (Hint: you’re not required to use a statistical test here, rather you should consider the data along with the background of the experiment and make a judgement)
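The two-way frequency table used to check adherence can also be built in code. The sketch below uses pandas with a tiny hypothetical sample; the variable names `group` and `landing_page` match the lab data, but the values shown are illustrative.

```python
import pandas as pd

# Tiny hypothetical sample -- the real data set has one row per visitor.
df = pd.DataFrame({
    "group": ["control", "control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "old_page", "new_page", "old_page", "new_page"],
})

# Two-way frequency table: rows = assigned group, columns = page actually viewed.
# Off-diagonal cells count visitors who did not adhere to their assignment.
table = pd.crosstab(df["group"], df["landing_page"])
print(table)
```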
In this section we will analyze the impact of page design on conversion. We will consider three different hypothesis tests for the purposes of learning about the similarities and differences in these tests. In practice you’d only be interested in performing a single test.
Question 5:
Use a \(z\)-test to evaluate whether conversion rates are different for the treatment and control groups. I’d like you to perform this test “by hand”, but you may check your work in Minitab. Your answer should include: the statement of your null and alternative hypotheses using proper statistical notation, the calculation of your test statistic, the \(p\)-value of your test, and a one sentence conclusion.
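If you'd like to check your by-hand arithmetic, the two-proportion z-test can be computed step by step in Python. The conversion counts and group sizes below are hypothetical placeholders for the values you'll read off the data.

```python
import math

# Hypothetical counts -- substitute the conversions and group sizes from the data.
x1, n1 = 17_200, 145_000   # conversions and visitors in the control group
x2, n2 = 17_500, 145_300   # conversions and visitors in the treatment group

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion under H0: p1 = p2

# Standard error of the difference in proportions under the null.
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```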
Question 6:
Use the appropriate exact test in Minitab to evaluate whether conversion rates are different for the treatment and control groups. Include your Minitab output in your lab write-up.
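For reference, the exact test for a 2x2 table of group by conversion is Fisher's exact test, which is available in SciPy. The table entries below are hypothetical, scaled-down counts for illustration.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table: rows = group, columns = (converted, not converted).
table = [[172, 1278],    # control
         [175, 1278]]    # treatment

# Fisher's exact test conditions on the table margins; no normal approximation.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
```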
Question 7:
Using Statkey, perform an appropriate randomization test to evaluate whether conversion rates are different for the treatment and control groups. Include a screenshot of your randomization distribution (showing the \(p\)-value) in your lab write-up.
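The randomization test StatKey performs can also be sketched directly: shuffle the group labels many times and see how often a difference in conversion rates at least as extreme as the observed one arises by chance. The counts below are small hypothetical values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical outcomes: 1 = converted, 0 = not converted (100 visitors per group).
control = np.array([1] * 12 + [0] * 88)
treatment = np.array([1] * 18 + [0] * 82)

observed = treatment.mean() - control.mean()
pooled = np.concatenate([control, treatment])

# Re-randomize the group labels many times and recompute the difference in rates.
diffs = []
for _ in range(5_000):
    perm = rng.permutation(pooled)
    diffs.append(perm[:100].mean() - perm[100:].mean())
diffs = np.array(diffs)

# Two-sided p-value: share of shuffled differences at least as extreme as observed.
p_value = np.mean(np.abs(diffs) >= abs(observed))
```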
Question 8:
How do the results of the three different tests in Questions 5-7 compare with one another? What do you think is the reason for these similarities/differences?
In our notes on randomized experiments we first learned about the intention to treat principle (ITT), an analysis approach that should be considered when it is possible for subjects not to adhere to their randomly assigned treatment. In the A/B testing data analyzed in this lab it was possible for subjects not to end up viewing the page they were randomly assigned to, meaning we should consider the ramifications of an ITT analysis versus an analysis using the variable “landing_page” (which is sometimes called an “as-treated” analysis).
Question 9:
What does intention-to-treat analysis mean in the context of this application? That is, what would the explanatory and response variables of this analysis be?
Question 10:
Conduct an appropriate test to evaluate whether conversion rates are different for each “landing_page” (ie: perform an as-treated analysis). In your lab write-up you should include a screenshot of the output from your test, along with a one sentence conclusion.
Question 11:
Does the hypothesis test you performed in Question #10 agree with the results of your ITT analysis (which you performed in the previous section) of these data?
Question 12:
Does it seem necessary to use the intention to treat principle in this application? Briefly explain.
Question 13:
Based upon the statistical tests you performed on these data (and the conclusion you reached), is it possible that you made a Type I error? Briefly explain.
Question 14:
Based upon the statistical tests you performed on these data (and the conclusion you reached), is it possible that you made a Type II error? Briefly explain.
Question 15:
If you were going to re-run this experiment, how could you increase its statistical power? Briefly explain.
Question 16:
Suppose that a more powerful repeat of this experiment is to be run, and you are asked to decide upon a sample size that you’d expect to have an 80% chance of producing a statistically significant result at the \(\alpha = 0.05\) level. Based upon the effect size observed in these data, use the power calculator at this link to determine the necessary sample size.
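The calculation the power calculator performs can be sketched with the standard normal-approximation formula for a two-proportion z-test. The conversion rates passed in at the bottom are hypothetical; plug in the rates you observed in the data.

```python
import math
from statistics import NormalDist

def sample_size_two_props(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-proportion z-test.

    Uses the normal-approximation formula:
    n = (z_{alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    var = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * var / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical conversion rates -- substitute the rates observed in the data.
n_per_group = sample_size_two_props(0.120, 0.118)
```

Note that a small difference in conversion rates (the effect size) drives the required sample size up dramatically, since it appears squared in the denominator.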