MATH-256 - Practice Exam #1

These questions are intended to help you practice for Exam #1. The real exam will feature 2-3 questions that follow a similar format. All course content up to this point, including the lectures on summarizing data and on sampling and study design, Labs 1 and 2, and Problem Set #1, may appear on the exam.

You should record your answers in an R Markdown document. You are welcome to use the template available on Canvas.

Study #1 - The Golden State Warriors historic season

Data are available here: https://remiller1450.github.io/data/GSWarriors.csv

gsw <- read.csv("https://remiller1450.github.io/data/GSWarriors.csv")

Study Description:

The Warriors are a professional basketball team based in Oakland, California. In 2015-16 they set the NBA record for most wins in a regular season, finishing with a win-loss record of 73-9 (breaking the 1995-96 Chicago Bulls' record of 72-10). The team also set 25 different NBA records during the 2015-16 season, which is regarded as one of the best seasons in NBA history.

Our goal in analyzing these data is to better understand factors related to the Warriors’ success during their record-breaking season.

Data Dictionary:

Game: A number identifying each game in chronological order within the 82-game season
Date: The date the game was played
Location: Whether the game was home or away
Opp: Opposing team’s name
Win: Whether the game was a win (W) or a loss (L) for Golden State
Points: Number of points scored by Golden State
OppPoints: Number of points scored by the opponent
FG: Number of field goals made by Golden State
FGA: Number of field goals attempted by Golden State
FG3: Number of 3-point shots made by Golden State
FG3A: Number of 3-point shots attempted by Golden State
FT: Number of free throws made by Golden State
FTA: Number of free throws attempted by Golden State
Rebounds: Total number of rebounds by Golden State
OffReb: Number of offensive rebounds by Golden State
Assists: Number of assists by Golden State
Steals: Number of steals by Golden State
Blocks: Number of blocked shots by Golden State
Turnovers: Number of turnovers made by Golden State
Fouls: Number of fouls committed by Golden State
OppFG: Number of field goals made by the opponent
OppFGA: Number of field goals attempted by the opponent
OppFG3: Number of 3-point shots made by the opponent
OppFG3A: Number of 3-point shots attempted by the opponent
OppFT: Number of free throws made by the opponent
OppFTA: Number of free throws attempted by the opponent
OppRebounds: Total number of rebounds by the opponent
OppOffReb: Number of offensive rebounds by the opponent
OppAssists: Number of assists by the opponent
OppSteals: Number of steals by the opponent
OppBlocks: Number of blocked shots by the opponent
OppTurnovers: Number of turnovers made by the opponent
OppFouls: Number of fouls committed by the opponent

1-A: Are these data best viewed as a sample or a population? Explain your answer in no more than 2 sentences. (No use of R is required for this question)

1-B: Create an appropriate data visualization depicting the relationship between “Location” and “Win”. Then write 1-2 sentences describing the relationship you see between these variables.

1-C: Report the difference in the proportion of games won by location (i.e., winning percentage at home minus winning percentage away).
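
A difference in proportions like the one in 1-C can be read off a two-way table of row proportions. The sketch below is only an illustration under stated assumptions: `toy` is a small made-up data frame, not the Warriors data, and the level names "Home" and "Away" are assumed, so check how `Location` is actually coded in `gsw` before adapting it.

```r
# Hypothetical toy data: NOT the real Warriors results.
# The labels "Home"/"Away" are assumed for illustration only.
toy <- data.frame(
  Location = c("Home", "Home", "Home", "Home", "Away", "Away", "Away", "Away"),
  Win      = c("W", "W", "W", "L", "W", "L", "W", "L")
)

# Row proportions: within each Location, the share of wins and losses
props <- prop.table(table(toy$Location, toy$Win), margin = 1)

# Difference in winning proportion: home minus away
diff_home_away <- props["Home", "W"] - props["Away", "W"]
diff_home_away
```

Using `margin = 1` makes each row sum to 1, so each entry is a proportion conditional on location, which is the quantity the question asks you to compare.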

1-D: Create an appropriate data visualization depicting the relationship between “OppPoints” and “Points”. Then write 1-2 sentences describing the relationship you see between these variables.

1-E: Can a causal relationship between “Location” and “Win”, or between “OppPoints” and “Points”, be established by this study? If so, briefly explain. If not, briefly describe an alternative explanation for one of these observed associations. (No use of R is required for this question)

Study #2 - Chicken growth in response to diet

Data are available here: https://remiller1450.github.io/data/ChickWeight.csv

chicks <- read.csv("https://remiller1450.github.io/data/ChickWeight.csv")

Study Description:

At birth, 71 chicks (baby chickens) were randomly assigned to one of six diets, and their weight was measured every second day until they reached 21 days old. These diets differed only in terms of the protein source used in the feed mixture.

The data you are provided contain each chick’s assigned diet and that chick’s weight (in grams) at 21 days old.

Data Dictionary:

weight: The chick’s weight in grams at day 21
feed: The protein source used in the feed for the chick’s diet

2-A: In one sentence, state the research question of this study. (No use of R is required for this question)

2-B: Create a histogram of the variable “weight” and describe the distribution of this variable. Please do not consider the variable “feed” when answering this question.

2-C: Create an appropriate data visualization depicting the relationship between “feed” and “weight”. Briefly describe which diets appeared the most successful (in terms of achieving the greatest weight gain).

2-D: Based upon the graph you created in 2-C, which diet appeared to have the greatest variability in the 21-day weights of the chicks assigned to it? Justify your answer by referencing an appropriate measure of variability (you do not need to calculate the exact number).
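
As a reminder of how group-wise variability can be compared, here is a minimal sketch using base R. It assumes a made-up data frame `toy` with hypothetical feed labels `feedA` and `feedB`; none of these values come from the chick data. The interquartile range (IQR) is used here as one reasonable measure of spread, since it corresponds to the box length in a side-by-side boxplot, but other measures could also be appropriate.

```r
# Hypothetical toy data: NOT the real chick weights.
# The feed labels "feedA" and "feedB" are invented for illustration.
toy <- data.frame(
  feed   = rep(c("feedA", "feedB"), each = 5),
  weight = c(100, 110, 120, 130, 140,   # feedA: tightly clustered
              80, 120, 160, 200, 240)   # feedB: widely spread
)

# Side-by-side boxplots: the height of each box is that group's IQR
boxplot(weight ~ feed, data = toy)

# IQR of weight within each feed group
iqr_by_feed <- tapply(toy$weight, toy$feed, IQR)
iqr_by_feed

# Name of the group with the largest IQR
most_variable <- names(which.max(iqr_by_feed))
most_variable
```

Swapping `IQR` for `sd` or `range` in the `tapply()` call gives other measures of spread for the same comparison.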

2-E: The birth weight of each chick is not included in these data despite being a factor that is clearly associated with that chick’s 21-day weight. Does the omission of this variable pose a problem when attempting to establish a causal relationship between a chick’s diet and its weight? Briefly explain. (No use of R is required for this question)