Directions
In the typical statistics class (or textbook) you’ll learn about a particular method, practice applying it to a few examples, and then move on. This workflow is useful for learning new statistical techniques, but it doesn’t prepare you for the real world, where you don’t know which chapter’s methods are best suited to the question you’re trying to answer. One of the most challenging aspects of applied statistics is knowing how to choose the right approach for a given situation. The goal of this lab is for you to:
Some of the decisions we will focus on include:
To gain broader experience this lab will involve two very different datasets.
The data we will analyze in this part of the lab come from the FiveThirtyEight article “Where Police Have Killed Americans in 2015”. These data can be accessed here, and contain the following variables:
Because these data only contain incidents that occurred in 2015, we will consider them a sample that represents additional years (including future years which haven’t yet occurred).
Question 1
A Pennsylvania jury recently acquitted a police officer who fatally shot an unarmed teenager (this NPR Article provides details). Based upon this event, we might wonder how common police killings of unarmed individuals are among all police killings. Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
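A "how common" question like this one is often approached by estimating a single proportion with a confidence interval. The lab asks for Minitab output; the Python sketch below only illustrates the arithmetic behind one such interval (the Wilson score interval, which is one of several reasonable choices and may not be the one Minitab reports by default). The function name and the counts are hypothetical placeholders, not values from the police killings data.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a single proportion."""
    # Standard normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# HYPOTHETICAL counts for illustration only (not taken from the dataset).
low, high = wilson_interval(successes=20, n=100)
print(f"95% interval for the proportion: ({low:.3f}, {high:.3f})")
```

In practice you would replace the placeholder counts with the number of unarmed individuals and the total number of incidents tallied from the data.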
Question 2
The Pennsylvania case mentioned in Question #1 received a lot of publicity because it involved a white police officer killing an unarmed black individual. We might wonder, among those killed by the police, is the proportion of blacks who were unarmed different from the proportion of whites who were unarmed? Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
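Comparing a proportion between two independent groups is commonly handled with a two-sample test of proportions. As a rough sketch of what such a test computes (assuming the pooled, normal-approximation version; Minitab also offers exact alternatives), the Python below uses a hypothetical function name and made-up counts that do not come from the lab data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Under the null hypothesis both groups share one common proportion.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# HYPOTHETICAL counts for illustration only (not taken from the dataset).
z, p_value = two_proportion_z_test(x1=30, n1=100, x2=15, n2=100)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

The normal approximation is usually considered reasonable only when each group has enough successes and failures; with small counts an exact method may be the safer choice.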
Question 3
Everyone who appears in the database was killed by the police (by definition), but this means we don’t have data on individuals who were not killed by the police. In this situation we might seek to bring in external information to shape our analysis. It is estimated that the racial composition of the United States in 2015 was 61.8% non-Hispanic white, 13.2% black, 17.8% Hispanic (of any race), 5.2% Asian, 0.8% Native American, and 1.2% other. Based upon this, we might wonder if police killings are equally prevalent across races, or if some racial/ethnic groups are disproportionately involved in police killings. Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
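When observed counts across several categories are compared with externally specified proportions, a chi-square goodness-of-fit test is one natural candidate. The sketch below shows how the test statistic is built from the population shares quoted above. The observed counts are hypothetical placeholders, the function name is an assumption, and the comparison uses the standard 5% critical value for 5 degrees of freedom rather than a p-value (which Minitab would report).

```python
# Population shares quoted in the text (they sum to 1.0).
population_share = {
    "non-Hispanic white": 0.618,
    "black": 0.132,
    "Hispanic": 0.178,
    "Asian": 0.052,
    "Native American": 0.008,
    "other": 0.012,
}


def chi_square_gof(observed, expected_share):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    n = sum(observed.values())
    return sum(
        (observed[group] - n * share) ** 2 / (n * share)
        for group, share in expected_share.items()
    )


# HYPOTHETICAL counts for illustration only (not taken from the dataset).
placeholder_counts = {
    "non-Hispanic white": 500,
    "black": 260,
    "Hispanic": 160,
    "Asian": 40,
    "Native American": 20,
    "other": 20,
}

statistic = chi_square_gof(placeholder_counts, population_share)
CRITICAL_05_DF5 = 11.070  # chi-square critical value, alpha = 0.05, df = 6 - 1
print(f"chi-square = {statistic:.2f}; exceeds critical value: {statistic > CRITICAL_05_DF5}")
```

The per-category terms (O - E)^2 / E are worth inspecting individually, since they indicate which groups contribute most to any overall discrepancy.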
Question 4
Critics of the analysis described in Question 3 might argue that because of socio-economic factors not all racial/ethnic groups commit crimes at the same rate, and therefore exposure to situations with a possibility of being killed by the police is unequal across groups. It might be possible to evaluate this criticism using external information from the National Crime Victimization Survey (NCVS). According to the NCVS, 22.7% of the victims of violent crimes report that the perpetrator of the crime was black. Based upon this, we might wonder if the proportion of black individuals in the Police Killings data differs from the proportion of crimes with black perpetrators (given by the NCVS). Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
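Here a single sample proportion is compared with a fixed benchmark (the 22.7% NCVS figure), which suggests a one-sample test of a proportion. A minimal sketch of the normal-approximation version follows; the function name and counts are hypothetical, and an exact binomial test would be an equally defensible choice.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: the true proportion equals p0."""
    p_hat = successes / n
    # The standard error uses the hypothesized value p0, not p_hat.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


NCVS_BENCHMARK = 0.227  # proportion quoted in the text

# HYPOTHETICAL counts for illustration only (not taken from the dataset).
z, p_value = one_proportion_z_test(successes=300, n=1000, p0=NCVS_BENCHMARK)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4g}")
```

Note that this treats the NCVS figure as a known constant, even though it is itself a survey estimate with its own sampling error.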
Question 5 (Group Only)
If socio-economic factors are related to the demographics of those killed by the police, we’d expect police killings to occur in census tracts that are worse off economically than the rest of the country. In 2015, the national unemployment rate was 5.2%, the national poverty rate was 13.5%, and 35% of the US population (age 25 or over) had a bachelor’s degree or higher. We might wonder if, on average, the locations of police killings have poverty rates, unemployment rates, and levels of college education that differ from these national averages, as well as how different they are. Use statistical methods to answer this question and provide a short rationale for your approach.
For this section we will shift to a lighter topic and analyze data from the Golden State Warriors’ record-setting 2015-16 season. The Warriors are a professional basketball team based in Oakland, California. In 2015-16 they set the NBA record for most wins in a regular season with a win-loss record of 73-9 (breaking the 1995-96 Chicago Bulls’ record of 72-10). Additionally, the Warriors set 25 different NBA records during the 2015-16 season, which is regarded as one of the best seasons in NBA history.
The data we will analyze document each of the 82 games played by the 2015-16 Warriors team; they can be accessed here, and they contain the following variables:
Question 6 (Group Only)
The Warriors have a reputation as one of the best shooting teams of all time. Based upon this claim, we might wonder whether the Warriors are better than their opponents at making free throws. Use statistical methods to evaluate this conjecture using season totals, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
Question 7 (Group Only)
The 2015-16 Warriors were led by Stephen Curry, one of the best three-point shooters of all time. Because of Curry’s presence, we might wonder if the Warriors attempted more three-point shots than their opponent in each game. Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
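Because each game supplies one value for the team and one for its opponent, the two sets of measurements are naturally paired, and one candidate approach is to analyze the per-game differences. The sketch below computes only the paired t statistic and its degrees of freedom; the p-value would come from a t-table or from Minitab. The function name and the short lists of attempts are hypothetical placeholders, not values from the Warriors data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """t statistic and degrees of freedom for H0: mean paired difference = 0."""
    # Work with the within-pair differences, one per game.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# HYPOTHETICAL per-game attempts for illustration only (not from the dataset).
team_attempts = [30, 28, 35, 31, 27]
opponent_attempts = [24, 26, 29, 25, 28]

t_stat, df = paired_t_statistic(team_attempts, opponent_attempts)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

Treating the two columns as independent samples would ignore game-to-game variation shared by both teams, which is why the differences are analyzed instead.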
Question 8
Critics of the Warriors have called them over-reliant on three-point shooting. If this is the case, we’d expect the Warriors to have made a lower proportion of their three-point attempts in the team’s losses than in the team’s wins. Evaluate this hypothesis using season totals. Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
Question 9
It has been well established statistically that there is a home-court advantage in the NBA. Based upon this, we might wonder how much more likely the Warriors were to win at home (relative to on the road). Use statistical methods to answer this question, including all relevant Minitab output, your conclusion, and a short rationale for your approach.
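A "how much more likely" question points toward a ratio of two proportions (a relative risk) rather than a simple difference. As a sketch under stated assumptions, the Python below computes a relative risk with a large-sample confidence interval built on the log scale. The function name and the win counts are hypothetical placeholders, not the Warriors' actual home and road records.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk(x1, n1, x2, n2, confidence=0.95):
    """Relative risk (group 1 vs group 2) with a log-scale confidence interval."""
    rr = (x1 / n1) / (x2 / n2)
    # Large-sample standard error of log(RR).
    se_log = sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)


# HYPOTHETICAL win counts for illustration only (not from the dataset).
rr, low, high = relative_risk(x1=30, n1=40, x2=20, n2=40)
print(f"relative risk = {rr:.2f}, 95% interval = ({low:.2f}, {high:.2f})")
```

An interval that excludes 1 would suggest the two win probabilities differ; the approximation can be poor when either group has very few wins or very few losses.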
Question 10 (Group Only)
For this question you may use either of the data sets in this lab. For your chosen data set I’d like you to create a brief report highlighting one interesting feature of the data. Your report must include:
This should all fit on a single page. It should not be a bulleted list; rather, it should be a coherent paragraph (or multiple paragraphs) accompanied by your figure.