Overview

In this project you will analyze the outcomes of Xavier University men’s basketball games in the 2020-21 season. The specific question you will seek to address is:

Can using individual-level player statistics meaningfully improve a model for Xavier’s margin of victory/loss beyond simply using team-level statistics for that game?

In answering this question, you will first build a satisfactory model using team-level game data, and then explore whether that model can be improved using individual-level game data for one or more players.


Data Sources

All of the data involved in this project were obtained from Sports Reference.

Note: You do not need to consider all of these players, but you are expected to explore using the data from at least one of them to improve your initial model.


Guidelines


Getting Started

  1. To answer this project’s guiding question, begin by building a model that uses team-level statistics to predict the variable “Margin”. Because Xavier has played only 19 games this season, you will need to work hard to balance accuracy and parsimony in this model. Put more bluntly, you should avoid including too many variables. A commonly cited rule of thumb suggests a 10:1 ratio of data points to predictors, which would limit a model to about 2 predictors in this application. Recognize that this is merely a guideline and need not be strictly followed in all circumstances (i.e., your model can have more than 2 predictors, so long as it is properly justified).
  2. Once you’ve found a satisfactory model that uses only team-level data, you should then explore whether that model can be improved by using individual-level data from one or more of the players listed above. This will require you to merge that player’s data with the team-level data using the “Date” variable. Be aware that some players did not play in all 19 games, so using data from these players will reduce the already small sample size.
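The merge in step 2 and the rule of thumb in step 1 can be sketched as follows. This is a minimal illustration in plain Python, not a required approach: the rows are made-up values, the column names `TeamFGPct` and `PlayerPTS` and the helpers `merge_on_date` and `predictor_budget` are hypothetical, and only “Date” and “Margin” come from the project description. It shows how an inner join on “Date” drops any game the player missed, and how the 10:1 guideline translates into a rough predictor count.

```python
# Made-up team-level rows, one per game (illustrative values only).
team_games = [
    {"Date": "2021-01-01", "Margin": 8, "TeamFGPct": 0.47},
    {"Date": "2021-01-02", "Margin": 12, "TeamFGPct": 0.51},
    {"Date": "2021-01-03", "Margin": -3, "TeamFGPct": 0.41},
    {"Date": "2021-01-04", "Margin": 5, "TeamFGPct": 0.45},
]

# Made-up player-level rows; this hypothetical player missed the 2021-01-03 game.
player_games = [
    {"Date": "2021-01-01", "PlayerPTS": 14},
    {"Date": "2021-01-02", "PlayerPTS": 21},
    {"Date": "2021-01-04", "PlayerPTS": 9},
]


def merge_on_date(team_rows, player_rows):
    """Inner join on "Date": keep only games present in both tables."""
    player_by_date = {row["Date"]: row for row in player_rows}
    merged = []
    for team_row in team_rows:
        player_row = player_by_date.get(team_row["Date"])
        if player_row is not None:
            # Combine the team-level and player-level columns for this game.
            merged.append({**team_row, **player_row})
    return merged


def predictor_budget(n_games, ratio=10):
    """Rough 10:1 guideline: about one predictor per `ratio` games."""
    return max(1, round(n_games / ratio))


merged = merge_on_date(team_games, player_games)
print("Games before merge:", len(team_games))
print("Games after merge:", len(merged))
print("Rough predictor budget for 19 games:", predictor_budget(19))
```

Comparing the row counts before and after the merge is a quick way to see how much of the sample a given player costs you before you decide whether that player's statistics are worth adding.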

General Recommendations

Grading

This project will be evaluated using the same rubric and grading scheme as the first midterm, which is available at this link.