Description

The focus of this project is creating an interactive data exploration application. The primary product is an R Shiny application that allows a user to thoughtfully explore a data set. You will be required to submit a short proposal and an interim progress update to keep your project on track.

In addition to creating your app, you will give a 5-minute in-class presentation of your app’s features, being sure to highlight at least one interesting finding it reveals about your data.

\(~\)

Timeline

*This timeline was revised on 10/24

\(~\)

App Expectations

Your finished Shiny app is expected to include:

\(~\)

Presentation Expectations

Your presentation is expected to include:

Your target audience should be our class, so you may assume some working knowledge of R Shiny and various types of graphics/statistics, but you should not assume any familiarity with your data source or research question.

\(~\)

Groups

You may work either individually or with one classmate (of your choosing) on this project.

Note: You may not work with the same person on both this project and the final project. So, if you really want to work with a certain classmate on the final, you might choose to work with someone else (or independently) on this project.

Additionally, by choosing to work with someone, you are consenting to receive the same score on the project. If you are not comfortable receiving the same score as your partner, you might opt to work alone.

\(~\)

Proposal

By the end of the day on Tuesday 10/24, one member of your group should send a brief proposal (via email) that addresses the following:

  1. Where your data will come from.
  2. Who (if anyone) you’ll be partnering with.
  3. What you seek to answer or explore using your R Shiny app.

For #3, a good proposal might be something like “I want users of the app to be able to explore whether there are spatial patterns in the incidences of different types of crimes that were reported in the city of Chicago”. A bad proposal might be something like “I want to display all crimes in Chicago on a map”. The first example is good because it involves something that is best achieved using Shiny (i.e., a user option to change or filter by crime type), while the second is bad because Shiny isn’t necessary to make a map.
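To make the distinction concrete, here is a minimal sketch of the kind of user-driven filtering the good proposal describes. The `crimes` data frame, its column names, and the `filter_crimes()` helper are hypothetical stand-ins invented for illustration; they are not real Chicago data, and your own app will likely look quite different.

```r
library(shiny)

# Hypothetical toy data standing in for a real crime data set
crimes <- data.frame(
  type = c("Theft", "Theft", "Assault", "Burglary", "Assault"),
  lon  = c(-87.63, -87.65, -87.70, -87.62, -87.68),
  lat  = c(41.88, 41.90, 41.85, 41.79, 41.95)
)

# Hypothetical helper: keep only the rows matching the selected crime type
filter_crimes <- function(data, crime_type) {
  data[data$type == crime_type, , drop = FALSE]
}

ui <- fluidPage(
  # The user option that makes Shiny worthwhile: choose which crime to view
  selectInput("crime_type", "Crime type:", choices = sort(unique(crimes$type))),
  plotOutput("crime_map")
)

server <- function(input, output, session) {
  output$crime_map <- renderPlot({
    shown <- filter_crimes(crimes, input$crime_type)
    # Fix the axis ranges so locations stay comparable across crime types
    plot(shown$lon, shown$lat,
         xlim = range(crimes$lon), ylim = range(crimes$lat),
         xlab = "Longitude", ylab = "Latitude",
         main = paste("Reported incidents:", input$crime_type))
  })
}

app <- shinyApp(ui, server)

# Launch the app only when running in an interactive R session
if (interactive()) runApp(app)
```

Without the `selectInput()`, this would reduce to a single static plot, which is the situation the bad proposal describes.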

\(~\)

Intermediate Progress

By the end of the day on Thursday 10/26 you are expected to have code that cleans/manipulates your data to the point where you can create a sketch version of some type of graphic that you intend for your Shiny app to display. You should submit a compiled R Markdown file documenting this progress via P-web. The sketch graphic you share does not need to ultimately be used in your app.
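As a rough illustration of the scale expected at this stage, the sketch below cleans a small made-up data frame and produces a quick base R graphic. The `raw` data, its columns, and the specific cleaning steps are assumptions chosen for the example; your own data will call for different steps.

```r
# Hypothetical raw data with messy labels, a missing date, and dates stored as text
raw <- data.frame(
  date = c("2023-01-05", "2023-01-17", "2023-02-02", "2023-02-20", NA),
  type = c("theft", "Theft ", "ASSAULT", "theft", "assault"),
  stringsAsFactors = FALSE
)

# Cleaning: drop rows with missing dates, standardize labels, parse dates
clean <- raw[!is.na(raw$date), ]
clean$type  <- tolower(trimws(clean$type))
clean$date  <- as.Date(clean$date)
clean$month <- format(clean$date, "%Y-%m")

# Manipulation: count incidents by type (rows) and month (columns)
counts <- table(clean$type, clean$month)

# Sketch graphic: grouped bars, one group per month
barplot(counts, beside = TRUE, legend.text = rownames(counts),
        xlab = "Month", ylab = "Reported incidents")
```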

\(~\)

Presentation Day

\(~\)

Assessment Details (100 pts total)

App Code - 10 pts

Aesthetics - 20 pts

Function - 30 pts

Presentation - 15 pts

Misc - 5 pts

Difficulty - 20 pts

\(~\)

Level of Difficulty

One goal of this project is to afford you the opportunity to work with a topic that you find interesting. Unfortunately, real-world projects rarely utilize all areas of the data science workflow/life cycle equally. For example, some projects will require you to devote 90% of your time to data cleaning and manipulation in order to produce a few relatively simple visualizations or models. Other projects might involve data that come in a relatively clean format, so that the majority of your time is spent making highly detailed visualizations or sophisticated models.

To address these differences, you will be asked to submit a \(\leq1\)-page written argument describing your project’s level of difficulty. More specifically, you should argue that your project had “A-level”, “B-level”, or “C-level” difficulty, providing clear reasons and justification for your rating.

Hallmarks of an A-level project:

\(~\)

Finding a Data Source

You are expected to use a challenging data source of your choosing. I encourage you to find something that aligns with your interests, and you are welcome to use data from other courses/internships/etc. provided it is sufficiently complex.

If you’re having trouble finding data, I encourage you to look at this page containing data curated by Grinnell College libraries.

You can also consult the “Data Sources” section of this page, which I created and which could use updating.

Additionally, for this project, you may use data from Kaggle.com, provided the data set contains a satisfactory amount of documentation describing where it came from. You may not use a data set from a textbook, an R package, or any other source directly relating to R Shiny.

\(~\)

Additional Comments

R Shiny is a great technology for sharing and displaying your data science skills. I encourage you to consider hosting your finished app on shinyapps.io and storing your app’s code on GitHub. If relevant, this allows you to include links to your project in a resume or cover letter for an internship or job opportunity. You can also embed a hosted R Shiny app directly into a personal webpage (if you have one).
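If you do decide to host your app, one common route is the `rsconnect` package. The sketch below is only an outline: `deploy_my_app()` is a hypothetical wrapper written for this example, and the account name, token, and secret are placeholders you would copy from your own shinyapps.io dashboard.

```r
# Hypothetical wrapper; every argument is a placeholder for your own details
deploy_my_app <- function(app_dir, account, token, secret) {
  # Register your shinyapps.io credentials with rsconnect
  # (typically needed only once per computer)
  rsconnect::setAccountInfo(name = account, token = token, secret = secret)
  # Upload the folder containing your app to shinyapps.io
  rsconnect::deployApp(appDir = app_dir, account = account)
}
```

The function is defined but not called here, so nothing is uploaded until you run it yourself with real credentials.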