This page contains links to data sources as well as some suggested project topics. It is by no means a comprehensive list and is continually updated.

Data Sources (suitable for any course)

Top Choices

  • Data.gov - General - tens of thousands of publicly available government datasets on topics including agriculture, climate, consumer trends, ecology, education, energy, finance, health, state/local governments, manufacturing, oceans, public safety, and scientific research.

Government/Economics

  • Census.gov - Data Tables - demographic and population data for the United States. Some highlights include county-to-county and metro-to-metro migration trends, state-level tax collections by type, and geospatial data.
  • Worldbank.org - Open Data - time-series, sample survey, and geospatial data for hundreds of countries around the world. Options to search by country, or by type of data.
  • National Bureau of Economic Research - macro-economics, industry and productivity, international finance and trade, and healthcare data.
  • Bureau of Labor Statistics - wages, inflation and prices, employment/unemployment, employee pay, worker productivity, and workplace injuries.

Sports

  • Baseball Reference - season statistics for every MLB team/player, box scores for every game, all-time historical data, and much more
  • Basketball Reference - season statistics for every NBA team/player, box scores for every game, all-time historical data, and much more
  • Football Reference - season statistics for every NFL team/player, box scores for every game, all-time historical data, and much more
  • Hockey Reference - season statistics for every NHL team/player, box scores for every game, all-time historical data, and much more

Education

  • College Scorecard - a government database aimed at increasing transparency in higher education by publishing hundreds of variables describing colleges across the United States
  • Data.gov - Education Catalog - hundreds of publicly available government datasets ranging from cross-sectional surveys, to longitudinal studies, to geospatial datasets

Project Ideas (suitable for Sta-209)

  • Student surveys where any of the following are randomized:
    • Manipulating the language/wording of a particular question
    • Manipulating the characteristics of an item (e.g., an article or artwork) that are shared with respondents - for example, an author’s political beliefs, education, demographics, etc. could be manipulated and tied to ratings/opinions of the item
  • Simple randomized experiments:
    • Doing X before doing Y where X is randomized - for example, X could be meditating/music/solving puzzles/etc. vs. a placebo task, and Y could be any quantifiable task (e.g., solving math problems with a time limit)
  • Paired experiments:
    • Having each subject perform the same task (or answer the same question) twice under different conditions, possibly at different times and in a randomized order - for example, holding a plank position with vs. without receiving encouragement/music/seeing a counter
  • Analyses of existing data (sourced above)
    • Endless options here; just stay focused on a single topic and don’t go on a fishing expedition
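
For the randomized and paired designs listed above, the assignment itself should come from a random number generator rather than from the experimenter's judgment. Below is a minimal sketch of one way to do that in Python. The function names, subject IDs, condition labels, and seed are all hypothetical and chosen only for illustration; any software with a reproducible shuffle would serve equally well.

```python
import random


def assign_groups(subjects, conditions=("treatment", "placebo"), seed=None):
    """Randomly assign each subject to one condition (simple randomized experiment)."""
    rng = random.Random(seed)      # a seeded generator makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects out round-robin so group sizes differ by at most one.
    return {s: conditions[i % len(conditions)] for i, s in enumerate(shuffled)}


def assign_orders(subjects, conditions=("with music", "without music"), seed=None):
    """Randomize the order of conditions for each subject (paired experiment)."""
    rng = random.Random(seed)
    orders = {}
    for s in subjects:
        order = list(conditions)
        rng.shuffle(order)         # each subject gets an independently shuffled order
        orders[s] = tuple(order)
    return orders


# Hypothetical subject IDs S01..S08
subjects = [f"S{i:02d}" for i in range(1, 9)]
groups = assign_groups(subjects, seed=209)
orders = assign_orders(subjects, seed=209)

for s in subjects:
    print(s, groups[s], orders[s])
```

Recording the seed alongside the results lets someone else regenerate exactly the same assignment later.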

Other Lists (secondary sources)