This page contains links to data sources as well as some suggested project topics. It is by no means a comprehensive list, and it is continually updated.
Data Sources (suitable for any course)
Top Choices
- Data.gov - General - tens of thousands of publicly available government datasets on topics including agriculture, climate, consumer trends, ecology, education, energy, finance, health, state/local governments, manufacturing, oceans, public safety, and scientific research.
Government/Economics
- Census.gov - Data Tables - demographic and population data for the United States. Some highlights include county-to-county and metro-to-metro migration trends, state-level tax collections by type, and geospatial data.
- Worldbank.org - Open Data - time-series, sample survey, and geospatial data for hundreds of countries around the world. Options to search by country, or by type of data.
- National Bureau of Economic Research - macro-economics, industry and productivity, international finance and trade, and healthcare data.
- Bureau of Labor Statistics - wages, inflation and prices, employment/unemployment, employee pay, worker productivity, and workplace injuries.
Sports
- Baseball Reference - season statistics for every MLB team/player, box scores for every game, all-time historical data, and much more
- Basketball Reference - season statistics for every NBA team/player, box scores for every game, all-time historical data, and much more
- Football Reference - season statistics for every NFL team/player, box scores for every game, all-time historical data, and much more
- Hockey Reference - season statistics for every NHL team/player, box scores for every game, all-time historical data, and much more
Education
- College Scorecard - a government database aimed at increasing transparency in higher education by publishing hundreds of variables describing colleges across the United States
- Data.gov - Education Catalog - hundreds of publicly available government datasets ranging from cross-sectional surveys, to longitudinal studies, to geospatial datasets
Project Ideas (suitable for Sta-209)
- Student surveys where any of the following are randomized:
- Manipulating the language/wording of a particular question
- Manipulating the characteristics of an item (e.g., an article or artwork) that are shared with respondents - for example, an author’s political beliefs, education, demographics, etc. could be manipulated and tied to ratings/opinions of the item
- Simple randomized experiments:
- Doing X before doing Y where X is randomized - for example, X could be meditating/music/solving puzzles/etc. vs. a placebo task, and Y could be any quantifiable task (e.g., solving math problems with a time limit)
- Paired experiments:
- Having each subject perform or answer the same task twice under different conditions, possibly at different times and in a randomized order - for example, holding a plank position with vs. without receiving encouragement/music/seeing a counter
- Analyses of existing data (sourced above)
- The options here are endless; just stay focused on a single topic and don’t go on a fishing expedition (i.e., testing many relationships until something looks significant)
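The randomization steps in the simple and paired experiment ideas above can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed procedure: the function names `assign_groups` and `assign_order`, the subject labels, the condition names, and the seed are all hypothetical.

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into treatment and control groups.

    With an odd number of subjects, the control group gets the extra one.
    """
    rng = random.Random(seed)          # a seeded generator makes the assignment reproducible
    shuffled = list(subjects)          # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def assign_order(subjects, conditions, seed=None):
    """For a paired design, randomize the order of conditions for each subject."""
    rng = random.Random(seed)
    orders = {}
    for subject in subjects:
        order = list(conditions)
        rng.shuffle(order)
        orders[subject] = tuple(order)
    return orders


# Hypothetical subject labels S01..S10 (an assumption for illustration)
subjects = [f"S{i:02d}" for i in range(1, 11)]

# Simple randomized experiment: who gets X vs. the placebo task
groups = assign_groups(subjects, seed=209)

# Paired experiment: every subject does both conditions, in a random order
orders = assign_order(subjects, ("with music", "without music"), seed=209)

print(groups)
print(orders)
```

Fixing the seed lets the assignment be regenerated later, which is helpful when documenting how the randomization was done.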
Other Lists (secondary sources)