Data sets and writing about data
A work-in-progress collection of easy (and some not so easy) to access data sets. Created for the Data Analytics for Economists course at the University of Wisconsin-Madison, but all are welcome. Suggestions and corrections are always appreciated (firstname.lastname@example.org).
Aggregate Economic Data
FRED (St. Louis FRB) Massive repository of economic data. Somewhat U.S.-centric. Accessible through `pandas_datareader`.
COMTRADE (United Nations) Very detailed data on international trade in goods.
World Bank Data for many countries. Includes economic data, but also demographic, social, and environmental topics.
Penn World Table The big draw here is GDP at purchasing power parity, which allows for meaningful cross-country comparisons. We used `read_excel()` to directly import this from the web.
UN Population Data Demographic data by country, including forecasts.
BLS Quarterly Census of Employment and Wages Quarterly employment, wages, etc. at the county/metro/state levels.
BLS Occupational Employment Statistics Wages and employment by occupation and geography.
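For the FRED and Penn World Table entries above, a minimal sketch of pulling each into pandas might look like the following. The workbook URL, the sheet name `Data`, and the column names `rgdpe` and `pop` are assumptions that do not come from this page, so check them against the PWT site before relying on them. The helper names `fetch_fred`, `load_pwt`, and `gdp_per_capita` are made up for illustration.

```python
import pandas as pd

# Assumed location of the PWT 10.0 workbook; it may have moved.
PWT_URL = "https://www.rug.nl/ggdc/docs/pwt100.xlsx"


def fetch_fred(series_id, start, end):
    """Download one FRED series as a DataFrame indexed by date (needs network)."""
    # Imported here so the other helpers work without pandas_datareader installed.
    from pandas_datareader import data as web

    return web.DataReader(series_id, "fred", start, end)


def load_pwt(url=PWT_URL):
    """Read the PWT workbook straight from the web (needs network and openpyxl)."""
    # Assumption: the country-year panel is on the sheet named "Data".
    return pd.read_excel(url, sheet_name="Data")


def gdp_per_capita(pwt):
    """Return a copy with real GDP per person (rgdpe / pop) in a new column."""
    out = pwt.copy()
    # Assumption: rgdpe and pop are both recorded in millions.
    out["gdp_pc"] = out["rgdpe"] / out["pop"]
    return out
```

With an internet connection, something like `fetch_fred("GDPC1", pd.Timestamp("2000-01-01"), pd.Timestamp("2019-12-31"))` should return real GDP (`GDPC1` is FRED's code for that series), and `gdp_per_capita(load_pwt())` should give a PPP-adjusted measure that can be compared across countries.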
Data on Individuals
Current Population Survey Household-level data on employment, income, and education.
NLSY79 These data follow a cohort of men and women who were 14-22 years old in 1979. Respondents were re-surveyed each year until 1994 and every other year since. Not the easiest data to access, but there is a lot to learn from it.
National Survey of Family Growth Interviews with females about pregnancy and associated topics. Includes demographic data.
FBI Crime Data Explorer What kinds of crimes are being committed? How are they changing over time?
WI Dept. of Health Services Data on asthma, Zika, and lots in between. It takes some clicking around, but many of the datasets can be visualized as a map to get you thinking. Look for the download button in the top right corner.
Wisconsin Voting Data A lot of detail. There is an API, too.
City of Madison More data on the city, including lots of spatial data. The tax rolls are interesting: I can see my house in this dataset!
Micro Export Data
Brookings Export Monitor Exports by industry at the county, metro, and state levels (aggregates, too). This data tracks goods according to where they were produced.
USA Trade Sign up for a free account to use. Imports and exports by product and U.S. state. This data tracks goods according to origin of movement rather than production.
Inside Airbnb Listings, reviews, and calendar data. Doesn't have data for a Wisconsin city, but Minneapolis and Chicago are in there.
Yahoo Finance Historical and current financial data. The API in `pandas_datareader` is broken, but you can still download files from the site.
FDIC Aggregate data on US banks, including balance sheet and income statement data. The data on bank failures might make for an interesting analysis.
Airline routes (T-100) Route-segment based data. Monthly observations on number of passengers, seats, and cargo transported on a given route segment for each airline.
Airline itineraries (DB1B) Quarterly sample of 10% of passenger itineraries from major airlines. Includes price data.
Zillow Housing and rental data by metro area.
HRSA Data Grant, loan, and scholarship program data, as well as data about availability of healthcare. The data on health professional shortage areas looks interesting.
Dartmouth Atlas of Health Care Compiled from Medicare data, the database provides information about health care at detailed levels, right down to the hospital.
Baseball Database by Sean Lahman: batting and pitching statistics from 1871-2017 plus much more.
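Since the Yahoo Finance entry above suggests downloading files by hand, here is a rough sketch of reading one of those files into pandas. It assumes the download is a CSV with a `Date` column and an `Adj Close` column; check the header of your own file, since the layout is not described on this page. The helper names and the file name in the usage note are hypothetical.

```python
import pandas as pd


def load_price_csv(path_or_buffer):
    """Read a downloaded historical-prices CSV, indexed and sorted by date."""
    # Assumption: the file has a "Date" column that pandas can parse.
    prices = pd.read_csv(path_or_buffer, parse_dates=["Date"], index_col="Date")
    return prices.sort_index()


def daily_returns(prices, column="Adj Close"):
    """Simple day-over-day returns for one price column."""
    # The first row has no previous price, so its return is dropped.
    return prices[column].pct_change().dropna()
```

For a file saved as, say, `prices.csv`, `daily_returns(load_price_csv("prices.csv"))` would give a series of daily returns ready for plotting or summary statistics.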
Arts and Culture
Cooper Hewitt Open access to data about the collection.
MovieLens Movie ratings and demographic data about the raters. Some very large datasets, but some small ones for getting your code up and running.
New York Philharmonic Data on more than 20,000 performances.
College Scorecard University- and college-level data about the school and its student body.
Other Data Collections
NBER Datasets that go with NBER working papers. Some data is easy to access, some is not (and some is missing). The associated papers are full of good questions, too.
ICPSR A large collection of social science data. We have not used this data. Let us know if you do; we would like to hear about it.
Kaggle This site runs competitions and warehouses lots of data and code.
Chicago Data Portal Lots of data about the city.
Data blogs and other resources
The FRED blog: Short posts on topical economic questions and observations.
Reddit's data is beautiful: As much chaff as wheat, but worth an occasional look.
FiveThirtyEight: Economics, sports, politics... a bit of everything.
COMTRADE visualization: Showcases visualizations that use COMTRADE data. Some good stuff and some good examples of overwrought, hard-to-follow visualizations.
VizWiz by Andy Kriebel: Makeover Monday, Tableau Tip Tuesday, and Workout Wednesday