Depositing Data for the Greater Good

Sometimes it is worthwhile not to bury a particularly useful dataset inside a replication package. Pulling it out and depositing it in its own right is easy, and makes it much more findable through search services such as ICPSR or Google Dataset Search.

The approach is relatively simple.

Choose a repository

Choose any one of the open trusted repositories (see our document on data and code hosting). You want to maximize findability, but also the credit you receive for the deposit.

Notes:

  • For US government data, datalumos.org is a site that “rescues” data, but it somewhat obscures the author's contribution.
    • For instance, I deposited this data at datalumos.org, but my contribution is not visible. It might be possible to add your name to the “Collection Notes” or a similar field, but that is not ideal.
    • More generally, you can achieve great findability of Census Bureau data by using one of the ICPSR properties, in this case probably openICPSR (similar to, but distinct from, your AEA deposit).
  • Some examples are given in our document on data and code hosting.
  • If you are at an ICPSR member institution, or have a forthcoming paper in the AEA journals, you may submit to ICPSR and draw on the expertise of their professional data curation team.

Deposit the data and fill out metadata

  • Fill out the title and the other descriptive fields.
    • Choose a title that emulates the title of the official (digital-first) publication. For instance, if you were to deposit historical files for the U.S. Economic Census, you would emulate Census Bureau publications, either per year (1987 Economic Census) or collected (Historical Economic Census).
  • Be sure to add yourself as the “Principal Investigator”.
  • Fill out as much of the metadata as you can.
  • For “Collection Notes”, describe accurately where the files come from, and what you did to them to make them usable.
  • If you still have them, also upload the original formats, in addition to any converted files.
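
Before publishing, it can help to check that none of the fields mentioned above was left blank. The following is a minimal sketch only: the field names (title, principal_investigator, collection_notes) and the sample values are hypothetical stand-ins and do not correspond to any repository's actual schema or API.

```python
# Hypothetical checklist of metadata fields; the names are illustrative
# and are NOT taken from any repository's real deposit form.
REQUIRED_FIELDS = ("title", "principal_investigator", "collection_notes")


def missing_fields(metadata):
    """Return the required fields that are absent, None, or blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not (metadata.get(field) or "").strip()
    ]


# Example draft deposit (made-up values).
deposit = {
    "title": "Historical Economic Census",
    "principal_investigator": "A. Author",
    "collection_notes": "",  # still to be written
}

print(missing_fields(deposit))
```

Any field the function reports still needs to be filled in on the repository's deposit form.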

Going the extra mile

  • If you wish, check with one of the data librarians at your institution to dot the i's and cross the t's.

Hit publish. Get DOI.

Cite DOI

Cite the data (via the DOI) in your article, for example: (Author, 2020: Historical Widget data for the United States, https://doi.org/1.2/3…).
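
If you keep the citation details in a script or notebook, a string in the pattern shown above can be assembled as follows. This is a sketch under assumptions: the function name format_citation is made up for illustration, and "10.0000/example" is a placeholder, not a real DOI.

```python
def format_citation(author, year, title, doi):
    """Build a citation of the form 'Author, Year: Title, https://doi.org/<doi>'."""
    return f"{author}, {year}: {title}, https://doi.org/{doi}"


# Placeholder values; replace the DOI with the one your repository assigns.
citation = format_citation(
    "Author",
    2020,
    "Historical Widget data for the United States",
    "10.0000/example",
)
print(citation)
```

Adapt the punctuation and ordering to the citation style your journal requires.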