Data and Code Guidance by Data Editors

Guidance for authors wishing to create data and code supplements, and for replicators.

Suggested Information for Data and Code Hosting

Trusted Repositories

Journals and institutions have assessed a number of trusted repositories:

List of Additional Acceptable Trusted Repositories in Economics

A list of trusted repositories that have been found to be acceptable for the purpose of archiving social and economic data can be found here:

https://github.com/social-science-data-editors/guidance/blob/master/data/trusted-repositories.csv

The list is maintained by the editors collaborating on this site. To suggest an addition, please issue a pull request, or email one of the editors.
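For readers who want to consult the list programmatically, the following is a minimal sketch rather than an official tool. It makes no assumption about the column names in the CSV file and simply returns whatever columns the header row declares. The raw-file URL and the function names are assumptions made for illustration; only the repository path comes from the link above.

```python
import csv
import io
import urllib.request

# Assumption: the raw-file address follows GitHub's usual pattern for the
# repository path linked above. Verify it before relying on it.
RAW_URL = (
    "https://raw.githubusercontent.com/social-science-data-editors/"
    "guidance/master/data/trusted-repositories.csv"
)


def parse_repository_list(csv_text):
    """Parse CSV text into a list of dicts keyed by the file's own header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def fetch_repository_list(url=RAW_URL, timeout=10.0):
    """Download the list and parse it (hypothetical helper, needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return parse_repository_list(text)
```

Calling `fetch_repository_list()` and inspecting the keys of the first returned row shows which columns the maintainers currently record.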

Permanent Identifiers: Digital Object Identifiers (DOI) et al

A sufficient, but not necessary, criterion for a “trusted repository” is the assignment of permanent identifiers, such as Digital Object Identifiers (DOIs).

https://doi.org/10.3886/ICPSR30261.v6

Some repositories (often university-based ones) will also assign handles:

https://hdl.handle.net/1813/45789

Others assign DOIs on demand. We generally suggest requesting a DOI if possible. Examples:

However, take care when using permanent identifiers: the URL in the address bar is (almost) never the same as the DOI or handle. Every permanent identifier is a redirect, that is, a permanent entry that points to wherever the most recent version of the object can be found:

Only the first entry in each of the examples above should be used for citing, not the second.
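The distinction between a permanent identifier and the address-bar URL can be sketched in code. The example below is a rough illustration, not a prescribed tool: `permanent_url` and `landing_page` are hypothetical helper names, and the prefix lists are assumptions about how identifiers are commonly written. The first function turns a DOI or handle into the citable resolver URL; the second follows the redirect to show where the identifier currently points.

```python
import urllib.request

# Assumed spellings under which DOIs and handles commonly appear.
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/",
                "https://dx.doi.org/", "http://dx.doi.org/", "doi:")
HANDLE_PREFIXES = ("https://hdl.handle.net/", "http://hdl.handle.net/", "hdl:")


def permanent_url(identifier):
    """Return the citable resolver URL for a DOI or handle."""
    text = identifier.strip()
    lowered = text.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            return "https://doi.org/" + text[len(prefix):]
    for prefix in HANDLE_PREFIXES:
        if lowered.startswith(prefix):
            return "https://hdl.handle.net/" + text[len(prefix):]
    # A bare DOI starts with the "10." directory indicator.
    if text.startswith("10.") and "/" in text:
        return "https://doi.org/" + text
    raise ValueError(f"not recognised as a DOI or handle: {identifier!r}")


def landing_page(url, timeout=10.0):
    """Follow redirects and return the final URL (needs network access).

    Some servers reject HEAD requests; this sketch does not handle that case.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()
```

For citing, use the value returned by `permanent_url`. The value returned by `landing_page` is what a browser would show in the address bar, and it may change whenever the repository reorganizes its site.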

NOT ACCEPTABLE

A variety of (unfortunately) commonly used web-accessible locations are not acceptable as data repositories for the purpose of an article’s supplementary materials:

Some examples

“Immigration Restrictions as Active Labor Market Policy: Evidence from the Mexican Bracero Exclusion, Replication files and raw data” (Michael Clemens)

  • Hosted on Harvard Dataverse at https://dataverse.harvard.edu/dataverse/bracero
  • Contains two datasets:
    • Clemens, Michael, 2017, “Raw scanned PDFs of primary sources for workers, wages, and crops”, https://doi.org/10.7910/DVN/DJHVHB, Harvard Dataverse, V1
    • Clemens, Michael, 2018, “Replication Data for: Immigration Restrictions as Active Labor Market Policy: Evidence from the Mexican Bracero Exclusion”, https://doi.org/10.7910/DVN/17M4ZP, Harvard Dataverse, V1

“United States Newspaper Panel, 1869-2004” (Gentzkow, Shapiro, Sinkinson)

  • Hosted on ICPSR at https://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/30261
  • Contains
    • Gentzkow, Matthew, Shapiro, Jesse M., and Sinkinson, Michael. United States Newspaper Panel, 1869-2004. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2014-12-10. https://doi.org/10.3886/ICPSR30261.v6

“Socioeconomic High-resolution Rural-Urban Geographic Dataset for India (SHRUG)” (Asher and Novosad)

Challenges in Hosting of Data and Code at Restricted-Access Data Centers

Users of restricted-access data centers (RADC, such as FSRDCs, CASD, etc.) face certain challenges in the handling of data and code as described in this document:

A few guidelines

Self-generated repositories

If a RADC has at least an archival or backup policy of sufficient length (e.g., 10 or more years), but does not offer a formal repository, then the following procedure allows users to find and request code and data:

Some examples