Guidance for authors wishing to create data and code supplements, and for replicators.
Journals offer guidance on citation; you may want to check there first. The guidance provided here is supplementary, and may become the basis for future official guidance.
Properly referencing data goes beyond just reproducibility - it is also proper scientific writing style. In the same way that authors use bibliographic references to “printed” resources, they should also be using such references for data resources, to give and receive credit where credit is due. Not referencing an article or book is at best an oversight, and at worst plagiarism - and the same should apply to data objects. Numerous guides and tutorials exist, some of which provide a variety of examples.
In a nutshell, every dataset is to be cited. This is true for the main article as well as online appendices. In the past, use of data or code has been acknowledged in footnotes, and only rarely through bibliographic references. However, if the dataset is used, it should appear in the bibliography. The same is true for code reused from previous papers, or provided by authors.
The DOI is thus public, and all repositories will provide a suggested citation. One can also use https://www.doi2bib.org/ or https://citation.crosscite.org/ to get a citation (see below for additional tools).
This is trickier. The data does not necessarily have a title that is related to the paper. Some repositories allow authors to “reserve” a DOI (Zenodo) or to delay publication. For some repositories, the DOI, while not officially reserved, can be derived from information already available (see this FAQ for openICPSR, something similar may be possible at Dataverse).
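As a rough illustration of deriving a DOI before publication, the sketch below builds an openICPSR DOI from a project number and version. The pattern `10.3886/E<project>V<version>` is an assumption here and should be checked against the openICPSR FAQ; the function names and the project number `123456` are hypothetical.

```python
def openicpsr_doi(project_number: int, version: int = 1) -> str:
    """Build a candidate openICPSR DOI.

    ASSUMPTION: openICPSR DOIs follow the pattern 10.3886/E<project>V<version>.
    Verify against the repository's FAQ before citing.
    """
    if project_number <= 0 or version <= 0:
        raise ValueError("project_number and version must be positive integers")
    return f"10.3886/E{project_number}V{version}"


def doi_url(doi: str) -> str:
    """Turn a bare DOI into its resolvable https://doi.org/ form."""
    return f"https://doi.org/{doi}"


# Hypothetical project number, for illustration only.
candidate = openicpsr_doi(123456, version=1)
print(doi_url(candidate))
```

The DOI produced this way will not resolve until the deposit is actually published, so it is best treated as a placeholder in the manuscript until then.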
In some cases, authors may be able to delay publication, and coordinate it with the publication of the article (openICPSR, possibly Zenodo).
In all cases, data and code should be cited in the main manuscript. They should also be referenced in the data availability statement (some journals) or the README (other journals).
Many journals in economics require the Chicago style for citations and bibliographies. However, the Chicago Manual of Style provides few relevant examples for data citations. Applications like Zotero and Mendeley Desktop also do not support data citations robustly, even though the underlying Citation Style Language has had the concept of a “data” entry for several years.
DataONE suggests content and style that resemble the generic working paper or article citation style (adapted to Chicago style):
Westbrook JW, Kitajima K, Burleigh JG, Kress WJ, Erickson DL, Wright SJ (2011) Data from: What makes a leaf tough? Patterns of correlated evolution between leaf toughness traits and demographic rates among 197 shade-tolerant woody species in a neotropical forest. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.8525
ICPSR notes that a citation should include the following items:
and provides a few examples, with some additional modifiers:
Esther Duflo; Rohini Pande, 2006, “Dams, Poverty, Public Goods and Malaria Incidence in India”, http://hdl.handle.net/1902.1/IOJHHXOOLZ UNF:5:obNHHq1gtV400a4T+Xrp9g== Murray Research Archive [Distributor] V2 [Version]
Finally, the AEA style guide suggests
Leiss, Amelia. 1999. “Arms Transfers to Developing Countries, 1945–1968.” Inter-University Consortium for Political and Social Research, Ann Arbor, MI. ICPSR05404-v1. doi:10.3886/ICPSR05404 (accessed February 8, 2011).
This may be adjusted. An alternate citation may be
Leiss, Amelia. 1999. “Arms Transfers to Developing Countries, 1945–1968.” Inter-University Consortium for Political and Social Research, Ann Arbor, MI. ICPSR05404-v1. https://doi.org/10.3886/ICPSR05404.
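The elements of such a citation can be assembled mechanically. The following is a minimal sketch, not an official formatter: the `DataCitation` class and `format_citation` function are hypothetical names, and the field order simply mirrors the alternate citation shown above.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Hypothetical container for the elements of a data citation."""
    author: str
    year: str
    title: str
    distributor: str
    identifier: str = ""  # e.g. a study number and version
    doi: str = ""         # bare DOI, without the https://doi.org/ prefix


def format_citation(c: DataCitation) -> str:
    """Join the elements in author-date order, ending with the DOI as a URL."""
    parts = [f"{c.author}.", f"{c.year}.", f"“{c.title}.”", f"{c.distributor}."]
    if c.identifier:
        parts.append(f"{c.identifier}.")
    if c.doi:
        parts.append(f"https://doi.org/{c.doi}.")
    return " ".join(parts)


example = DataCitation(
    author="Leiss, Amelia",
    year="1999",
    title="Arms Transfers to Developing Countries, 1945–1968",
    distributor="Inter-University Consortium for Political and Social Research, Ann Arbor, MI",
    identifier="ICPSR05404-v1",
    doi="10.3886/ICPSR05404",
)
print(format_citation(example))
```

Keeping the DOI as a bare identifier and adding the `https://doi.org/` prefix only at formatting time makes it easy to switch between the `doi:` and URL forms.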
Citations to primary data can sometimes be hard to construct. If the data provider has a suggested citation, then you should use it. Alternatively, you can construct one following the examples above, for instance, as shown in the AEA style guide:
Bureau of Labor Statistics. 2000–2010. “Current Employment Statistics: Colorado, Total Nonfarm, Seasonally adjusted - SMS08000000000000001.” United States Department of Labor. http://data.bls.gov/cgi-bin/surveymost?sm+08 (accessed February 9, 2011).
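When no persistent identifier exists, the URL and the access date carry the citation. A small sketch of building such an entry follows; `format_url_citation` is a hypothetical helper, and the month name relies on Python's default (English) locale.

```python
from datetime import date


def format_url_citation(author: str, years: str, title: str,
                        publisher: str, url: str, accessed: date) -> str:
    """Format a citation for data identified only by a URL and an access date."""
    # strftime('%B') gives the month name in the current locale (English by default).
    accessed_text = f"{accessed.strftime('%B')} {accessed.day}, {accessed.year}"
    return f"{author}. {years}. “{title}.” {publisher}. {url} (accessed {accessed_text})."


bls = format_url_citation(
    author="Bureau of Labor Statistics",
    years="2000–2010",
    title=("Current Employment Statistics: Colorado, Total Nonfarm, "
           "Seasonally adjusted - SMS08000000000000001"),
    publisher="United States Department of Labor",
    url="http://data.bls.gov/cgi-bin/surveymost?sm+08",
    accessed=date(2011, 2, 9),
)
print(bls)
```

Recording the access date when the data are downloaded, rather than when the paper is written, keeps the citation accurate if the provider later revises the series.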
No tool is perfect, but you will likely find one that works for your particular workflow. Some examples for specific software:
There are several websites that will create a (data) citation from any DOI. Give these a try:
DOI Citation Formatter at citation.crosscite.org/ is supported by the DOI Registration Agencies. For instance, using 10.3886/ICPSR05404 from the above example, it generates (as of 2022-01-16):
Leiss, Amelia. 1984. “Arms Transfers to Developing Countries, 1945-1968.” ICPSR - Interuniversity Consortium for Political and Social Research. doi:10.3886/ICPSR05404.
Leiss, A. (1984). Arms Transfers to Developing Countries, 1945-1968 [Data set]. ICPSR - Interuniversity Consortium for Political and Social Research. https://doi.org/10.3886/ICPSR05404
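The same kind of formatted citation can usually be requested programmatically through DOI content negotiation, by asking `https://doi.org/` for the `text/x-bibliography` media type with a citation style name. The sketch below uses only the Python standard library; the function names are hypothetical, and the style names `chicago-author-date` and `apa` are assumed to be available on the resolver's side.

```python
import urllib.request
from urllib.parse import quote


def build_citation_request(doi: str, style: str = "chicago-author-date",
                           locale: str = "en-US") -> urllib.request.Request:
    """Prepare (but do not send) a content-negotiation request for a DOI."""
    url = "https://doi.org/" + quote(doi, safe="/")
    accept = f"text/x-bibliography; style={style}; locale={locale}"
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_citation(doi: str, style: str = "chicago-author-date",
                   timeout: float = 10.0) -> str:
    """Send the request and return the formatted citation as text."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Requires network access; the returned text may change over time.
    print(fetch_citation("10.3886/ICPSR05404", style="apa"))
```

As with the web tools, the result depends on the metadata registered for the DOI, so it is worth checking the output against the repository's own suggested citation.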
doi2bib at doi2bib.org/ will generate a BibTeX entry from a DOI (for instance from 10.3886/ICPSR05404):
@misc{https://doi.org/10.3886/icpsr05404,
doi = {10.3886/ICPSR05404},
url = {http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/5404},
author = {Leiss, Amelia},
language = {en},
title = {Arms Transfers to Developing Countries, 1945-1968},
publisher = {ICPSR - Interuniversity Consortium for Political and Social Research},
year = {1984}
}
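If the BibTeX entry needs to be checked or reused outside LaTeX, its fields can be pulled out with a few lines of code. This is a deliberately simple sketch that assumes one `key = {value}` field per line, as in the entry above; it is not a general BibTeX parser, and `parse_bibtex_fields` is a hypothetical name.

```python
import re

# Matches lines such as:  author = {Leiss, Amelia},
FIELD_RE = re.compile(r"^\s*(\w+)\s*=\s*\{(.*)\}\s*,?\s*$")


def parse_bibtex_fields(entry: str) -> dict:
    """Extract key/value pairs from a BibTeX entry with one field per line."""
    fields = {}
    for line in entry.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1).lower()] = match.group(2)
    return fields


entry = """@misc{https://doi.org/10.3886/icpsr05404,
  doi = {10.3886/ICPSR05404},
  url = {http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/5404},
  author = {Leiss, Amelia},
  language = {en},
  title = {Arms Transfers to Developing Countries, 1945-1968},
  publisher = {ICPSR - Interuniversity Consortium for Political and Social Research},
  year = {1984}
}"""

fields = parse_bibtex_fields(entry)
print(fields["author"], fields["year"], fields["doi"])
```

A quick check like this makes it easy to spot discrepancies, such as a publication year that differs from the one in a hand-written citation.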
When the object does not have a DOI, it isn’t really that much more complicated, but little guidance is available.
Citation Machine - choose Source type = Digital file, and put the URL and access date into the Annotation field.
Try our own tool: