Data and Code Guidance by Data Editors

Guidance for authors wishing to create data and code supplements, and for replicators.

Requested Information for Data

Overview

We request the following information from all articles:

Details follow.

Generic Diagram

graph LR
    subgraph "Data Provider"
        DB[(Data source)]
    end
    DB -.- E>Extract] -.-> A
    subgraph Researcher
        A((Input data)) --> B{Further analysis}
    end

Data description

Practical guidance: Data description

The data description doesn’t need to be complicated, but it should be complete. For more extensive descriptions, best practices, and possible support, see

Other services may be available at your own institution. Some services may charge a fee, or only be available to researchers affiliated with that institution.

We provide a few suggestions for how to do this in Stata, R, and SAS. However, if you have the ability to do a more robust data description, you should. See a self-citing example.
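As a language-neutral illustration of what a minimal data description might contain, the following Python sketch builds a small codebook from tabular records using only the standard library. The function name `describe_variables`, the rule that empty strings count as missing, and the sample records are assumptions made for this example. They are not part of the guidance itself, and the Stata, R, and SAS suggestions remain the reference.

```python
from statistics import mean


def _as_number(value):
    """Return value as a float, or None if it cannot be read as a number."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def describe_variables(rows):
    """Build a minimal codebook: one summary entry per variable.

    rows is a list of dicts (for example, from csv.DictReader).
    Assumption for this sketch: None and "" both count as missing.
    """
    # Collect variable names in order of first appearance.
    variables = []
    for row in rows:
        for name in row:
            if name not in variables:
                variables.append(name)

    codebook = {}
    for name in variables:
        present = [row.get(name) for row in rows if row.get(name) not in (None, "")]
        numbers = [_as_number(value) for value in present]
        entry = {
            "n_obs": len(rows),
            "n_missing": len(rows) - len(present),
            "n_distinct": len(set(present)),
        }
        # A variable is treated as numeric only if every non-missing value parses.
        if present and all(number is not None for number in numbers):
            entry.update(type="numeric", min=min(numbers), max=max(numbers), mean=mean(numbers))
        else:
            entry["type"] = "text"
        codebook[name] = entry
    return codebook


if __name__ == "__main__":
    # Hypothetical records, invented for illustration only.
    sample = [
        {"id": "1", "wage": "12.5", "crop": "cotton"},
        {"id": "2", "wage": "", "crop": "tomato"},
        {"id": "3", "wage": "15.5", "crop": "cotton"},
    ]
    for name, entry in describe_variables(sample).items():
        print(name, entry)
```

A table like this, together with a plain-language label for each variable, is often enough for a reader to judge what the data contain. A more robust description would also record units, value labels, and provenance.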

Data access description

The description of data access should provide enough information so that an uninformed user could theoretically access the data. Data access descriptions are often referred to as “Data Availability Statements”.

In the flow diagram above, it answers the question: who can access the “Data source” at the “Data Provider” and create an extract?

The access should be persistent, i.e., not rely on a transitory website or the presence of a particular person who might change jobs at any time.
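One way to check that a data access description is complete is to treat it as a small structured record and render the statement from it. The Python sketch below does this under stated assumptions: the class `DataAccess`, its field names, and the example values are hypothetical and chosen for illustration. They do not represent a required template.

```python
from dataclasses import dataclass


@dataclass
class DataAccess:
    """Hypothetical record of what a reader needs in order to obtain one dataset."""

    dataset: str
    provider: str
    who_can_access: str
    how_to_request: str
    persistent_location: str = ""  # e.g. a DOI or archive landing page; empty if none

    def missing_fields(self):
        """Names of fields left empty, so gaps are visible before submission."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def statement(self):
        """Render a short plain-text data access description."""
        location = self.persistent_location or "no persistent location identified"
        return (
            f"{self.dataset} is provided by {self.provider}. "
            f"Access: {self.who_can_access}. "
            f"To obtain the data: {self.how_to_request}. "
            f"Location: {location}."
        )


if __name__ == "__main__":
    # All values below are placeholders invented for this example.
    access = DataAccess(
        dataset="Example household survey",
        provider="Example statistical agency",
        who_can_access="researchers with an approved project",
        how_to_request="submit the provider's application form",
    )
    print(access.statement())
    print("Still to fill in:", access.missing_fields())
```

An empty `persistent_location` is flagged rather than silently accepted, which mirrors the point that access should not depend on a transitory website or a particular person.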

Practical guidance: data access

For additional sample “Data Access Descriptions”, see

Data persistence

Data should remain available for a sufficiently long time.

See the page on Requested information about hosting for more details on data repositories.

What is a data provider

A “data provider” in this sense can be

The author may also be the data provider, for instance, because the author conducted the survey used in the article. However, in many cases, the data provider may not be a data archive (see the page on Requested information about hosting).

Practical guidance: data provider and data archives

If the data provider is not an archive (i.e., the data persistence is insufficient, and data might go away), you should investigate depositing the data at a data archive.

Planning Ahead

Many of the items above can be planned ahead of time. In fact, funding agencies require data management plans, and the items above are core elements of such plans.

Consider from the start what is needed to share the data.

If collecting data, consider early on which access restrictions are genuinely needed, and impose no more than those.

Consider breaking your data archives into manageable pieces. For instance, raw data might be in a different location/repository/archive than the analysis data, even if you are providing access to both for the purpose of replication. Examples:

Clemens, Michael, 2017, “Raw scanned PDFs of primary sources for workers, wages, and crops”, https://doi.org/10.7910/DVN/DJHVHB, Harvard Dataverse, V1