Sample code to generate descriptive statistics
Generating codebooks
Stata
In Stata, the native ‘codebook’ command can generate such information:
// Stata
use my_input_data
describe
codebook
See code/01_codebook_fancy.md for a fancier example, and code/02_codebook_plaintext.md for the code and output from the simpler example.
R
In R, the dataMaid package [1], [2] can accomplish a similar task:
# use the dataMaid package
library(dataMaid)
makeCodebook(my_input_data)
See code/03_codebook_dataMaid for an example.
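If installing an extra package is not an option, a rough codebook can also be assembled with base R alone. The following is a minimal sketch, not a replacement for dataMaid: `simple_codebook` is a hypothetical helper written for this illustration, and the built-in `iris` data stands in for `my_input_data`.

```r
# Hypothetical helper (not part of dataMaid): a rough codebook from base R only.
# For each variable it reports the class, the number of missing values,
# and the number of distinct values.
simple_codebook <- function(df) {
  data.frame(
    variable  = names(df),
    class     = vapply(df, function(x) class(x)[1], character(1)),
    n_missing = vapply(df, function(x) sum(is.na(x)), integer(1)),
    n_unique  = vapply(df, function(x) length(unique(x)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Assumption: the built-in iris data stands in for my_input_data.
cb <- simple_codebook(iris)
print(cb)
```

The result is itself a data frame, so it can be written out with `write.csv()` and shared alongside the code.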
SAS
In SAS, PROC CONTENTS and PROC MEANS may very well provide all that is needed:
proc contents data=my_input_data;
run;
proc means data=my_input_data;
run;
See code/04_codebook_SAS for an example.
Creating “zero-obs” datasets
Alternatively, you can just provide an empty file that replicates the structure (schema) of your data. Often, this can be achieved by simply setting the number of observations to zero.
Stata
// Stata
use my_input_data
keep if 0
save zero_input_data, replace
Example:
. sysuse auto
(1978 automobile data)

. keep if 0
(74 observations deleted)

. desc

Contains data from /usr/local/stata/ado/base/a/auto.dta
 Observations:             0                  1978 automobile data
    Variables:            12                  13 Apr 2024 17:45
                                              (_dta has notes)
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
make            str18   %-18s                 Make and model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear ratio
foreign         byte    %8.0g      origin     Car origin
-------------------------------------------------------------------------------
Sorted by: foreign
     Note: Dataset has changed since last saved.
R
# Read the RDS file into an R object
my_data <- readRDS("path/to/my_input_data.rds")
# Create a new, empty data frame with the same structure
zero_obs_df <- my_data[FALSE, ]
# To verify, check the dimensions
dim(zero_obs_df)
# [1] 0 X
# Save the empty data frame to a new RDS file
saveRDS(zero_obs_df, "path/to/zero_input_data.rds")
Alternatively, using the dplyr package:
library(dplyr)
my_data <- readRDS("path/to/my_input_data.rds")
zero_obs_df <- slice(my_data, 0)
saveRDS(zero_obs_df, "path/to/zero_input_data.rds")
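Before sharing a zero-observation file, it can be worth confirming that it really carries the same schema as the original. The sketch below is one way to check this in base R; it assumes the built-in `iris` data as a stand-in for `my_input_data`, and the names `full_df` and `zero_df` are illustrative only.

```r
# Assumption: the built-in iris data stands in for my_input_data.
full_df <- iris
zero_df <- full_df[FALSE, ]

# No observations remain...
stopifnot(nrow(zero_df) == 0)
# ...but variable names and classes are unchanged.
stopifnot(identical(names(zero_df), names(full_df)))
stopifnot(identical(sapply(zero_df, class), sapply(full_df, class)))

# Factor levels are part of the structure and are retained as well.
print(levels(zero_df$Species))

# Compact overview of the empty data frame
str(zero_df)
```

Because factor levels survive the subsetting, the empty file still lists every category label of each factor variable.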
SAS
data zero_input_data;
set my_input_data (obs=0);
run;
or
proc sql;
create table zero_input_table like my_input_table;
quit;