2.2 Metadata

Now that your scripts are documented, the next step is to document their inputs and outputs.

Ideally, all non-original data can be recreated from your R code, with the steps collected in a master script such as a Makefile or bash script. For now we’ll keep it simple and just document the process.
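
For when you do want a master script, a minimal bash sketch might look like the following. The function name `run_pipeline` and the `RUNNER` variable are hypothetical names chosen for illustration; the only real tool assumed is `Rscript`, which runs an R script from the command line.

```shell
#!/usr/bin/env bash
# Hypothetical master script helper: runs each script passed as an argument, in order.
# RUNNER defaults to Rscript; set RUNNER=echo for a dry run that only prints the paths.

run_pipeline() {
    local runner="${RUNNER:-Rscript}"
    local script
    for script in "$@"; do
        echo "running: $script"
        # Stop at the first script that fails so later steps do not use stale data.
        "$runner" "$script" || return 1
    done
}
```

Calling `run_pipeline ./src/dan/web/scraping.R ./src/dan/amazing/plot_code.R` would run both scripts one after the other. The order shown here is an assumption; list your own scripts in the order their outputs are needed.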

2.2.0.1 Datasets

For each input dataset, note what it contains, where it came from, and what you used it for. You can use a format like this:

- my_awesome_dataset.csv
    - ./data/folder/original/awesome/my_awesome_dataset.csv
    - contains data about how awesome the various datasets are
    - used to calculate the 'awesome' metric
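
If you have many datasets, a small helper can print entries in this format so they stay consistent. This is only a sketch: `dataset_entry` is a hypothetical function name, and it takes the path, contents, and purpose as plain arguments.

```shell
# Hypothetical helper: prints one metadata entry in the list format shown above.
# Arguments: 1) path to the dataset, 2) what it contains, 3) what it was used for.
dataset_entry() {
    local path="$1" contents="$2" purpose="$3"
    # The leading "--" keeps printf from reading the "-" in the format as an option.
    printf -- '- %s\n' "$(basename "$path")"
    printf '    - %s\n' "$path"
    printf '    - %s\n' "$contents"
    printf '    - %s\n' "$purpose"
}
```

For example, `dataset_entry ./data/folder/original/awesome/my_awesome_dataset.csv "contains data about how awesome the various datasets are" "used to calculate the 'awesome' metric"` prints the entry above, ready to paste into your documentation.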

We use original, working, and final folders inside our data folder. Entries for working and final datasets should also list the script that produces them.

- web_scraped_data.RData
    - data scraped from the web
    - comes from ./src/dan/web/scraping.R

2.2.0.2 Reports/Posters/etc

Create a doc folder at the top level of the project. Whatever the ‘final’ version of your poster or report is, it should live here and be checked in.

2.2.0.3 Figures and Tables

Each figure and table used in the poster should list the script it comes from. The exact method for regenerating that figure or table should be thoroughly documented.

- ./output/poster_fig_1.png
    - scatter plot for the amazing data
    - generated by: `./src/dan/amazing/plot_code.R`
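
One way to catch figures that never got an entry is to compare the output folder against your documentation file. The sketch below assumes the entries live in a single text file; `undocumented_outputs` and the file name `metadata.md` used in the example call are hypothetical.

```shell
# Hypothetical check: prints every file in an output folder whose name
# does not appear anywhere in the metadata file.
# Arguments: 1) output folder, 2) metadata file.
undocumented_outputs() {
    local output_dir="$1" metadata_file="$2"
    local file
    for file in "$output_dir"/*; do
        # Skip subfolders and the unexpanded pattern of an empty folder.
        [ -f "$file" ] || continue
        # -F matches the file name literally; -q keeps only the exit status.
        if ! grep -qF -- "$(basename "$file")" "$metadata_file"; then
            echo "$file"
        fi
    done
}
```

Running `undocumented_outputs ./output metadata.md` lists the files that still need an entry. It matches on file name only, so it cannot tell whether the listed script is the right one.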