2.2 Metadata
Now that your scripts are documented, you need to start documenting the inputs and outputs of those scripts.
Ideally, all the non-original data can be recreated from your `R` code, with those instructions placed in a master script file such as a `Makefile` or `bash` script.
But we’ll keep it simple for now, and just document the process.
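For reference, the master-script idea mentioned above could be sketched in R roughly as follows. This is only an illustration: the file name `run_all.R`, the helper `run_scripts()`, and the order of the scripts are assumptions, and the two script paths are borrowed from the examples later in this section.

```r
# Hypothetical master script (e.g. run_all.R).
# The script list and its order are assumptions for illustration.
scripts <- c(
  "./src/dan/web/scraping.R",      # produces working data
  "./src/dan/amazing/plot_code.R"  # produces a poster figure
)

# Run each script in order, skipping any that do not exist.
# Returns a named logical vector: TRUE where the script was run.
run_scripts <- function(paths) {
  ran <- logical(length(paths))
  for (i in seq_along(paths)) {
    if (file.exists(paths[i])) {
      message("Running ", paths[i])
      # Each script gets its own environment so variables do not leak.
      source(paths[i], local = new.env())
      ran[i] <- TRUE
    } else {
      message("Skipping missing script: ", paths[i])
    }
  }
  stats::setNames(ran, paths)
}

status <- run_scripts(scripts)
```

Running the scripts from one place like this records the order in which they need to be run, which is part of what the written documentation below captures by hand.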
2.2.0.1 Datasets
For each input dataset, list what the dataset contains, where it came from, and what you used it for. You can use a format like this:
- my_awesome_dataset.csv
- ./data/folder/original/awesome/my_awesome_dataset.csv
- contains data about how awesome the various datasets are
- used to calculate the 'awesome' metric
We use the `original`/`working`/`final` folders in our data folder.
Entries for datasets in `working` and `final` should also list the script that produces them.
- web_scraped_data.RData
- data scraped from the web
- comes from ./src/dan/web/scraping.R
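If you would rather generate these entries than type them, a small helper can keep the format consistent. The sketch below is one possible approach, not a required tool: `dataset_entry()` and its argument names are hypothetical, and it simply reproduces the list layout shown above.

```r
# Hypothetical helper: format one dataset entry in the list style used above.
# Only `name` and `description` are required; the other fields are optional.
dataset_entry <- function(name, description, path = NULL,
                          used_for = NULL, script = NULL) {
  details <- c(
    path,
    description,
    if (!is.null(used_for)) paste("used to", used_for),
    if (!is.null(script)) paste("comes from", script)
  )
  paste("-", c(name, details))
}

# A derived dataset: record the script that produces it.
entry <- dataset_entry(
  name = "web_scraped_data.RData",
  description = "data scraped from the web",
  script = "./src/dan/web/scraping.R"
)
writeLines(entry)
```

The same function covers input datasets by passing `path` and `used_for` instead of `script`.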
2.2.0.2 Reports/Posters/etc
Create a `doc` folder at the top level.
Whatever your ‘final’ version of a poster or report is should be placed here and checked in.
2.2.0.3 Figures and Tables
Each figure and table used in the poster should list the script it comes from. The exact method for regenerating that figure should be thoroughly documented:
- ./output/poster_fig_1.png
- scatter plot for the amazing data
- generated by: `./src/dan/amazing/plot_code.R`
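To make a figure easy to regenerate, the script itself can state its input and output at the top and write the file to the documented location. The following is a minimal sketch of what such a script might contain. The function name `make_poster_fig_1()`, the plot settings, and the use of the built-in `cars` dataset as a stand-in for the real data are all assumptions.

```r
# Hypothetical contents of a figure script such as ./src/dan/amazing/plot_code.R
# Input:  the built-in `cars` dataset (a stand-in for the real data)
# Output: ./output/poster_fig_1.png

make_poster_fig_1 <- function(out_path = "./output/poster_fig_1.png") {
  # Create the output folder if it does not exist yet.
  dir.create(dirname(out_path), recursive = TRUE, showWarnings = FALSE)

  # Open a PNG device and make sure it is closed even if plotting fails.
  png(out_path, width = 800, height = 600)
  on.exit(dev.off(), add = TRUE)

  plot(
    cars$speed, cars$dist,
    xlab = "Speed", ylab = "Stopping distance",
    main = "Scatter plot (placeholder data)"
  )

  invisible(out_path)
}

fig_path <- make_poster_fig_1()
```

With a script like this, the regeneration instructions can be as short as "run `Rscript ./src/dan/amazing/plot_code.R` from the project root".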