Project Setup

We will start by setting up a simple example of a reproducible report.

Create Quarto Project

First, we will need to create a new Quarto project.

If you haven’t already, open RStudio – see Note 1 for how to use the terminal instead. Then, click on File > New Project… to open the New Project Wizard.

Here, select New Directory and then choose the project type Quarto Project.

Finally, enter the name of the directory in which our report will be created, for example code-publishing-exercise.

As we will use Git to track the version history of files, be sure to check Create a git repository. If you don’t know what Git is, have a look at the tutorial “Introduction to version control with git and GitHub within RStudio”.

renv: A dependency management toolkit for R

We will also use the package renv to track the R packages our project depends on. This makes it easier for others to see which packages were used and to obtain the same versions later. Therefore, make sure that the box Use renv with this project is checked. If this is the first time you are hearing about renv, have a look at the tutorial “Introduction to {renv}”.

If you are already familiar with Markdown and Quarto, you can uncheck the box Use visual markdown editor.

Click on Create Project. Your RStudio window should now look similar to this:

The project `code-publishing-exercise` opened in RStudio. The source pane to the top left has a Quarto file open called "code-publishing-exercise.qmd". The console pane to the bottom left indicates by its output that renv is active in the current project. The environment pane to the top right indicates that the environment is currently empty. The output pane to the bottom right shows the files in the current project.

If, like in the image, a Quarto file with some demo content was opened automatically, you can close and delete it, for example, using RStudio’s file manager.

Make sure that your project is in a consistent state according to renv by running:

Console
renv::status()

If it reports packages that are not used, synchronize the lock file using:

Console
renv::snapshot()

Note 1

Without RStudio, one can create a Quarto project with version control and renv enabled by typing the following into a terminal:

Terminal
quarto create project default code-publishing-exercise
cd code-publishing-exercise/
rm code-publishing-exercise.qmd
git init
git checkout -b main

Then, one can open an R session by simply typing R into the terminal. Next, make sure that getwd() indicates that the working directory is code-publishing-exercise. Then, initialize renv:

Console
renv::init()

You are now ready to stage and commit your files. You can either stage files separately or the whole project folder at once.

In file paths, a period (.) means “the current directory”, while two periods (..) mean “the parent directory”. Therefore, git add . means “stage the current directory for committing”.

If you stage the whole folder at once, we recommend that you inspect the untracked changes first:

Terminal
git status

Since no commits have been made so far, this should include every file that is not covered by the .gitignore file. If everything can be staged for committing – as is the case in this tutorial – you can follow up with:

Terminal
git add .
git commit -m "Initial commit"

If you see a file you’d rather not commit, delete it or add its name to the .gitignore file. If you don’t check your changes before committing, you might accidentally commit something you’d rather not.

Tip 1

If git commit fails with the message Author identity unknown, you need to tell Git who you are. Run the following commands to set your name and email address:

Terminal
git config user.name "YOUR NAME"
git config user.email "YOUR EMAIL ADDRESS"

Then, commit again.

Decide on Structure

Before adding your project files, it is helpful to decide on a directory structure, that is, what to name each file and where to put it. In general, the directory structure should make a project easier to understand by breaking it into logical chunks. There is no single best solution, as a good structure depends on where a project’s complexity lies. However, it usually helps if the files and folders reflect the execution order. For example, if there are multiple data processing stages, one can distinguish input (raw data), intermediate (processed data), and output files (e.g., figures) and put them into separate folders. Similarly, the corresponding code files (e.g., preparation, modeling, visualization) can be prefixed with increasing numbers.
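As a rough illustration of such a layout, the following R sketch creates a stage-based structure with numbered code files inside a temporary directory. All folder and file names are hypothetical and only meant to show the idea; they are not part of this tutorial's project.

```r
# Hypothetical project layout, created in a temporary directory so that
# nothing in your actual project is touched
project <- file.path(tempdir(), "example-project")

# Folders reflect the processing stages: input, intermediate, output
dirs <- c("data/raw", "data/processed", "output/figures", "code")
for (d in dirs) {
  dir.create(file.path(project, d), recursive = TRUE, showWarnings = FALSE)
}

# Code files are prefixed with increasing numbers to reflect execution order
scripts <- c("01_prepare.R", "02_model.R", "03_visualize.R")
created <- file.create(file.path(project, "code", scripts))

# Show the resulting structure
list.files(project, recursive = TRUE, include.dirs = TRUE)
```

Whether such a split pays off depends on the size of the project; for a small analysis, a flat folder may be easier to navigate.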

Luckily, there are already numerous proposals for how to organize one’s project files, both general (e.g., Project TIER, 2021; Wilson et al., 2017) and specific to a particular programming language (e.g., Marwick et al., 2018; Vuorre & Crump, 2021) or journal (Vilhuber, 2021). We recommend that you follow the standards of your field.

For the purpose of this tutorial, we will provide you with a data set and a corresponding analysis. They are simple enough to be put together in the root folder of your project.

Add Data

You can now download the data set we have prepared for you and put it into your project folder: data.csv

palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data

The data set is from the package palmerpenguins (v0.1.1) and contains the recorded bill lengths and sex of penguins living on three islands in the Palmer Archipelago, Antarctica. It was made available under the license CC0 1.0.

When distributing a data set, it is important to document the meaning (e.g., units) and valid values of its variables. This is typically done with a data dictionary (also called a codebook). In the following, we will demonstrate how to create a simple data dictionary using the R package pointblank. You can install it now using:

pointblank: Data Validation and Organization of Metadata for Local and Remote Tables
Console
renv::install("pointblank")

You can put the code that follows for creating the data dictionary into a new file called create_data_dictionary.R.

First, we write down everything we know about the data set. This includes:

  • a general description of the data set
  • descriptions of all columns
  • valid values, where applicable

create_data_dictionary.R
table_info <- c(
  title = "palmerpenguins::penguins",
  description = "Size measurements for adult foraging penguins near Palmer Station, Antarctica"
)
descriptions <- c(
  species = "a character string denoting penguin species",
  island = "a character string denoting island in Palmer Archipelago, Antarctica",
  bill_length_mm = "a number denoting bill length (millimeters)",
  bill_depth_mm = "a number denoting bill depth (millimeters)",
  flipper_length_mm = "an integer denoting flipper length (millimeters)",
  body_mass_g = "an integer denoting body mass (grams)",
  sex = "a character string denoting penguin sex",
  year = "an integer denoting the study year"
)

vals <- list(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  island = c("Torgersen", "Biscoe", "Dream"),
  sex = c("male", "female"),
  year = c(2007, 2008, 2009)
)
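Once the valid values are written down, they can also be used to check the data themselves. The following base R sketch is not part of create_data_dictionary.R: the helper `find_invalid()` and the small example data frame are hypothetical and only illustrate the idea. With the real data, one would pass the result of `read.csv("data.csv")` instead.

```r
# Hypothetical stand-in for a few rows of data.csv
dat <- data.frame(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  island = c("Torgersen", "Biscoe", "Dream"),
  sex = c("male", "female", NA),
  year = c(2007L, 2008L, 2009L)
)

# Same valid values as documented above
vals <- list(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  island = c("Torgersen", "Biscoe", "Dream"),
  sex = c("male", "female"),
  year = c(2007, 2008, 2009)
)

# For each documented column, collect the values that are neither missing
# nor part of the valid set; columns without problems are dropped
find_invalid <- function(dat, vals) {
  invalid <- lapply(names(vals), function(col) {
    x <- dat[[col]]
    unique(x[!is.na(x) & !(x %in% vals[[col]])])
  })
  names(invalid) <- names(vals)
  Filter(length, invalid)
}

invalid <- find_invalid(dat, vals)
stopifnot("data contain undocumented values" = length(invalid) == 0)
```

Missing values are deliberately ignored here; whether they should count as invalid depends on the data set.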

Depending on the type of data, it may also be necessary to describe measurement instruments, sampling procedures, appropriate weighting, or contact information. In this case, as the data have already been published, we only store a reference to their source:

dat_source <- "Horst A, Hill A, Gorman K (2022). _palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data_. R package version 0.1.1, https://github.com/allisonhorst/palmerpenguins, <https://allisonhorst.github.io/palmerpenguins/>."

Then, we use pointblank to create a data dictionary with this information.

create_data_dictionary.R
vals <- sapply(vals, \(x) {
  paste0(
    "(",
    knitr::combine_words(x, and = " or ", before = "`", after = "`"),
    ")"
  )
})

dat <- read.csv("data.csv")

dict <- pointblank::create_informant(
  dat,
  tbl_name = NA,
  label = table_info[["title"]],
  lang = "en"
) |>
  pointblank::info_tabular(
    Description = table_info[["description"]],
    Source = dat_source
  ) |>
  pointblank::info_columns_from_tbl(stack(descriptions)[2:1]) |>
  pointblank::info_columns_from_tbl(stack(vals)[2:1]) |>
  pointblank::get_informant_report(
    title = "Data Dictionary for `data.csv`"
  )

dict

Data Dictionary for data.csv
palmerpenguins::penguins
data frame, Rows: 344, Columns: 8

Table

DESCRIPTION
Size measurements for adult foraging penguins near Palmer Station, Antarctica

SOURCE
Horst A, Hill A, Gorman K (2022). _palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data_. R package version 0.1.1, https://github.com/allisonhorst/palmerpenguins, <https://allisonhorst.github.io/palmerpenguins/>.

Columns

  • species (character): a character string denoting penguin species (`Adelie`, `Gentoo`, or `Chinstrap`)
  • island (character): a character string denoting island in Palmer Archipelago, Antarctica (`Torgersen`, `Biscoe`, or `Dream`)
  • bill_length_mm (numeric): a number denoting bill length (millimeters)
  • bill_depth_mm (numeric): a number denoting bill depth (millimeters)
  • flipper_length_mm (integer): an integer denoting flipper length (millimeters)
  • body_mass_g (integer): an integer denoting body mass (grams)
  • sex (character): a character string denoting penguin sex (`male` or `female`)
  • year (integer): an integer denoting the study year (`2007`, `2008`, or `2009`)

2024-12-11 13:18:36 UTC (< 1 s)

Finally, we can store the data dictionary inside an HTML file and put the HTML file into the project folder as well.

create_data_dictionary.R
pointblank::export_report(dict, filename = "data_dictionary.html")

For a more elaborate introduction to pointblank, you can read their Intro to Information Management.

One could go even further by making the information machine-readable in a standardized way. We provide an optional example of that in Note 2. If you want to learn more about the sharing of research data, have a look at the tutorial “FAIR research data management”.

Note 2

This example demonstrates how the title and description of the data set, the description of the variables, and their valid values are stored in a machine-readable way. As before, we also provide a reference to the source.

dat_source <- "https://allisonhorst.github.io/palmerpenguins/"

Generally, metadata are either stored embedded into the data or externally, for example, in a separate file. We will use the “frictionless data” standard, where metadata are stored separately. An alternative would be RO-Crate.

Specifically, one can use the R package frictionless to create a schema which describes the structure of the data.[2] For the purpose of the following code, the schema is just a nested list that we edit to include our own information. We also explicitly record in the schema that missing values are stored in the data file as NA and that the data are licensed under CC0 1.0. Finally, the package is used to create a metadata file that contains the schema.

Console
# Read data and create schema
dat_filename <- "data.csv"
dat <- read.csv(dat_filename)
dat_schema <- frictionless::create_schema(dat)

# Add descriptions to the fields
dat_schema$fields <- lapply(dat_schema$fields, \(x) {
  c(x, description = descriptions[[x$name]])
})

# Record valid values
dat_schema$fields <- lapply(dat_schema$fields, \(x) {
  if (x$name %in% names(vals)) {
    modifyList(x, list(constraints = list(enum = vals[[x$name]])))
  } else {
    x
  }
})

# Define missing values
dat_schema$missingValues <- c("", "NA")

# Create package with license info and write it
dat_package <- frictionless::create_package() |>
  frictionless::add_resource(
    resource_name = "penguins",
    data = dat_filename,
    schema = dat_schema,
    title = table_info[["title"]],
    description = table_info[["description"]],
    licenses = list(list(
      name = "CC0-1.0",
      path = "https://creativecommons.org/publicdomain/zero/1.0/",
      title = "CC0 1.0 Universal"
    )),
    sources = list(list(
      title = "CRAN",
      path = dat_source
    ))
  )
frictionless::write_package(dat_package, directory = ".")

This creates the metadata file datapackage.json in the current directory. Make sure it is located in the same folder as data.csv, as together they comprise a data package.

Having added the data and its documentation, one can view and record the utilized packages with renv

Console
renv::status()
renv::snapshot()

…and go through the commit routine:

Terminal
git status
git add .
git commit -m "Add data"

Add Code

So that you have some code with which to practice sharing, we have prepared a simple manuscript for you, alongside a bibliography file. The manuscript contains code together with a written narrative. Download the two files to your computer and put them into your project folder.

The manuscript explores differences in bill length between male and female penguins. Feel free to read through it.

As the manuscript uses some new packages, install them with:

Console
renv::install()

The manuscript also uses the Quarto extension “apaquarto”, which typesets documents according to the requirements of the American Psychological Association (2020). It can be installed in the project using the following command:

Terminal
quarto add --no-prompt wjschne/apaquarto

Tip 2: Not a Psychologist?

If you are not a psychologist, you can also skip installing apaquarto. If you installed it by accident, run quarto remove wjschne/apaquarto.

Note, however, that the file Manuscript.qmd we prepared for you uses apaquarto by default and you need to set a different format in the YAML header if you decide not to use apaquarto:

Manuscript.qmd
format:
  pdf:
    pdf-engine: lualatex
    documentclass: scrartcl
    papersize: a4

Also, you need to have a TeX distribution installed on your computer, which is used in the background to typeset PDF documents. A lightweight choice is TinyTeX, which can be installed with Quarto as follows:

Terminal
quarto install tinytex

You should now be able to render the document using Quarto:

Terminal
quarto render Manuscript.qmd

This should create a PDF file called Manuscript.pdf in your project folder.

Tip 3

If the PDF file cannot be created, try updating Quarto. Quarto comes bundled with RStudio, but apaquarto sometimes requires a more recent version.

With the code added, one can use renv again to view and record the new packages:

Console
renv::status()
renv::snapshot()

Tip 4

Always run renv::status() and resolve any inconsistencies before you commit code to your project. This way, every commit represents a working state of your project.

Finally, make your changes known to Git:

Terminal
git status
git add .
git commit -m "Add manuscript"

Warning 1: Beware of Credentials

Sometimes, a data analysis requires interaction with online services:

  • Data may be collected from social network sites using their APIs[3] or downloaded from a data repository, or
  • an analysis may be conducted with the help of AI providers.

In these cases, make sure that the code you check in to Git does not contain any credentials that are required for accessing these services. Instead, make use of environment variables which are defined in a location that is excluded from version control. When programming with R, you can define them in a file called .Renviron in the root of your project folder:

.Renviron
MY_FIRST_KEY="your_api_key_here"
MY_SECOND_KEY="your_api_key_here"

When you start a new session from the project root, the file is automatically read by R and the environment variables can be accessed using Sys.getenv():

query_api(..., api_key = Sys.getenv("MY_FIRST_KEY"))

Make sure that .Renviron is listed in your .gitignore file in order to exclude it from the Git repository. If you have already committed a file that contains credentials, you can follow the instructions by Chacon & Straub (2024) to remove it from the history.
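The following base R sketch shows one way to guard against the two most common slips: a missing environment variable and a .Renviron file that is not ignored. The helpers `get_key()` and `is_listed_in_gitignore()` are hypothetical names introduced for illustration, and the key is a dummy value set for the current session only.

```r
# Hypothetical helper: read a credential from an environment variable and
# stop with an informative message if it is not defined
get_key <- function(name) {
  key <- Sys.getenv(name, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable '", name, "' is not set. Define it in .Renviron.")
  }
  key
}

# Hypothetical helper: check whether a file name appears literally as a
# line in .gitignore (glob patterns are NOT interpreted by this check)
is_listed_in_gitignore <- function(file, gitignore = ".gitignore") {
  file.exists(gitignore) && file %in% trimws(readLines(gitignore, warn = FALSE))
}

# Demonstration with a dummy value set for this session only
Sys.setenv(MY_FIRST_KEY = "dummy-value")
api_key <- get_key("MY_FIRST_KEY")

# Returns TRUE only if a line reading exactly ".Renviron" exists
is_listed_in_gitignore(".Renviron")
```

Because the literal check does not understand patterns such as `*.Renviron`, running `git check-ignore -v .Renviron` in a terminal is the more reliable way to confirm that Git ignores the file.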

Coding Best Practices

Although we provide the code in this example for you, a few things remain to be said about best practices when it comes to writing code that is readable and maintainable.

  • Use project-relative paths. When you refer to a file within your project, write paths relative to your project root. For example, don’t write C:/Users/Public/Documents/my_project/images/result.png, instead write images/result.png.

  • Keep it simple. Add complexity only when you must. Whenever there’s a boring way to do something and a clever way, go for the boring way. If the code grows increasingly complex, refactor it into separate functions and files.

  • Don’t repeat yourself. Use variables and functions before you start to write (or copy-paste) the same thing twice.

  • Use comments to explain why you do things. The code already shows what you do. Use comments to summarize it and explain why you do it.

  • Don’t reinvent the wheel. With R, chances are that what you need to do is greatly facilitated by a package from one of many high-quality collections such as rOpenSci, r-lib, Tidyverse, or fastverse.

  • Think twice about your dependencies. Every dependency increases the risk of irreproducibility in the future. Prefer packages that are well-maintained and light on dependencies.[4] We also recommend reading “When should you take a dependency?” by Wickham & Bryan (2023).

  • Fail early, often, and noisily. Whenever you expect a certain state, use assertions to be sure. In R, you can use stopifnot() to make sure that a condition is actually true.

  • Test your code. Test your code with scenarios where you know what the result should be. Turn bugs you discovered into test cases. Use linting tools[5] to identify common mistakes in your code, for example, the R package lintr.

  • Read through a style guide and follow it. A style guide is a set of stylistic conventions that improve code quality. R users are advised to read Wickham’s (2022) “Tidyverse style guide” and use the R package styler. Python users may benefit from reading the “Style Guide for Python Code” by Rossum et al. (2013). And even if you don’t follow a style guide, be consistent.
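To make the advice on failing early and testing more concrete, here is a minimal base R sketch. The function `mean_difference()` is a hypothetical helper, not taken from the manuscript; it uses stopifnot() to check its inputs and is followed by a test with a known result.

```r
# Hypothetical helper: difference in means of x between exactly two groups
mean_difference <- function(x, group) {
  groups <- sort(unique(group[!is.na(group)]))

  # Fail early and noisily if the inputs are not what we expect
  stopifnot(
    "x must be numeric" = is.numeric(x),
    "x and group must have the same length" = length(x) == length(group),
    "group must contain exactly two groups" = length(groups) == 2
  )

  mean(x[which(group == groups[1])], na.rm = TRUE) -
    mean(x[which(group == groups[2])], na.rm = TRUE)
}

# Known-answer test: group "a" has mean 2, group "b" has mean 11
result <- mean_difference(c(1, 3, 10, 12), c("a", "a", "b", "b"))
stopifnot(isTRUE(all.equal(result, -9)))
```

For larger projects, such checks are usually moved into dedicated test files, but the principle stays the same.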

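This is only a brief summary, and there is much more to be learned about coding practices. If you want to dive deeper, we recommend the following resources:

  • Mineault & Nozawa (2021), “The good research code handbook”
  • Raymond (2003), “The art of UNIX programming”
  • UK Government Analytical Community (2020), “Quality assurance of code for analysis and research”
  • Wickham (2023), “Tidy design principles”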

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”

Fowler et al. (1999), p. 15

Cite Data and Software

If you rely on data or software by others in your research, the question arises whether and how to cite it in your publications.

Data

Put simply, all data relied upon should be cited to allow for precise identification and access. From the “eight core principles of data citation” by Starr et al. (2015), licensed under CC0 1.0:

Principle 1 – Importance: “Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.”

Principle 3 – Evidence: “In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.”

Principle 5 – Access: “Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.”

Principle 7 – Specificity and Verifiability: “Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific time slice, version and/or granular portion of data retrieved subsequently is the same as was originally cited.”

Now, add an appropriate citation for the data set to the manuscript. Does your citation adhere to the principles above?

As the data set is from the R package palmerpenguins, one can use the function citation() to display a suggested citation:

citation("palmerpenguins")
To cite palmerpenguins in publications use:

  Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer
  Archipelago (Antarctica) penguin data. R package version 0.1.0.
  https://allisonhorst.github.io/palmerpenguins/. doi:
  10.5281/zenodo.3960218.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {palmerpenguins: Palmer Archipelago (Antarctica) penguin data},
    author = {Allison Marie Horst and Alison Presmanes Hill and Kristen B Gorman},
    year = {2020},
    note = {R package version 0.1.0},
    doi = {10.5281/zenodo.3960218},
    url = {https://allisonhorst.github.io/palmerpenguins/},
  }

As this can only be run with the package palmerpenguins installed, you can also find a suggested citation on its website.

Copy the BibTeX entry to the file Bibliography.bib and add an identifier between @Manual{ and the comma, such that the entry’s first line reads @Manual{horst2020,. Then, add a sentence to the manuscript such as follows:

The analyzed data are by @horst2020.

Render the document to check that the citation is displayed properly.

Terminal
quarto render Manuscript.qmd

Software

When it comes to software, the answer is a little more nuanced due to the large number of involved dependencies. You can consult Figure 1 for general advice on whether to cite a particular piece of software. As with data, citations should allow for exact identification and access. From the six “software citation principles” by Smith et al. (2016), licensed under CC BY 4.0:

1. Importance: Software should be considered a legitimate and citable product of research. Software citations should be accorded the same importance in the scholarly record as citations of other research products, such as publications and data; they should be included in the metadata of the citing work, for example in the reference list of a journal article, and should not be omitted or separated. Software should be cited on the same basis as any other research product such as a paper or a book, that is, authors should cite the appropriate set of software products just as they cite the appropriate set of papers.

5. Accessibility: Software citations should facilitate access to the software itself and to its associated metadata, documentation, data, and other materials necessary for both humans and machines to make informed use of the referenced software.

6. Specificity: Software citations should facilitate identification of, and access to, the specific version of software that was used. Software identification should be as specific as necessary, such as using version numbers, revision numbers, or variants such as platforms.

In practice, the first step is to identify all pieces of software the project relies on. A few of them are obvious, such as R itself, Quarto, and the TeX distribution we installed before. Then there are the individual R packages, Quarto extensions, and TeX packages. All of them, in turn, may have dependencies, and it is up to you to decide when to stop digging deeper. For example, some R packages are only thin wrappers around other R packages or around system dependencies, which might also deserve credit. A system dependency is additional software that you require on your computer apart from the R package.

flowchart TB
  asks_for_citation("Does the software<br>ask you to cite it?")
  critical_or_unique_contribution("Did the software<br>play a critical part<br>in or contribute<br>something unique<br>to your research?")
  manipulate("Did the software<br>manipulate or create<br>your data, software,<br>or other outputs?")
  relies_on_credit("Do the authors of<br>the software rely<br>on academic credit<br>for funding?")
  cite[Cite!]
  nocite[Don't cite!]

  asks_for_citation --"Yes"--> cite
  asks_for_citation --"No"--> critical_or_unique_contribution
  critical_or_unique_contribution --"Yes"--> cite
  critical_or_unique_contribution --"No"--> manipulate
  manipulate --"Yes"--> cite
  manipulate --"No"--> relies_on_credit
  relies_on_credit --"Yes"--> cite
  relies_on_credit --"No"--> nocite

Figure 1: “Should I cite the software?” by Brown et al. (2016) licensed under CC BY-SA 4.0. Simplified from original.

Now, add references for the software you would like to cite to the manuscript. In the following, we will demonstrate this for R and all R packages by using the R package grateful. For arbitrary software, you can use the CiteAs service to create appropriate citations.

Add the following code chunk to the end of the discussion in the manuscript:

Manuscript.qmd
```{r}
#| echo: false

grateful::cite_packages(
    output = "paragraph",
    out.dir = ".",
    omit = NULL,
    dependencies = TRUE,
    passive.voice = TRUE,
    bib.file = "grateful-refs"
)
```

This will automatically create a paragraph citing all used packages and generate the bibliography file grateful-refs.bib.[6] Then, in the YAML header, add grateful-refs.bib to the bibliography as follows:

Manuscript.qmd
bibliography:
  - Bibliography.bib
  - grateful-refs.bib

Use renv to view, install, and record the newly used package:

Console
renv::status()
renv::install()
renv::snapshot()

Finally, render the document again and commit the changes:

Terminal
quarto render Manuscript.qmd

git status
git add .
git commit -m "Cite data and software"

The Last Mile

renv only records the versions of R packages and of R itself. This means that everything we chose not to cite in the previous step is not documented anywhere. We will cover system dependencies when creating a README. For now, however, there is one simple step you can take to record the version of Quarto (and a few other dependencies). Run the following:

Terminal
quarto use binder

This will create a few additional files which facilitate reconstructing the computational environment in the future.[7] As always, commit your changes:

Terminal
git status
git add .
git commit -m "Add repo2docker config"
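If you want to see which versions are currently in use on your machine, a sketch like the following can help. It relies only on base R and on the command `quarto --version`; the variable names are illustrative. If Quarto is not found on the search path, NA is recorded instead.

```r
# Locate the Quarto executable; Sys.which() returns "" if it is not found
quarto_path <- unname(Sys.which("quarto"))

# Ask Quarto for its version, or record NA if it is not available
quarto_version <- if (nzchar(quarto_path)) {
  system2(quarto_path, "--version", stdout = TRUE)[1]
} else {
  NA_character_
}

# Collect the versions of R and Quarto in a named character vector
versions <- c(R = as.character(getRversion()), Quarto = quarto_version)
print(versions)
```

This only reports what is installed locally; it does not replace the files created by quarto use binder.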

You are now all set up to prepare your project for sharing!


References

Brown, L., Crusoe, M. R., Miļajevs, D., & Romanowska, I. (2016). Should you cite this particular piece of software? https://mr-c.github.io/shouldacite/. https://doi.org/10.5281/zenodo.2559142
Chacon, S., & Straub, B. (2024). Removing a file from every commit. In Pro Git (Second edition). Apress. https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#_removing_file_every_commit
Fowler, M., Beck, K., Opdyke, W., & Roberts, D. (1999). Refactoring: Improving the design of existing code. Addison-Wesley.
Marwick, B., Boettiger, C., & Mullen, L. (2018). Packaging data analytical work reproducibly using R (and friends). The American Statistician, 72(1), 80–88. https://doi.org/10.1080/00031305.2017.1375986
Mineault, P., & Nozawa, K. (2021, December 21). The good research code handbook. https://goodresearch.dev/. https://doi.org/10.5281/ZENODO.5796873
Project TIER. (2021). TIER protocol 4.0. https://www.projecttier.org/tier-protocol/protocol-4-0/
Publication manual of the American Psychological Association (7th ed.). (2020). American Psychological Association. https://doi.org/10.1037/0000165-000
Raymond, E. S. (2003). The art of UNIX programming: With contributions from thirteen UNIX pioneers, including its inventor, Ken Thompson. Addison-Wesley. https://www.arp242.net/the-art-of-unix-programming/
Rossum, G. van, Warsaw, B., & Coghlan, A. (2013, August 1). Style guide for Python code. Python enhancement proposals (PEPs). https://peps.python.org/pep-0008/
Smith, A. M., Katz, D. S., Niemeyer, K. E., & FORCE11 Software Citation Working Group. (2016). Software citation principles. PeerJ Computer Science, 2, e86. https://doi.org/10.7717/peerj-cs.86
Starr, J., Castro, E., Crosas, M., Dumontier, M., Downs, R. R., Duerr, R., Haak, L. L., Haendel, M., Herman, I., Hodson, S., Hourclé, J., Kratz, J. E., Lin, J., Nielsen, L. H., Nurnberger, A., Proell, S., Rauber, A., Sacchi, S., Smith, A., … Clark, T. (2015). Achieving human and machine accessibility of cited data in scholarly publications. PeerJ Computer Science, 1, e1. https://doi.org/10.7717/peerj-cs.1
UK Government Analytical Community. (2020). Quality assurance of code for analysis and research. https://best-practice-and-impact.github.io/qa-of-code-guidance/
Vilhuber, L. (2021, April 8). Preparing your files for verification. Office of the AEA data editor. https://aeadataeditor.github.io/aea-de-guidance/preparing-for-data-deposit
Vuorre, M., & Crump, M. J. C. (2021). Sharing and organizing research products as R packages. Behavior Research Methods, 53(2), 792–802. https://doi.org/10.3758/s13428-020-01436-x
Wickham, H. (2022, July 24). The tidyverse style guide. https://style.tidyverse.org/
Wickham, H. (2023, November 20). Tidy design principles. https://design.tidyverse.org/
Wickham, H., & Bryan, J. (2023). When should you take a dependency? In R packages (Second edition). O’Reilly Media. https://r-pkgs.org/dependencies-mindset-background.html#sec-dependencies-pros-cons
Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., & Teal, T. K. (2017). Good enough practices in scientific computing. PLOS Computational Biology, 13(6), 1–20. https://doi.org/10.1371/journal.pcbi.1005510

Footnotes

  1. For example, using Amnesia, ARX, sdcMicro, or Synthpop.

  2. In June 2024, version 2 of the frictionless data standard was released. As of November 2024, the R package frictionless only supports the first version, though support for v2 is planned.

  3. An application programming interface provides the capability to interact with other software using a programming language.

  4. You can use the function pak::pkg_deps() to count the total number of package dependencies in R.

  5. A linting tool analyzes your code without actually running it. Therefore, this process is also called static code analysis.

  6. Note that this automatic detection can miss packages in some circumstances. Therefore, always verify the rendered result.

  7. Either using repo2docker or the public binder service.