Project Setup

This section guides you through setting up your R environment for data documentation and validation. You’ll create a reproducible project structure using Quarto (a tool for creating documents that combine code and text) and renv (a tool for managing R packages) that supports open science practices.

You might wonder: “I just want to learn about data dictionaries and validation. Why do I need Quarto, Git, and renv?”

These tools help you create reproducible, shareable documentation that follows open science best practices.

What each tool does:

  • Quarto: Lets you write documents that mix explanatory text with R code, then automatically generate professional HTML/PDF reports. Your data dictionary and validation results will live in one cohesive document.
  • Git: Tracks changes to your files over time. If something breaks, you can go back. When collaborating, everyone can see what changed.
  • renv: Records which package versions you used. This ensures your code still works months later (and works for collaborators).

The payoff: After this one-time setup, you’ll be able to generate beautiful, reproducible data documentation with a single click. Your future self and collaborators will thank you.

Create Quarto Project

First, we will need to create a new Quarto project.

If you haven’t already, open RStudio – see Note 1 for how to use the terminal instead. Then, click on File > New Project… to open the New Project Wizard.

Here, select New Directory.

And choose the project type Quarto Project.

Finally, enter the name of the directory in which the report will be created, for example data-documentation-validation-exercise.

As we will use Git to track the version history of files, be sure to check Create a git repository. If you don’t know what Git is, have a look at the tutorial “Introduction to version control with git and GitHub within RStudio”.

renv: A dependency management toolkit for R

We will also use the renv package to track the R packages our project depends on. This makes it easier for others to see which packages are required and to install exactly the same versions later. Therefore, make sure that the box Use renv with this project is checked. Again, if this is the first time you are hearing about renv, have a look at the tutorial “Introduction to {renv}”.

If you are already familiar with Markdown and Quarto, you can uncheck the box Use visual markdown editor.

Click on Create Project. Your RStudio window should now look similar to this:

The project `data-documentation-validation-exercise` opened in RStudio. The source pane to the top left has a Quarto file open called "data-documentation-validation-exercise.qmd". The console pane to the bottom left indicates by its output that renv is active in the current project. The environment pane to the top right indicates that the environment is currently empty. The output pane to the bottom right shows the files in the current project.

If, like in the image, a Quarto file with some demo content was opened automatically, you can close and delete it, for example, using RStudio’s file manager.

Throughout this tutorial, you will need to run both R code and system commands (primarily git and quarto). Within RStudio, R code is run in the Console tab, while system commands are run in the Terminal tab. We indicate where to run each code snippet directly above it. If no indication is given, the code is only for demonstration purposes and does not need to be run.

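The renv package tracks which R packages your project uses. As you install packages throughout this tutorial (in the next section), renv places them in a project-specific library. Their versions are written to the lockfile (renv.lock) when renv::snapshot() is run, which is not done automatically unless renv is configured to do so. You don’t need to do anything special right now.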

Later in your work, before committing code to Git, you can run renv::status() to check if everything is synchronized. For this tutorial, we’ll remind you when it’s time to use renv commands.
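The check mentioned above can also be run from the Terminal tab. The following is a minimal sketch, not part of renv or RStudio: check_renv_status is a hypothetical helper name, and the sketch assumes that Rscript is on your PATH and that the current directory is the project root, so that the project's .Rprofile activates renv.

```shell
# Sketch of a pre-commit check; check_renv_status is a hypothetical helper name.
# Assumes the current directory is the project root, so that the project's
# .Rprofile activates renv when Rscript starts.
check_renv_status() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found; run renv::status() in the R console instead" >&2
    return 1
  fi
  # Prints whether the project library and renv.lock are in sync.
  Rscript -e 'renv::status()'
}
```

After defining the function, calling check_renv_status before git commit prints the same report as renv::status() in the Console. If the report says the lockfile is out of date, running renv::snapshot() in the Console updates renv.lock, which can then be committed together with your other changes.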

Note 1: Without RStudio, you can create a Quarto project with version control and renv enabled by typing the following into a terminal:

Terminal
quarto create project default data-documentation-validation-exercise
cd data-documentation-validation-exercise/
rm data-documentation-validation-exercise.qmd
git init
git checkout -b main

Then, one can open an R session by simply typing R into the terminal. Next, make sure that getwd() indicates that the working directory is data-documentation-validation-exercise. Now, initialize renv:

Console
renv::init()

Checkpoint: Verify Your Project Setup

Before moving to the next section, verify your setup is working:

  1. Check Git (run in the Terminal tab) - Should show that this is a Git repository

    git status

    Expected output: Shows branch name (usually “main”) and current status

  2. Check working directory (run in the Console tab) - Should end with your project name

    getwd()

    Expected output: Path ending in your project folder name (e.g., data-documentation-validation-exercise)

If any of these fail, review the steps above or ask for help before continuing. The renv setup will be verified in the next section when you install packages.
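As a complement to the two checks above, the presence of the files that the setup is expected to create can be checked in one go. This is a minimal sketch under stated assumptions: check_project_setup is a hypothetical helper name, and the file list (.git, _quarto.yml, renv.lock, renv/activate.R) reflects what Git, Quarto, and renv::init() typically create, which may differ between versions.

```shell
# Sketch: report which expected project files are present.
# check_project_setup is a hypothetical helper, not part of Quarto, Git, or renv.
check_project_setup() {
  dir="${1:-.}"   # directory to check; defaults to the current directory
  missing=0
  for path in .git _quarto.yml renv.lock renv/activate.R; do
    if [ -e "$dir/$path" ]; then
      echo "ok       $path"
    else
      echo "missing  $path"
      missing=1
    fi
  done
  # Exit status 0 if everything was found, 1 otherwise.
  return "$missing"
}

# Run from the project root (Terminal tab) and print a one-line summary.
check_project_setup . && echo "Setup looks complete" || echo "Some files are missing"
```

A missing entry only tells you that a file is absent, not why. In that case, the corresponding step above (Create a git repository, Quarto Project, or Use renv with this project) is the place to look.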

Next Steps

Your project is now set up with Quarto, Git, and renv! In the next section, you’ll install the R packages needed for data documentation and get familiar with the Palmer Penguins dataset.
