Welcome

We welcome you to the FAIR research data management workshop. This interactive course is centered around the FAIR principles and their practical application in managing your research data efficiently. FAIR is an acronym for Findable, Accessible, Interoperable, and Reusable.

The FAIR data principles state that research data should be easy to find, should include information about how to access them, should be compatible with other data, and should be possible to reuse.1 For instance, well-organized and documented data are easier for humans and machines to find and understand, which enables better reuse. If the FAIR principles are new to you, take a minute to check this comprehensive overview of the GO FAIR initiative before starting this workshop.

FAIR principles and Open data

Please note that FAIR and open are not synonymous. Some datasets fulfill the FAIR principles but are not open data, because economic or legal constraints (e.g. patents) may prevent them from being publicly available. For FAIR data, the conditions and ways to access the data must be clearly defined.2 On the other hand, there are numerous datasets that are publicly available but do not comply with the FAIR principles, as you will most likely see in the course of this workshop.

Who is this workshop for?

This workshop was designed for early career researchers who are either about to start managing their empirical research data or want to manage them more effectively. Although it was developed with this target group in mind, it will likely benefit many other researchers who are also consolidating their data management skills.

What is the purpose of this workshop?

The value of adequate data management usually becomes apparent when you look into someone else’s data folder:

Piled Higher and Deeper by Jorge Cham

However, structures like this evolve over time, and especially at the beginning of a project it is hard to imagine ending up with such a chaotic folder. More advanced participants can probably think of one of their own folders that came to look similar, the only difference being that you navigate your own structures better than others can.
This workshop will fast-track your experience of navigating other researchers’ data and point out good and bad practices that we hope will inspire your own data management. It also highlights the importance of planning the management of your data very early in a project to benefit most from it: well-organized and documented data are more valuable and accessible not only for the wider research community but also for your future self, for example when you return to your data folder for a revision a year later.

Benefits

The trend towards transparent and reproducible research is growing, and more and more journals require researchers to publish the data on which their research results are based. The effort of publishing these data is negligible if you have a good data management strategy in place right from the beginning. Publishing FAIR data, i.e., data that are of actual use to others, also comes with numerous scientific benefits:

  • Data can be reused and cited in a timely manner. Citation of the data by other researchers increases visibility and can strengthen the reputation of your research.
  • New scientific collaborations may arise through data published according to FAIR, i.e., in a reusable format. Collecting data is a time-consuming business, so other researchers might as well benefit from already existing data (collected with public money), and you receive credit for producing the data.
  • Publishing data on which research results are based (in addition to the ‘traditional’ scientific articles) will support the credibility, reproducibility, and validity of the results. This increases public trust in science.

Additional benefits are displayed in this graphic.

Source: Hole, B. (2015). Open Science: A New Publisher Perspective. Ubiquity Press. (CC BY 4.0 International)

How to?

When setting up this workshop, we quickly agreed that data management is best understood through learning by doing. For this reason, the whole workshop revolves around four published datasets.

It is your job to investigate one of these datasets with a special focus on data organization, documentation, and publication aspects. We will conclude with an introduction to data management plans, a helpful planning tool for complying with the FAIR principles. In that final section, we will also introduce important aspects of data storage.
The topics are accompanied by distinct boxes that are color-coded by content:

Orange boxes contain information crucial for that topic
Blue boxes contain excursions to related topics
Green boxes contain hands-on exercises
Red boxes contain the solutions and are collapsed

Only open the box if you want to see the solution!

We recommend that you look at the hands-on exercises first and see whether you already know their solutions. If things are unclear, you may return to the text anytime, but be aware that it is not necessary (and time will probably not permit) to read every paragraph and every box very carefully. The materials will, however, remain available, so you can always go back and reread more carefully!