Overview 👋

Welcome!

This website hosts the slides and other resources for my talks on research data management, reproducibility and beyond.

The current version of the slides can be found in the main section, and previous versions in the archive section. You can find the GitHub repository with all the source code here. All contents are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license (for details, see here and here). If you notice any issues or have suggestions for improvement, I would be glad to hear from you! Please open an issue on GitHub or contact me via email. Thanks!

Main ✨

The current version of the slides is available below:

Archive 📚

Open Science Initiative (OSIP) at the Department of Psychology at TU Dresden

DOI

The following slides were presented during a talk prepared for the Open Science Initiative at the Department of Psychology (OSIP) at Technische Universität Dresden, Germany, on the 18th of January 2023.

Slides

5th RDM-Workshop 2022 on Research Data Management in the Max Planck Society

DOI PDF

The following slides were presented during a talk prepared for the 5th RDM-Workshop 2022 on Research Data Management in the Max Planck Society, held at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig, Germany, from the 13th to the 14th of September 2022. The talk was held in German (but the slides were in English).

Abstract

Preliminary title: Research as Software Engineering - A Solution for Everything?

Final title: Tools for an open and reproducible research process

Insufficient reproducibility, a lack of transparency and inefficient workflows: many scientific disciplines have a great need for innovation. In addition, the research process is becoming increasingly digital, computational and interactive. In my talk, I argue that technical solutions to these challenges already exist and can be found in professional software development. These include, above all, tools for interactive version control of data, such as Git or DataLad, and software containers, such as Docker. I discuss how these tools transform scientific workflows and which (primarily non-technical) hurdles stand in the way of their implementation. Similar previous talks can be found at https://lennartwittkuhn.com/talk-rdm/.

Lifespan Neural Dynamics Group

DOI PDF

The following slides were presented during a talk prepared for the Lifespan Neural Dynamics Group (LNDG) at the Max Planck Institute for Human Development on the 20th of October 2021.

Slides

Open Science Ambassadors Day 2021

DOI PDF

The following slides were presented during a talk prepared for the “Open Science Ambassadors Day 2021” hosted by the Max Planck PhDnet and the Max Planck Digital Library (see details here) on the 18th of October 2021.

Max Planck Digital Library

DOI

The following slides were presented during a talk prepared for the “Discussion Series: Human Research Data in Practice” hosted by the Max Planck Digital Library (see details here) on the 22nd of June 2021.

Slides

Abstract

Title: “Towards a workflow for open and reproducible fMRI studies”

Achieving computational reproducibility and accessible data sharing can be challenging, in particular for neuroimaging research that involves large amounts of heterogeneous data and code. Here, we showcase a workflow that combines several software tools to allow reproducibility and transparent sharing of code and data of a human fMRI study.

We recently published an open-access paper (Wittkuhn & Schuck, 2021, Nature Communications) together with the code, data and computational environments needed to reproduce the reported results. We shared more than 10 datasets via GIN (G-Node Infrastructure) as modular version-controlled units, including fMRI data organized in BIDS format and derived data, such as pre-processed fMRI data and data quality metrics.

Research data was version-controlled using DataLad. Following the DataLad YODA principles, we nested datasets as modular units, which allowed us to better establish data provenance, i.e., a clear overview of which code used which input data to produce which output data. Code that reproduced the analyses was integrated with additional documentation using RMarkdown notebooks. The notebooks were automatically executed using continuous integration on GitLab. In this process, data was retrieved from GIN using DataLad, and the notebooks were rendered and deployed to a website (https://wittkuhn.mpib.berlin/highspeed/). Code execution was performed using software containers (Docker and Singularity) and virtual environments, which made it possible to reproduce the computational environment.
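To make the idea of data provenance more concrete, here is a minimal Python sketch of a provenance record that links code, input data and output data. It is an illustration only and not how DataLad works internally or how the study was implemented. The names `ProvenanceRecord` and `lineage`, as well as all file paths and commands, are hypothetical and chosen for this example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def checksum(content: bytes) -> str:
    """Return a SHA-256 hex digest that identifies one version of a file."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class ProvenanceRecord:
    """One analysis step: which code turned which inputs into which outputs."""
    code: str                # command that was run (hypothetical)
    inputs: Dict[str, str]   # input path -> checksum
    outputs: Dict[str, str]  # output path -> checksum


def lineage(path: str, records: List[ProvenanceRecord]) -> List[str]:
    """Return the commands that led to `path`, earliest step first.

    Assumes the records contain no circular dependencies.
    """
    steps: List[str] = []
    for record in records:
        if path in record.outputs:
            # First trace everything this step depended on ...
            for input_path in record.inputs:
                steps.extend(lineage(input_path, records))
            # ... then add the step itself.
            steps.append(record.code)
    return steps


# Hypothetical two-step pipeline; all paths, commands and contents are made up.
raw = checksum(b"raw fMRI data")
preprocessed = checksum(b"preprocessed fMRI data")
records = [
    ProvenanceRecord(
        code="python code/preprocess.py",
        inputs={"inputs/bids/sub-01_bold.nii.gz": raw},
        outputs={"outputs/preproc/sub-01_bold.nii.gz": preprocessed},
    ),
    ProvenanceRecord(
        code="python code/decode.py",
        inputs={"outputs/preproc/sub-01_bold.nii.gz": preprocessed},
        outputs={"outputs/decoding/sub-01_accuracy.csv": checksum(b"accuracy table")},
    ),
]

print(lineage("outputs/decoding/sub-01_accuracy.csv", records))
```

Calling `lineage` on the final output lists the preprocessing step before the decoding step, which is the kind of overview that provenance tracking is meant to provide. In practice, a tool such as DataLad captures comparable information alongside the version history, so that records like these do not have to be maintained by hand.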

Keywords: data sharing, reproducibility, open science, version-control, fMRI