Contents

  • Objectives
  • Exercises
  • Slides

Project & Data Organization

Starts at:

Thursday, 09:30

Objectives

💡 You understand the importance of well-structured data organization for research.
💡 You can design logical and intuitive folder structures.
💡 You can apply file naming best practices and unique identifiers.
💡 You understand ISO 8601 timestamps and proper sorting methods.
💡 You can choose appropriate file formats for preservation.
💡 You can implement effective document versioning strategies.
💡 You understand ASCII/UTF-8 encoding advantages for text files.
💡 You can identify and solve common file organization problems.

Exercises

Exercise 1: Compare project structure templates

Task: Compare the following project templates and discuss advantages and disadvantages of each approach.

Turing Way template

.
├── LICENSE
├── README.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── data
│   ├── processed      # Final, canonical datasets
│   └── raw            # Original, immutable data
├── docs               # Sphinx documentation
├── models             # Trained models, predictions, summaries
├── notebooks          # Jupyter notebooks (numbered)
├── reports            # Generated analysis (HTML, PDF, LaTeX)
│   └── figures        # Generated graphics and figures
├── project_management # Meeting notes, planning resources
└── src                # Source code
    ├── data           # Scripts to download/generate data
    ├── models         # Scripts to train models
    └── visualisation  # Scripts for visualizations

Repository Structure Template by The Turing Way. Used under the CC BY 4.0 license. Reused without any modifications.

Heidi Seibold template

.
├── README.md
├── analysis            # All things data analysis
│   └── src             # Functions and source files
├── comm
│   ├── internal_comm   # Internal communication, meeting notes
│   └── journal_comm    # Communication with journal, peer review
├── data
│   ├── data_clean      # Clean version of data
│   └── data_raw        # Raw data (don't touch)
├── dissemination
│   ├── manuscripts
│   ├── posters
│   └── presentations
├── documentation       # Data management plan, etc.
└── misc                # Miscellaneous files

Research Project Template by Heidi Seibold. No license specified. For source code, see here. Reused without modifications.

analysistemplates template

.
├── 01_Data
│   ├── 01_Raw
│   └── 02_Clean
├── 02_Analysis
│   ├── 01_Scripts
│   ├── 02_Results
│   ├── 03_Figures
│   └── 04_Tables
├── 03_Manuscript
│   ├── 01_Text
│   └── 02_Final_figures
├── 04_Presentation
├── 05_Misc
├── 06_Analysis_for_publication    # Optional
├── README.md
├── .gitignore                     # Optional
└── renv                           # Optional

analysis template packages by Jonas Hagenbeck. Used under the MIT License. Reused without any modifications.

WORCS template

| File/Folder  | Description                     | Usage           |
|--------------|---------------------------------|-----------------|
| _pkgdown.yml | YAML for package website        | do not edit     |
| DESCRIPTION  | R-package DESCRIPTION           | do not edit     |
| LICENSE.md   | Project license                 | do not edit     |
| README.md    | Read this file to get started!  | do not edit     |
| README.Rmd   | R-markdown source for readme.md | human editable  |
| docs/        | Package website                 | machine-written |
| paper/       | WORCS paper source files        | human editable  |
| R/           | R-package source code           | human editable  |
| vignettes/   | R-package vignettes             | human editable  |

WORCS project structure by Van Lissa et al. (2021). Used under the GNU General Public License. No changes were made.

Exercise 2: Design your project folder structure

  1. Create a new directory called my-research-project in your home directory.
  2. Design and create a folder structure for your own research project or a fictional one, e.g., a study of “Effects of Social Media on Sleep Patterns in College Students”. Consider:
    • Where will you store raw survey data?
    • Where will you keep processed analysis results?
    • Where will you organize your documentation?
    • Where will you store your analysis scripts?
  3. Create at least 5-6 folders that reflect good organization principles from the slides.
  4. Navigate through your structure and verify all directories exist.
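
One way to work through steps 1 to 4 is a short Python script; the exercise can equally be done in a terminal or a file manager. This is a minimal sketch: the subfolder names are hypothetical choices rather than a layout prescribed by the slides, and the helpers `create_structure` and `missing_folders` are introduced here for illustration only. The example builds the project in a temporary directory; pass `Path.home()` instead to follow step 1 literally.

```python
from pathlib import Path
import tempfile

# Hypothetical layout for the fictional sleep study; adapt it to your project.
FOLDERS = [
    "data/raw",        # raw survey data (never edited)
    "data/processed",  # cleaned data
    "results",         # processed analysis results
    "docs",            # documentation
    "scripts",         # analysis scripts
    "manuscript",      # drafts of the paper
]


def create_structure(base: Path, name: str = "my-research-project") -> Path:
    """Create the project folder and its subfolders; return the project root."""
    root = base / name
    for folder in FOLDERS:
        # parents=True creates intermediate folders such as data/;
        # exist_ok=True makes the script safe to run again.
        (root / folder).mkdir(parents=True, exist_ok=True)
    return root


def missing_folders(root: Path) -> list[str]:
    """Return the expected folders that do not exist under root."""
    return [f for f in FOLDERS if not (root / f).is_dir()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = create_structure(Path(tmp))
        # List every directory that was created, relative to the project root.
        for path in sorted(p for p in root.rglob("*") if p.is_dir()):
            print(path.relative_to(root))
        print("missing:", missing_folders(root))
```

An empty list from `missing_folders` corresponds to the verification asked for in step 4.
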

Exercise 3: File naming practice

  1. In your project’s raw data folder, create the following files using good naming practices (remember: no spaces, use proper date formats, include leading zeros):
    • A survey data file from January 13, 2024
    • A survey data file from April 21, 2024
    • A survey data file from December 3, 2025
    • Sleep tracking data from participant 007
    • Sleep tracking data from participant 023
    • Sleep tracking data from participant 156
  2. List the files and verify they sort in logical order.
  3. Now create deliberately bad versions of these filenames in a separate bad-examples folder:
    • Use spaces in names
    • Use inconsistent date formats
    • Omit leading zeros
    • Use special characters
  4. Compare how the two folders look when listed.
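
The naming rules in this exercise can be sketched in Python. The file-name patterns below (`survey_<date>.csv`, `sleep-tracking_sub-<id>.csv`) and the helper names are assumptions for illustration, not a convention taken from the slides. The sketch formats dates as ISO 8601 (`YYYY-MM-DD`), pads participant IDs with leading zeros, and sorts both the good and the bad names alphabetically, which is how most file listings order them.

```python
from datetime import date


def survey_filename(day: date) -> str:
    """Return a survey file name with an ISO 8601 date (YYYY-MM-DD)."""
    return f"survey_{day.isoformat()}.csv"


def tracking_filename(participant: int, width: int = 3) -> str:
    """Return a sleep-tracking file name with a zero-padded participant ID."""
    return f"sleep-tracking_sub-{participant:0{width}d}.csv"


# Good names, deliberately listed out of order.
good = [
    survey_filename(date(2025, 12, 3)),
    survey_filename(date(2024, 1, 13)),
    survey_filename(date(2024, 4, 21)),
    tracking_filename(156),
    tracking_filename(7),
    tracking_filename(23),
]

# Bad counterparts: spaces, mixed date formats, no leading zeros.
bad = [
    "survey 3.12.25.csv",
    "Survey Jan 13 2024.csv",
    "survey 04-21-2024.csv",
    "sleep tracking 156.csv",
    "sleep tracking 7.csv",
    "sleep tracking 23.csv",
]

print(sorted(good))
print(sorted(bad))
```

In the sorted good list, the survey files appear in chronological order and the participants in numerical order. In the sorted bad list, participant 156 comes before 23 and 7, because the names are compared character by character.
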

Exercise 4: README documentation

  1. Create a README.md file in your main project directory.
  2. Document the following in your README:
    • Brief project description
    • Explanation of your folder structure
    • Your file naming convention rules
    • Data collection dates and methods
    • Contact information
  3. Include at least one example of your naming convention with explanation.
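
A README skeleton covering the points in step 2 could be generated with a few lines of Python. This is only a sketch: the section headings, the placeholder text, and the `write_readme` helper are hypothetical, and writing the file by hand in a text editor works just as well.

```python
from pathlib import Path
import tempfile

# Hypothetical section headings, one per point listed in the exercise.
SECTIONS = {
    "Project description": "TODO: one or two sentences about the project.",
    "Folder structure": "TODO: list each folder and what it contains.",
    "File naming convention": "TODO: state the rules and give one example.",
    "Data collection": "TODO: dates and methods.",
    "Contact": "TODO: name and email address.",
}


def write_readme(project_root: Path, title: str) -> Path:
    """Write a Markdown README skeleton and return its path."""
    lines = [f"# {title}", ""]
    for heading, placeholder in SECTIONS.items():
        lines += [f"## {heading}", "", placeholder, ""]
    readme = project_root / "README.md"
    # Explicit UTF-8 keeps the file portable across operating systems.
    readme.write_text("\n".join(lines), encoding="utf-8")
    return readme


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        readme = write_readme(Path(tmp), "My Research Project")
        print(readme.read_text(encoding="utf-8"))
```
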

Exercise 5: File format decisions

  1. Create a documentation folder in your project.
  2. In this folder, create files representing different types of documentation using appropriate file formats:
    • A data dictionary/codebook
    • A research protocol document
    • A list of participant information
    • Analysis notes
  3. Use the recommended file formats from the slides (.txt, .csv, .md, etc.).
  4. Add a comment in each file explaining why you chose that format.
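
As one illustration of steps 2 and 3, a data dictionary can be stored as a plain CSV file written with explicit UTF-8 encoding. The variables and descriptions below are invented for the fictional sleep study, and the file name `data-dictionary.csv` and both helper functions are assumptions. CSV has no standard comment syntax, so in this sketch the reason for choosing the format would have to be recorded elsewhere, for example in the README.

```python
import csv
from pathlib import Path
import tempfile

# Invented variables for the fictional sleep study.
CODEBOOK = [
    {"variable": "participant_id", "type": "string", "description": "Zero-padded ID, e.g. 007"},
    {"variable": "survey_date", "type": "date", "description": "ISO 8601 date (YYYY-MM-DD)"},
    {"variable": "sleep_hours", "type": "float", "description": "Self-reported sleep per night (h)"},
]


def write_codebook(folder: Path) -> Path:
    """Write the codebook as a UTF-8 CSV file and return the file path."""
    path = folder / "data-dictionary.csv"
    # newline="" is what the csv module expects for files it writes.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["variable", "type", "description"])
        writer.writeheader()
        writer.writerows(CODEBOOK)
    return path


def read_codebook(path: Path) -> list[dict[str, str]]:
    """Read the codebook back as a list of row dictionaries."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in read_codebook(write_codebook(Path(tmp))):
            print(row)
```

Reading the file back with the same encoding is a quick check that the content survives a round trip, which is part of what makes plain-text formats attractive for preservation.
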

Exercise 6: Versioning practice

  1. Create a file called analysis-script.R in your scripts folder.
  2. Add some sample R code (even if basic) to the file.
  3. Create 3 versions of this file using proper versioning conventions:
    • An initial draft version
    • A revised version with minor changes
    • A major revision with significant updates
  4. Practice both numbering and date-based versioning approaches.
  5. Document your versioning system in a versioning-notes.txt file.
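
The two naming schemes in step 4 can be sketched as small helper functions. Both helpers and the exact patterns (a `_v<major>-<minor>` suffix and an ISO 8601 date prefix) are illustrative assumptions rather than the convention from the slides; only the file name `analysis-script.R` comes from step 1, and the dates are arbitrary examples.

```python
from datetime import date
from pathlib import Path


def numbered_version(filename: str, major: int, minor: int) -> str:
    """Insert a version suffix such as _v1-0 before the file extension."""
    path = Path(filename)
    return f"{path.stem}_v{major}-{minor}{path.suffix}"


def dated_version(filename: str, day: date) -> str:
    """Prefix the file name with an ISO 8601 date so versions sort by time."""
    return f"{day.isoformat()}_{filename}"


# Initial draft, minor revision, major revision.
numbered = [
    numbered_version("analysis-script.R", 0, 1),
    numbered_version("analysis-script.R", 0, 2),
    numbered_version("analysis-script.R", 1, 0),
]

# The same three versions, identified by the (example) day they were saved.
dated = [
    dated_version("analysis-script.R", d)
    for d in (date(2024, 1, 13), date(2024, 4, 21), date(2025, 12, 3))
]

print(numbered)
print(dated)
```

Single-digit version numbers stop sorting correctly once a component reaches 10; zero-padding the numbers, as with participant IDs, avoids that.
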

Exercise 7: Challenges & Solutions

Challenges & solutions in research project organization

Reflect on your own experience organizing research projects and discuss with the group:

Common challenges:

  • What problems have you encountered in your own projects?
  • What made it hard to find files or understand the structure later?
  • Have you ever lost work or wasted time due to poor organization?
  • What mistakes did you or your team make that you would avoid now?

Potential solutions & what helps:

  • What strategies or conventions have worked well for you?
  • Which tools or systems do you rely on to keep projects organized?
  • What would you recommend to a colleague starting a new project?
  • What is one change you plan to make to your current workflow?

Slides

How can I download the slides as a PDF file?

To export the slides to PDF, do the following:

  1. Toggle into Print View using the E key (or using the Navigation Menu).
  2. Open the in-browser print dialog (CTRL/CMD+P).
  3. Change the Destination setting to Save as PDF.
  4. Change the Layout to Landscape.
  5. Change the Margins to None.
  6. Enable the Background graphics option.
  7. Click Save.

Note: This feature has been confirmed to work in Google Chrome, Chromium, and Firefox.

These instructions were copied from the Quarto documentation (MIT License) and slightly modified.

© 2026 Dr. Lennart Wittkuhn

License: CC BY 4.0