Session 1: Introduction to Version Control

Track, organize and share your work: An introduction to Git for research

Course at University of Hamburg & Erasmus University Rotterdam

License: CC BY 4.0

October 18 2024 (10:15 am)

1 Logistics and admin

Team

Teaching Assistant

Reza Hakimazar

reza.hakimazar@uni-hamburg.de

Teaching Collaborator

Dr. Ana Martinovici

martinovici@rsm.nl
anamartinovici.com

International Teaching Collaboration

A new funding line allows researchers at University of Hamburg to develop virtual teaching […] formats with international partner universities. Students can also gain international experience without going abroad.

Rotterdam, The Netherlands

Zairon, CC BY-SA 4.0, via Wikimedia Commons

Research on “Cognitive Neuroscience of Learning & Change”

How does the brain use past experience to guide future decisions?


taken from Lake et al. (2016): “Building machines that learn and think like people”

Find out more on our group’s website: https://schucklab.gitlab.io/

Key research areas

  • Representation of task states
  • Memory reactivation and decision-making
  • The influence of task irrelevant information on decision-making
  • The effect of aging, genes and disease on (spatial) memory and learning

Research profile of Dr. Ana Martinovici

Background

  • PhD in Marketing (Tilburg University)
  • MSc in Econometrics and Mathematical Economics (Tilburg University)

Interests

  • (Visual) Attention
  • Privacy and data protection
  • Open Science

Experience with

  • Eye tracking data
  • Choice modeling
  • R, Stan, Git

Who are you?

  1. Your name?
  2. Your preferred pronouns?
  3. One (fun) fact about you? For example:
    • What did you study before and where?
    • What do you expect from this course?
    • What’s your hobby?
    • Do you have a pet?
    • What’s your favorite color?
    • Your mood on rubber duck scale?

Mood on a rubber duck scale.

Course overview

  • Event: Seminar
  • Credits: 4.0 (2 SWS)
  • Language: English
  • Tag: PsyM14-PsyWB-K01

What will the average seminar session look like?

The course will consist of up to 14 sessions (90 minutes each).

  1. Content Review (up to 30 minutes):
    Course participants engage with the online materials (aka our “Version Control Book”), supplemented by short presentations by the instructors. Some course preparation may occur outside of class.
  2. Interactive Discussions & Quizzes (up to 15 minutes):
    Course participants collectively address any questions related to the session’s content and online materials. Instructor-led quiz questions may also be interspersed throughout.
  3. Exercises & Implementation (up to 60 minutes):
    Course participants work on hands-on exercises and assignments.

Logistics

  • You need a laptop. Talk to us if you don’t have a laptop.

Note that course participants are sometimes required to work on course materials outside of class time.

Not all course contents will be covered during class time.

Schedule

No | Date | Title | Contents | Reading | Survey/Quiz
---|------|-------|----------|---------|------------
1 | 2024-10-18 | Introduction to version control | Organizational matters; Overview of seminar sessions; Introduction to version control; Introduction to Git and its advantages | Intro to version control | Course introduction Survey
2 | 2024-10-25 | Command line | File Systems; Benefits of the Command Line; Basic Command Line commands | Command Line | Command Line Quiz
3 | 2024-11-01 | Setup + Git Fundamentals | Installation and configuration of Git; Initializing a Git repository; Basic Git commands | Installation, Setup, First steps with Git | Installation Survey, Git Basics Quiz
4 | 2024-11-08 | Basic Git workflow | Practicing basic Git commands; Ignoring files with .gitignore; Good commit messages | First steps with Git | Git Basics Quiz
5 | 2024-11-15 | Git Essentials (Repetition & Practice) | Practicing basic Git commands; Ignoring files with .gitignore; Good commit messages | Git Essentials | Git Basics Quiz
6 | 2024-11-22 | Git Branching and Merging | Understanding branches in Git; Creating and switching between branches; Merging branches; Resolving merge conflicts | Branches | Git Branches Quiz
7 | 2024-11-29 | Quarto Workshop | Introduction to Quarto | |
8 | 2024-12-06 | Introduction to GitHub | Introduction to remote repositories; Creating a GitHub account; Creating and managing repositories on GitHub; Pushing and pulling changes | GitHub Intro | GitHub Quiz
9 | 2024-12-13 | GitHub with collaborators | Cloning a remote repository; Branching and merging in a collaborative environment; Pull Requests; GitHub Issues; Graphical User Interfaces (GUIs), e.g., GitKraken | GitHub Intro, GitHub Issues | GitHub Quiz
10 | 2024-12-20 | Repetition and Practice | Repetition and Practice | |
11 | 2025-01-10 | GitHub with the world | Forking a remote repository; README files; Project Management | GitHub Intro, GitHub Issues | GitHub Quiz
12 | 2025-01-17 | Publishing | Creating Tags with Git; Creating Releases with GitHub; Using Zenodo for scientific publishing; Licences; Citation Files | Tags and Releases |
13 | 2025-01-24 | Graphical User Interfaces | Repetition and Practice; Introduction to using GUIs | Graphical User Interfaces |
14 | 2025-01-31 | Summary & Wrap-Up | Course evaluation; Repetition and Practice; Introduction to using GUIs | Graphical User Interfaces |

Course Website

https://lennartwittkuhn.com/version-control-course-uhh-eur-ws24

Version Control Book

https://lennartwittkuhn.com/version-control-book/

Exercises, quizzes & surveys

  • We use online surveys to ask you questions and implement exercises or quizzes
  • Implemented in the formr survey framework (open-source, hosted in Germany)

Anonymity & data usage

  • all raw data are kept anonymous and will only be used for research and educational purposes
  • if responses are shared as part of the course, they will be aggregated to ensure anonymity is maintained
  • if you want your data to be deleted after the course, send an email with your personal codeword to sekretariat-luv.psych@uni-hamburg.de (Christine Manor). Our secretary forwards your codeword to us (without your name).

Your role: Questions and communication

Questions & discussions during class time

  • Ask questions! There are no stupid questions!
  • Participate in the discussions
  • We can use Zoom Breakout Rooms to address individual questions

Questions & discussions outside of class time

Your role: Active participation

  • This is a pass / fail course. You pass if you fulfill all course requirements:
  • Requirement 1: Come to at least 12 out of 14 sessions (85%)
  • Requirement 2: Complete all surveys/quizzes
  • Requirement 3: Complete all mandatory exercises (implemented in Git)

How do we verify the course requirements?

  1. Requirement 1: Sign the attendance list
    • We take screenshots of the Zoom participants list
    • University of Hamburg: You may need to sign the attendance list (at the end of the semester; TBA)
  2. Requirement 2: Provide a personal codeword. At the end of the semester, send an email with your personal codeword to sekretariat-luv.psych@uni-hamburg.de (Christine Manor). We will send our secretary a list of personal codewords and she will return a list of names.
  3. Requirement 3: At the end of the semester, send a link to your completed exercises.

Course exercise: Building an online city guide

https://lennartwittkuhn.com/uhh-eur-city-guide/

Code of Conduct

During this course, we want to ensure a safe, productive, and welcoming environment for everyone who attends. All participants and speakers are expected to abide by this code of conduct. We do not tolerate discrimination or harassment in any form or by any means. If you experience harassment or hear of any incidents of unacceptable behavior, please reach out to the course instructor, Lennart Wittkuhn (lennart.wittkuhn@uni-hamburg.de), so that we can take the appropriate action.

Unacceptable behavior is defined as:

  • Harassment, intimidation, or discrimination in any form, verbal abuse of any attendee, speaker, or other person. Examples include, but are not limited to, verbal comments related to gender, sexual orientation, disability, physical appearance, body size, race, religion, national origin, inappropriate use of nudity and/or sexual images in public spaces or in presentations, or threatening or stalking.
  • Disruption of presentations throughout the course. We ask all participants to comply with the instructions of the speaker with regard to dedicated discussion space and time.
  • Participants should not take pictures of any activity in the course room without asking all involved participants for consent and receiving this consent.

A first violation of this code of conduct will result in a warning; subsequent violations by the same person can result in immediate removal from the course without further warning. The organizers also reserve the right to bar excluded participants from similar future workshops, courses or meetings they organize.

2 Survey results

3 Introduction to version control

Learning objectives

At the end of this session, you should be able to answer the following questions and / or achieve the following learning objectives:

💡 You know what version control is.
💡 You can argue why version control is useful (for research).
💡 You can name benefits of Git compared to other approaches to version control.
💡 You can explain the difference between Git and GitHub.

Your turn

  1. Read Chapter 1: “Introduction to Version Control” in the Version Control Book.
  2. Introduce yourself to your course partner.
  3. Discuss the learning objectives with your course partner.

The issue of computational reproducibility in science

“… when the same analysis steps performed on the same dataset consistently produce the same answer.” 1
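
The definition above can be illustrated with a minimal sketch. The function name `bootstrap_mean`, the seed value and the example data are all made up for illustration and are not part of the course materials. The sketch assumes a small analysis that involves random resampling, and shows that fixing the random seed makes the same steps on the same data return the same answer.

```python
import random
import statistics


def bootstrap_mean(data, n_resamples=1000, seed=2024):
    """Estimate the mean of `data` by averaging over bootstrap resamples."""
    # A fixed seed makes the "random" resampling repeatable.
    rng = random.Random(seed)
    resample_means = [
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]
    return statistics.fmean(resample_means)


# Made-up example data (hypothetical reaction times in milliseconds).
reaction_times = [512, 430, 610, 480, 555, 470]

first_run = bootstrap_mean(reaction_times)
second_run = bootstrap_mean(reaction_times)
print(first_run == second_run)  # True: same steps + same data -> same answer
```

Passing `seed=None` would draw on system randomness instead, so repeated runs would most likely give slightly different numbers. Recording choices like the seed together with the code is one small part of making an analysis reproducible.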

by Scriberia for The Turing Way Community (2022) (Link, CC BY 4.0)

The problem

  • more than half of research is not reproducible 2
    • research data, code, software & materials are often not available “upon reasonable [sic] request”
    • if resources are shared, they are often incomplete
  • 90% of surveyed researchers agree that there is a “reproducibility crisis” (N = 1576) 3

Why?

  • computational reproducibility is hard
  • researchers lack training
  • incentives are not (yet) aligned 4
  • “natural selection of bad science” 5

… accumulated evidence indicates […] substantial room for improvement with regard to research practices to maximize the efficiency of the research community’s use of the public’s financial investment. (Munafò et al., 2017)

We need a professional toolkit for digital research!

Why we need version control …

… for code (text files) © Jorge Cham (phdcomics.com)

… for data (binary files) © Jorge Cham (phdcomics.com)

When everything is relevant …

… track everything.

What is version control?

“Version control is a systematic approach to record changes made in a […] set of files, over time. This allows you and your collaborators to track the history, see what changed, and recall specific versions later […]” (Turing Way)

  • keep track of changes in a directory (a “repository”)
  • take snapshots (“commits”) of your repo at any time
  • know the history: what was changed when by whom
  • compare commits and go back to any previous state
  • work on parallel “branches” & flexibly “merge” them
  • “push” your repo to a “remote” location & share it
  • share repos on platforms like GitHub or GitLab
  • work together on the same files at the same time
  • others can read, copy, edit and suggest changes 7
  • make your repo public and openly share your work
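
To make the first few ideas concrete (repository, commits, history, comparing and restoring states), here is a toy sketch. It is not how Git is implemented and does not use Git at all. The class `ToyRepo`, its methods and the file names are hypothetical and exist only for illustration. The sketch assumes a small directory of files that fits in memory.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


class ToyRepo:
    """A toy stand-in for a repository (an illustration, not Git itself)."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.commits = []  # ordered history, oldest first

    def _snapshot(self):
        # Read every file below the directory: relative path -> file content.
        return {
            path.relative_to(self.directory).as_posix(): path.read_bytes()
            for path in sorted(self.directory.rglob("*"))
            if path.is_file()
        }

    def commit(self, author, message):
        # Take a snapshot and label it with a short hash of its contents.
        files = self._snapshot()
        digest = hashlib.sha1()
        for name, content in files.items():
            digest.update(name.encode("utf-8"))
            digest.update(content)
        entry = {
            "id": digest.hexdigest()[:7],
            "author": author,
            "message": message,
            "time": datetime.now(timezone.utc),
            "files": files,
        }
        self.commits.append(entry)
        return entry["id"]

    def log(self):
        # The history: which snapshot, by whom, and with which message.
        return [(c["id"], c["author"], c["message"]) for c in self.commits]

    def changed_files(self, old_index, new_index):
        # Compare two snapshots: files that were added, removed or edited.
        old = self.commits[old_index]["files"]
        new = self.commits[new_index]["files"]
        return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))

    def restore(self, index):
        # Go back to an earlier state by writing the stored contents to disk.
        # (Files that were added later are left untouched in this sketch.)
        for name, content in self.commits[index]["files"].items():
            target = self.directory / name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(content)


with tempfile.TemporaryDirectory() as tmp:
    repo = ToyRepo(tmp)
    notes = Path(tmp) / "analysis.txt"
    notes.write_text("mean = 4.2\n")
    repo.commit("Ada", "Add first analysis result")
    notes.write_text("mean = 4.3\n")
    repo.commit("Ben", "Correct the mean")
    print(repo.log())                # two entries: who changed what
    print(repo.changed_files(0, 1))  # ['analysis.txt']
    repo.restore(0)
    print(notes.read_text())         # mean = 4.2
```

A real version control system like Git does far more (branches, merging, remotes, efficient storage), but the basic loop of "snapshot, describe, compare, restore" is the same idea.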

What is Git?

  • most popular version control system
  • free, open-source command-line tool
  • graphical user interfaces exist, e.g., GitKraken
  • standard tool for most (all?) software developers
  • 100 million GitHub users 6

Next week: The command line

Source: Wikimedia Commons (free license)

Homework

1. Check if you have Git / a command line installed

Windows Users

Apple or Linux Users
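
Independent of the operating system, one possible way to check for Git is sketched below. The sketch assumes Python 3 is already installed, which is not a course requirement, and the helper name `git_version` is made up for illustration. It looks for a `git` executable on the PATH and, if one is found, asks it for its version. Typing `git --version` directly into a command line performs the same check.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if Git is not available."""
    if shutil.which("git") is None:  # no executable called "git" on the PATH
        return None
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:  # an executable exists but did not run properly
        return None
    return result.stdout.strip()


version = git_version()
if version is None:
    print("Git was not found; it probably still needs to be installed.")
else:
    print(f"Found: {version}")
```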

2. Install RStudio & Quarto

If you have any problems with installation, please get in touch!

3. Complete the pre-course survey

References

Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454. https://doi.org/10.1038/533452a.
Crüwell, S., Apthorp, D., Baker, B. J., Colling, L., Elson, M., Geiger, S. J., Lobentanzer, S., Monéger, J., Patterson, A., Schwarzkopf, D. S., Zaneva, M., & Brown, N. J. L. (2023). What’s in a Badge? A Computational Reproducibility Investigation of the Open Data Badge Policy in One Issue of Psychological Science. Psychological Science, 34(4), 512–522. https://doi.org/10.1177/09567976221140828.
Hardwicke, T. E., Bohn, M., MacDonald, K., Hembacher, E., Nuijten, M. B., Peloquin, B. N., deMayo, B. E., Long, B., Yoon, E. J., & Frank, M. C. (2021). Analytic reproducibility in articles receiving open data badges at the journal Psychological Science: an observational study. Royal Society Open Science, 8(1). https://doi.org/10.1098/rsos.201494.
Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2016). Building machines that learn and think like people. Behavioral and Brain Sciences, 40. https://doi.org/10.1017/s0140525x16001837.
Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1(1). https://doi.org/10.1038/s41562-016-0021.
Obels, P., Lakens, D., Coles, N. A., Gottfried, J., & Green, S. A. (2020). Analysis of Open Data and Computational Reproducibility in Registered Reports in Psychology. Advances in Methods and Practices in Psychological Science, 3(2), 229–237. https://doi.org/10.1177/2515245920918872.
Poldrack, R. A. (2019). The Costs of Reproducibility. Neuron, 101(1), 11–14. https://doi.org/10.1016/j.neuron.2018.11.030.
Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. https://doi.org/10.1098/rsos.160384.
The Turing Way Community. (2022). The Turing Way: A handbook for reproducible, ethical and collaborative research. Zenodo. https://doi.org/10.5281/zenodo.3233853.
Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726–728. https://doi.org/10.1037/0003-066x.61.7.726.

Footnotes

  1. The Turing Way Community (2022), see “Guide on Reproducible Research”

  2. for example, in Psychology: Crüwell et al. (2023); Hardwicke et al. (2021); Obels et al. (2020); Wicherts et al. (2006)

  3. see Baker (2016), Nature

  4. see e.g., Poldrack (2019)

  5. see Smaldino & McElreath (2016)

  6. (Source: Wikipedia)

  7. pull requests on GitHub, merge requests on GitLab