Version control with git for scientists

PY Barriat & F. Massonnet

May 05, 2022
some parts inspired by slides from CISM

Discuss :speech_balloon:

How do you manage different file versions :question:

How do you work with collaborators on the same files :question:


Notions of code versioning

Track the history and evolution of the project

think of it as a series of snapshots (commits) of your code


  • possibility to go back in time :calendar: > tracking bugs > recovering from mistakes
  • Information about the modification :clipboard: > who, when, why

  • teamwork :busts_in_silhouette:
    • simultaneous work on a project > no need to send an email saying "I'm working on that file" (Dropbox-style organization)
    • asynchronous synchronisation > allows working offline (unlike an Overleaf project) > requires conflict resolution

Different usage

  • local
  • client-server (Subversion)
  • distributed (Git)


Testing new ideas (and an easy way to throw them away) :construction:

Multiple versions of the code

  • Stable (1.x.y)
  • Debug (1.x.y+1)
  • Next "feature" release (1.x+1.0)
  • Next "huge" release (2.0.0)

Open-Source Code

Compare Repositories

What is git ?

Version control system

  • Manage different versions of files
  • Collaborate with yourself
  • Collaborate with other people

Why use git

"Always remember your first collaborator is your future self, and your past self doesn't answer emails" Christie Bahlai :wink:

git workflow

Your local repository consists of three areas maintained by git

  • the first one is your Working Directory which holds the actual files
  • the second one is the INDEX which acts as a staging area
  • and finally the HEAD which points to the last commit you've made

Getting started with git

checkout a remote repository

create a local working copy of a remote repository

git clone

add & commit

you can propose changes (add them to the INDEX)

git add <filename>

you can commit these changes (to the HEAD)

git commit -m "Commit message"
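To watch a file move through the three areas, here is a minimal sketch in a throwaway repository. The file name `params.txt`, the identity and the commit message are made up for illustration; `git status --porcelain` is used instead of plain `git status` only because its output is stable in scripts.

```shell
set -eu
repo=$(mktemp -d)                 # throwaway directory, safe to delete afterwards
cd "$repo"
git init -q
git config user.name "Demo User"            # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "temperature = 15" > params.txt
in_workdir=$(git status --porcelain)     # "?? params.txt": only in the working directory

git add params.txt
in_index=$(git status --porcelain)       # "A  params.txt": staged in the INDEX

git commit -q -m "Add model parameters"
after_commit=$(git status --porcelain)   # empty: working directory, INDEX and HEAD agree

echo "untracked: $in_workdir"
echo "staged:    $in_index"
echo "committed: ${after_commit:-nothing left to commit}"
```

Running `git status` (without `--porcelain`) at each step gives the same information in a more verbose, human-friendly form.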


git versioning is a succession of snapshots of your files at key times in their development

each snapshot is called a commit, which consists of:

  • all the files at a given time
  • a unique name (SHA-1 hash)
  • metadata > who created it, when, and a message
  • pointer(s) to the previous commit(s)
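The four ingredients of a commit listed above can be inspected directly. The sketch below builds two commits in a throwaway repository (file name, identity and messages are hypothetical) and then looks inside the latest one with `git cat-file -p`.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First snapshot"
echo "v2" >> notes.txt
git add notes.txt
git commit -q -m "Second snapshot"

commit_id=$(git rev-parse HEAD)          # unique name of the latest commit
parent_id=$(git rev-parse HEAD~1)        # the commit it is based on
files=$(git ls-tree --name-only HEAD)    # files contained in the snapshot

git cat-file -p HEAD    # prints the tree, parent, author, committer and message
```

The `tree` line refers to the snapshot of all files, the `parent` line is the pointer to the previous commit, and the `author` / `committer` lines carry the who and when.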

git diff

    participant Workspace
    participant INDEX
    %%Note right of Workspace: Text in note
    Workspace-->INDEX: git diff
    INDEX-->HEAD: git diff --cached
    Workspace-->HEAD: git diff HEAD
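The three comparisons in the diagram can be tried on a small example. This is a sketch in a throwaway repository; `data.txt` and its contents are invented so that each area holds a different version of the file.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "line 1" > data.txt
git add data.txt
git commit -q -m "Initial data"

echo "line 2" >> data.txt
git add data.txt              # "line 2" is now in the INDEX
echo "line 3" >> data.txt     # "line 3" exists only in the workspace

workspace_vs_index=$(git diff)          # shows only "+line 3"
index_vs_head=$(git diff --cached)      # shows only "+line 2"
workspace_vs_head=$(git diff HEAD)      # shows both added lines

echo "$workspace_vs_head"
```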

git undo

In case you did something wrong (which for sure never happens :wink:)

    participant Workspace
    participant INDEX
    Note over Workspace,INDEX: wrong modification of a file <br/>in your workspace
    INDEX->>Workspace: git checkout -- file
    %%HEAD-->Workspace: .<br/>
    Note over Workspace,HEAD: wrong modification of a file <br/>that you put in your index
    HEAD->>INDEX: git reset HEAD file
    INDEX->>Workspace: git checkout -- file
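Both undo scenarios from the diagram can be rehearsed safely in a throwaway repository. In this sketch the file name `file.txt` and its contents are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "good" > file.txt
git add file.txt
git commit -q -m "Known good state"

# Case 1: wrong modification, still only in the workspace
echo "oops" > file.txt
git checkout -- file.txt          # restore the file from the INDEX
case1=$(cat file.txt)

# Case 2: wrong modification that was already added to the INDEX
echo "oops again" > file.txt
git add file.txt
git reset -q HEAD file.txt        # INDEX goes back to the HEAD version
git checkout -- file.txt          # workspace goes back to the INDEX version
case2=$(cat file.txt)

echo "case 1: $case1, case 2: $case2"
```

Git 2.23 and later also provide `git restore file` and `git restore --staged file` for the same two steps.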

Windows users

How commonly do programmers use Git GUIs instead of the command line ?

Use programs like SourceTree or TortoiseGit

But to become familiar with Git, try the command line

clone, push/pull, merge, rebase, log, tag, format-patch/am, bisect, blame, etc

git branches

  • a branch is a pointer to a commit (it represents a history)
  • a branch can point to another commit > it can move !
  • a branch is a way to organize your work and working histories
  • since each commit knows which commits it is based on, a branch represents a commit and everything that came before it
  • a branch is cheap: you can have multiple branches in the same repository and switch your working directory from one branch state to another
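That a branch is just a movable pointer can be checked by resolving the branch name before and after a commit. A minimal sketch, with made-up file name and messages:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "a" > f.txt
git add f.txt
git commit -q -m "First"

branch=$(git symbolic-ref --short HEAD)   # name of the current branch
first=$(git rev-parse "$branch")          # commit the branch points to now

echo "b" >> f.txt
git commit -q -am "Second"
second=$(git rev-parse "$branch")         # the branch has moved to the new commit
parent=$(git rev-parse "$branch~1")       # and the old commit is its parent

echo "$branch: $first -> $second"
```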

branches demo

git commit
git checkout -b newbranch
git checkout newbranch
git commit
git commit
git checkout master
git commit
git commit

default branch: master

graph LR;
    A[fffc93b] -->|commit| B(fc7f81f)
    B -->|commit| D(6fd1a5a)
    D -->|commit| E[newbranch <br/>187e6ab ]
    B ---->|commit| Z(6ff4c2e)
    Z -->|commit| Y[master <br/>c0f502f ]

  • create a new branch : git checkout -b newbranch
  • switch to a branch : git checkout newbranch
  • delete a branch : git branch -d newbranch
  • list all branches : git branch -a > see both local and remote branches
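The four commands above, put together in a throwaway repository. This sketch assumes a file called `model.txt`, forces the default branch name to `master` to match the slides, and also shows that `git branch -d` refuses to delete a branch whose commits have not been merged.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master" as in the slides
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "base" > model.txt
git add model.txt
git commit -q -m "Base version"

git checkout -q -b newbranch              # create the branch and switch to it
echo "experiment" >> model.txt
git commit -q -am "Try a new idea"

git checkout -q master                    # switch back: the experiment is not visible here
on_master=$(cat model.txt)

git branch -a                             # list branches; "*" marks the current one

# -d is the safe delete: it refuses because newbranch was never merged
git branch -d newbranch 2>/dev/null || echo "newbranch is not merged, -d refuses"
git branch -D newbranch >/dev/null        # -D throws the experiment away for good
```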

branches are cheap : create them often :+1:

branches allow short- and long-term parallel development

merging branches

the point of branches is that you can merge them

bring into one branch the modifications made in another

git merge bx
git branch -d bx
git commit
graph LR;
    A(fffc93b<br/>b_x) -->E[187e6ab<br/>b_x merged <br/>in b_y] 
    D[6fd1a5a<br/>b_y] -->E
    E -->|commit| Y[c0f502f<br/>b_y ]
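A merge like the one in the graph can be reproduced as follows. In this sketch `master` plays the role of b_y and `bx` the role of b_x; the file names are hypothetical, and the two branches touch different files so the merge succeeds without conflict.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master" as in the slides
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "intro" > paper.txt
git add paper.txt
git commit -q -m "Start paper"

git checkout -q -b bx
echo "methods" > methods.txt
git add methods.txt
git commit -q -m "Add methods"

git checkout -q master
echo "results" > results.txt
git add results.txt
git commit -q -m "Add results"

git merge -q --no-edit bx                 # both histories diverged: creates a merge commit
git branch -d bx >/dev/null               # safe now, bx is fully merged

parents=$(git cat-file -p HEAD | grep -c '^parent ')   # a merge commit has two parents
echo "merge commit has $parents parents"
```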

Difference between git & GitHub ?

git is the version control system

git runs locally, even if you don't use GitHub

GitHub is the hosting service : website

on which you can publish (push) your git repositories and collaborate with other people


  • It provides a backup of your files
  • It gives you a visual interface for navigating your repos
  • It gives other people a way to navigate your repos
  • It makes repo collaboration easy (e.g., multiple people contributing to the same project)
  • It provides a lightweight issue tracking system

... and GitLab vs GitHub vs others

GitLab is an alternative to GitHub

GitLab is free for unlimited private projects. GitHub has also offered free private repositories since January 2019

And for ELIC, Gogs does the job:

  • shares the same features > dashboard, file browser, issue tracking, groups support, webhooks, etc
  • easy to install, cross-platform friendly
  • uses little memory, uses little CPU power
  • ... and 100% free :smile:

What is git good for ?


Backup, reproducibility


Backup, reproducibility, collaboration

Your changes are now in the HEAD of your local working copy.


to send those changes to your remote repository

git push

pull to fetch and merge remote changes, which updates your local working directory to the newest commit

git pull
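Push and pull can be tried without any server: a bare repository on disk can stand in for the remote. In the sketch below the directory names `remote.git`, `alice` and `bob`, the identity and the file `log.txt` are all hypothetical.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"               # stands in for the remote repository
git clone -q "$work/remote.git" "$work/alice" 2>/dev/null

cd "$work/alice"
git config user.name "Alice"                        # hypothetical identity
git config user.email "alice@example.com"
echo "first" > log.txt
git add log.txt
git commit -q -m "First entry"
git push -q -u origin HEAD                          # first push: also sets the upstream branch

git clone -q "$work/remote.git" "$work/bob"         # a collaborator gets a working copy

echo "second" >> log.txt
git commit -q -am "Second entry"
git push -q                                         # send the new commit to the remote

cd "$work/bob"
git pull -q                                         # fetch + merge: bob catches up
alice_head=$(git -C "$work/alice" rev-parse HEAD)
bob_head=$(git rev-parse HEAD)
echo "alice: $alice_head"
echo "bob:   $bob_head"
```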

git conflict :boom:

multiple versions of files are great

  • it is not always easy to know how to merge them
  • conflicts will happen (same line modified by both users)

conflicts need to be resolved manually ! :fearful:

  • boring task
  • you need to understand why a conflict is present !
  • do not be afraid of conflicts ! :muscle: > do not try to avoid them at all costs !
  • stay in sync as much as possible and keep lines short
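A conflict is easy to provoke on purpose, which is a good way to lose the fear of it. In this sketch two branches change the same line of a hypothetical `config.txt`; the branch name `colleague` and the values are invented, and the "manual" resolution is simulated by overwriting the file.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master" as in the slides
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "resolution = 1 degree" > config.txt
git add config.txt
git commit -q -m "Initial resolution"

git checkout -q -b colleague
echo "resolution = 0.25 degree" > config.txt
git commit -q -am "Use a finer grid"

git checkout -q master
echo "resolution = 2 degrees" > config.txt
git commit -q -am "Use a coarser grid"

# Same line modified on both sides: the merge stops and asks for help
if git merge -q --no-edit colleague >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi
unmerged=$(git diff --name-only --diff-filter=U)   # files that still contain conflicts
markers=$(grep -c '^<<<<<<<' config.txt)           # git wrote conflict markers into the file

# Resolve: edit the file to the version you want, then add and commit
echo "resolution = 0.25 degree" > config.txt
git add config.txt
git commit -q -m "Merge colleague: keep the finer grid"

echo "merge result: $merge_result, conflicted file: $unmerged"
```

In real work the resolution step means opening the file, choosing between the text after `<<<<<<<` and the text after `=======`, and deleting the markers.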


Backup, reproducibility, collaboration, transparency

    participant Workspace
    participant INDEX
    participant HEAD
    participant Remote Repository
    Remote Repository->>HEAD: clone
    HEAD->>Workspace: checkout
    %%Note over Workspace,Remote Repository: "clone" = clone + checkout
    Workspace->>INDEX: add
    INDEX->>HEAD: commit
    Remote Repository->>HEAD: fetch
    HEAD->>Workspace: merge
    Remote Repository->>Workspace: pull
    %%Note over Workspace,Remote Repository: pull = fetch + merge
    HEAD->>Remote Repository: push


  • versioning is crucial for both small and large projects :exclamation:
  • avoid Dropbox for papers / projects :confounded:
  • make meaningful commits
  • write meaningful commit messages
  • git is more complicated, but it is the standard :smiley:

Simple Git Exercises

First, configure your environment (just once) :construction:

on your laptop, on your ELIC account, etc

git config --global user.name "Your Name"
git config --global user.email "<your email address>"
git config --global color.ui auto
git config --global core.editor "vim"

git config --list
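To experiment with these settings without touching your real configuration, one option is to point `HOME` at a temporary directory first. The sketch below does that (the sandbox, the name "Project Alias" and the address are placeholders) and also shows that a setting made inside a repository, without `--global`, overrides the global one for that repository only.

```shell
set -eu
sandbox=$(mktemp -d)
HOME="$sandbox"                            # sandbox: --global now writes under this directory
export HOME
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL    # make sure no other global config file is used

git config --global user.name "Your Name"
git config --global user.email "you@example.org"   # placeholder address
git config --global color.ui auto
global_name=$(git config --global --get user.name)

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Project Alias"       # local setting, stored in this repository only
effective_name=$(git config --get user.name)

echo "global:    $global_name"
echo "this repo: $effective_name"
```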

Now, clone

These are very simple exercises to learn how to use git. In each folder, simply run ./ and follow the guide :sunglasses:

Version control with Git for scientists :chart_with_upwards_trend: