Workflow for writing theses collaboratively
by Tullio Facchinetti
In the last few years I have supervised tens of students (I keep an updated list here). In some periods I had to manage the concurrent writing of 5-6 theses. This heavily parallel work led me to develop a specific workflow for managing the collaborative writing of a thesis, which I find rather effective and which I share in this article.
Well, to be more precise, in the mentioned “collaborative writing work” most of the writing is done by the student, while my job is mainly to give indications on the thesis organization and expected content, to revise everything, and to provide feedback and updates (this is the role of a supervisor, isn’t it? :-)).
The workflow is built around the following tools:
- LaTeX, to actually typeset the text - there is a template to avoid starting completely from scratch.
- git and GitLab (or a similar service such as GitHub), to keep track of the changes to the thesis and to share them.
On GitHub there is the thesis-templatex repository that contains the template for a thesis document. The frontpage is the standard for theses at the University of Pavia.
Over the years, I found that writing relatively long documents that include tables, images with captions, equations and references can be a nightmare with MS Word or LibreOffice Writer, due to the endless adjustments required by every minor change to the document. They are plainly not adequate to the task.
The best option for writing a long document with consistent formatting, while coping with all the needs of a thesis, is LaTeX. I do not go into the details of the most common tips and tricks for writing a thesis here: I wrote a dedicated article on that matter.
I just report a summary of the most important reasons that make LaTeX the best option for the task:
- It is based on text files, which are easy to edit and keep under versioning.
- The main document can be split into separate files, one file per chapter, and included with \input; this makes the collaboration easier since it can be coordinated to avoid conflicts due to independent editing of the same file.
- It (mostly) painlessly handles the positioning of images with their captions.
- References are straightforward with the use of \cite for the bibliography.
- Equations are rendered with the highest quality among all the available tools.
- The use of comments makes it possible to include feedback and notes directly within the source file.
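As a rough illustration of the "one file per chapter" organization, the sketch below lists the chapter files that a main document pulls in with \input. The main file shown here is hypothetical (it is not the actual template), and the chapter file names other than chapter_intro.inc.tex are made up for the example.

```python
import re

# Hypothetical main file of a thesis: one \input per chapter.
# This is NOT the real template; the file names are illustrative.
MAIN_TEX = r"""
\documentclass{report}
\begin{document}
\input{chapter_intro.inc.tex}
\input{chapter_methods.inc.tex}
% \input{chapter_results.inc.tex}  % not written yet
\bibliography{references}
\end{document}
"""

INPUT_RE = re.compile(r"\\input\{([^}]+)\}")


def strip_comment(line: str) -> str:
    """Drop everything after the first unescaped % (a LaTeX comment)."""
    match = re.search(r"(?<!\\)%", line)
    return line if match is None else line[: match.start()]


def included_files(tex_source: str) -> list[str]:
    """Return the files pulled in by \\input, ignoring commented-out lines."""
    files = []
    for line in tex_source.splitlines():
        files.extend(INPUT_RE.findall(strip_comment(line)))
    return files


print(included_files(MAIN_TEX))
```

Since each chapter lives in its own file, assigning a chapter to one person at a time is enough to keep independent edits from touching the same file.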
The student does not need to be a LaTeX ninja. A template is made available with the most common setup for a thesis. Thus editing the document becomes a matter of simply writing the text in the appropriate place, plus a few specific commands to deal with images, equations, references, tables, etc. The template includes examples of such commands, which makes the inclusion of these elements a matter of copy-and-paste with the necessary minor adaptations.
When there is the need to draw high quality vector images and graphs, I suggest the use of Inkscape. Sometimes, to draw flowcharts, I use yEd. On rare occasions, I have used TikZ, which is less user-friendly and may require some effort to obtain the desired results, especially if the picture is complex. Still, its “coding-oriented” approach is very stimulating.
All these tools fit very well in the process of writing a thesis with LaTeX.
git and GitLab
git is nowadays the most popular tool for code versioning and code sharing. It powers services such as GitHub, GitLab, SourceHut, and many others.
By using a central repository to host the thesis, git allows sharing of the whole directory containing .tex files, images, and all the other necessary material.
With some care to avoid concurrent independent editing of the same file, changes are always merged automatically. In many cases, git can even merge concurrently edited files, as long as there are no “collisions”. A collision (a merge conflict, in git terminology) occurs when different authors change the same part of the text, so that git cannot automatically decide which change to keep in the next version of the document.
A feature that I find especially useful is the possibility to check the word-by-word diff between two versions of the same file. I leverage this feature to find the changes made by students since the latest version of the file that I provided.
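To make the notion of a collision concrete, here is a toy three-way merge written in Python. It is only a simplified model, not how git is implemented: it assumes that both authors edited lines in place, so that all three versions have the same number of lines, whereas git works on hunks and also handles inserted and deleted lines. The function name and the sample text are invented for the example.

```python
def merge_lines(base, mine, theirs):
    """Toy three-way merge of equally long lists of lines.

    Returns (merged, collisions), where collisions holds the indexes of
    the lines that both authors changed in different ways.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, collisions = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # same text on both sides (changed or not)
            merged.append(m)
        elif m == b:      # only the other author changed this line
            merged.append(t)
        elif t == b:      # only I changed this line
            merged.append(m)
        else:             # both changed it differently: collision
            merged.append(b)
            collisions.append(index)
    return merged, collisions


base = ["Intro.", "Method.", "Results."]
student = ["Intro.", "Our method.", "Results."]
professor = ["Introduction.", "Method.", "Results."]

# Edits to different lines merge cleanly.
print(merge_lines(base, professor, student))

# Edits to the same line cannot be sorted out automatically.
print(merge_lines(base, ["Intro.", "The method.", "Results."], student))
```

In the second call both versions rewrite the second line, so the sketch keeps the base text and reports index 1 as a collision, which is the situation the workflow tries to avoid by assigning each file to one person at a time.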
For this purpose, I first run git log and find my last commit. E.g.:
$ git log
...
commit c129ac2735b89fbdb6d6a81621e8c04d0947c04f
Author: Tullio Facchinetti <email@example.com>
Date:   Sun Dec 26 13:16:14 2021 +0100

    Minor rephrasing of introduction chapter.
...
Once I get the hash of my last change, I can inspect the modifications introduced by the student with the command:
git diff --word-diff c129ac27 chapter_intro.inc.tex
c129ac27 is the (abbreviated) hash of the commit with my last changes to the file, and chapter_intro.inc.tex is the file that I want to examine.
This makes it easy to check what the student has changed in that file.
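The two manual steps (finding my last commit, then building the diff command) could be scripted. The following is a minimal sketch, assuming the default output format of git log; the helper names are invented, and the first commit in the sample log is made up for illustration. The sketch only parses text and builds the command, it does not run git.

```python
import re

# Sample `git log` output. The second entry is the one shown above;
# the first entry (a later commit by a student) is invented.
SAMPLE_LOG = """\
commit 9f1c2d3e4b5a69788796a5b4c3d2e1f001122334
Author: Student Name <student@example.com>
Date:   Mon Dec 27 10:02:51 2021 +0100

    Added related work section.

commit c129ac2735b89fbdb6d6a81621e8c04d0947c04f
Author: Tullio Facchinetti <email@example.com>
Date:   Sun Dec 26 13:16:14 2021 +0100

    Minor rephrasing of introduction chapter.
"""


def last_commit_by(log_text, author):
    """Return the hash of the newest commit whose Author line contains `author`.

    `git log` lists the newest commit first, so the first match wins.
    """
    current = None
    for line in log_text.splitlines():
        match = re.match(r"commit ([0-9a-f]{40})", line)
        if match:
            current = match.group(1)
        elif line.startswith("Author:") and author in line and current:
            return current
    return None


def word_diff_command(commit, path):
    """Build the diff command; `--` separates the revision from the path."""
    return ["git", "diff", "--word-diff", commit, "--", path]


commit = last_commit_by(SAMPLE_LOG, "Tullio Facchinetti")
print(" ".join(word_diff_command(commit[:8], "chapter_intro.inc.tex")))
```

In a real repository the log text could come from subprocess.run(["git", "log"], capture_output=True, text=True).stdout, and the resulting command list could be passed to subprocess.run as well.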
The iterative process of writing and revising a thesis can take several weeks. When there are many theses to supervise at the same time, it can be hard to track the progress on all of them.
For this reason, the repository with the LaTeX template includes a file named TRACKING.md that is used to keep track of the status of development of each chapter in the thesis.
The file is versioned together with the rest of the repository.
Besides some notes regarding the structure of the thesis, which are written during the planning with the student, the most important part of the file is a table. It has a row for each chapter, reporting the title of the chapter and the filename of the LaTeX file containing the text.
Two columns of the table are especially important for tracking the work on the thesis: the status and the responsibility.
The status can take one of the following values:
- WRITING: Initial writing; the contents still have to be completed by the student.
- REVIEW: Contents are complete; nothing more to add by the student; chapter ready to be revised (or under revision).
- REVISED: Review done by the Professor; there are observations and comments to address.
- UPDATE: The student is addressing the comments.
- FINAL: The chapter is completed; no more changes are required.
The responsibility tells who has the right to edit the chapter file. The files include the rules to update the status. Basically, the idea is that only one person at a time is in charge of editing a given file, either the student or myself. This minimizes the chance of conflicts due to concurrent editing of the same file.
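When several theses are open at once, the table can also be read by a small script, for example to list the chapters that are waiting for my review. The sketch below assumes a Markdown table with five columns (chapter, file, status, responsibility, version); this layout, the sample rows and the helper names are hypothetical and may differ from the actual TRACKING.md in the template.

```python
from dataclasses import dataclass

# Hypothetical excerpt of a TRACKING.md table; the real layout may differ.
SAMPLE_TRACKING = """\
| Chapter      | File                    | Status  | Responsibility | Version |
|--------------|-------------------------|---------|----------------|---------|
| Introduction | chapter_intro.inc.tex   | REVIEW  | Professor      | 2       |
| Methods      | chapter_methods.inc.tex | WRITING | Student        | 1       |
"""


@dataclass
class ChapterRow:
    title: str
    filename: str
    status: str
    responsibility: str
    version: int


def parse_tracking(markdown):
    """Parse the assumed five-column table into ChapterRow objects."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip malformed rows, the header and the |---|---| separator.
        if len(cells) != 5 or cells[0] == "Chapter" or set(cells[0]) <= {"-", ":"}:
            continue
        rows.append(ChapterRow(cells[0], cells[1], cells[2], cells[3], int(cells[4])))
    return rows


def files_for(rows, who):
    """Return the chapter files currently under the responsibility of `who`."""
    return [row.filename for row in rows if row.responsibility == who]


rows = parse_tracking(SAMPLE_TRACKING)
print(files_for(rows, "Professor"))
```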
At the beginning, a template of the LaTeX repository is shared with the student through GitLab.
The student starts writing the thesis. All the chapters are under his/her responsibility and in WRITING state.
As soon as one chapter is deemed complete by the student, so that he/she does not need to make any new change, he/she sends me a notification by email. The status becomes REVIEW and the responsibility becomes mine.
Once the review is completed, I notify the student.
The .tex file is updated with my changes.
Moreover, I typically insert comments, notes and observations to be addressed by the student.
I prepend every comment with the “keyword” toOl, so that they can be easily searched in the text.
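A plain text search in the editor is enough to find these comments. As an alternative, the short sketch below collects them with their line numbers. It assumes that a review comment is a LaTeX comment (introduced by %) that starts with the keyword, optionally followed by a colon; this exact format and the sample text are assumptions made for the example.

```python
import re

KEYWORD = "toOl"


def find_review_comments(tex_source, keyword=KEYWORD):
    """Return (line_number, text) for each LaTeX comment starting with keyword."""
    # (?<!\\)% matches a comment sign that is not an escaped \% in the text.
    pattern = re.compile(r"(?<!\\)%\s*" + re.escape(keyword) + r"\b[:\s]*(.*)")
    found = []
    for number, line in enumerate(tex_source.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            found.append((number, match.group(1).strip()))
    return found


# Invented chapter excerpt with one review comment and one student answer.
SAMPLE = "\n".join([
    r"The proposed method improves the accuracy by 50\%.",
    "% toOl: cite the source of this number",
    "% ok, added the reference",
    "This line has no comment.",
])

for number, text in find_review_comments(SAMPLE):
    print(number, text)
```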
At this point, the responsibility goes back to the student, the status becomes REVISED, and the version number is increased.
I require the student to address my comments.
The student is asked to add a short comment him/herself to explain what was done.
If the change is trivial, a simple ok is enough to understand that he/she did not miss my observation.
In case there are doubts, the student can simply write an “answer” to my comment, directly in the .tex file, below my comment.
An important aspect is that the student must not remove any comment. It is my responsibility to remove the comments once I have checked that they were addressed properly. This avoids missing some of the observations made during the review.
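The "do not remove comments" rule can be checked mechanically by comparing the version I handed back with the version updated by the student. The sketch below is one possible way to do it, under the assumption that a review comment is a line starting with % that contains the keyword; the function names and the sample texts are invented.

```python
from collections import Counter

KEYWORD = "toOl"


def review_comments(tex_source, keyword=KEYWORD):
    """Count the comment lines (starting with %) that contain the keyword."""
    return Counter(
        line.strip()
        for line in tex_source.splitlines()
        if line.strip().startswith("%") and keyword in line
    )


def removed_comments(reviewed, updated):
    """Return the review comments present in `reviewed` but missing in `updated`."""
    missing = review_comments(reviewed) - review_comments(updated)
    return sorted(missing.elements())


reviewed = "\n".join([
    "Some text.",
    "% toOl: explain the acronym",
    "More text.",
    "% toOl: add a reference",
])
updated = "\n".join([
    "Some text with the acronym explained.",
    "% toOl: explain the acronym",
    "% ok",
    "More text.",
])

print(removed_comments(reviewed, updated))
```

Here the second comment has disappeared from the updated version, so it is reported; the student's own answers (such as ok) are ignored because they do not contain the keyword.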
As can be expected, it may take a couple of iterations of REVIEW/REVISED. Once all the chapters are marked as FINAL, the thesis is completed.
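The life cycle of a chapter can be summarized as a small state machine. The transitions and the mapping from status to responsible person below are my reading of the description above, not rules copied from TRACKING.md, so they should be taken as assumptions (in particular, the direct step from REVIEW to FINAL and the absence of an owner for FINAL chapters).

```python
from enum import Enum


class Status(Enum):
    WRITING = "WRITING"
    REVIEW = "REVIEW"
    REVISED = "REVISED"
    UPDATE = "UPDATE"
    FINAL = "FINAL"


# Assumed transitions, inferred from the workflow described in the text.
ALLOWED = {
    Status.WRITING: {Status.REVIEW},
    Status.REVIEW: {Status.REVISED, Status.FINAL},
    Status.REVISED: {Status.UPDATE},
    Status.UPDATE: {Status.REVIEW},
    Status.FINAL: set(),
}

# Assumed owner of the chapter file in each status.
RESPONSIBLE = {
    Status.WRITING: "student",
    Status.REVIEW: "professor",
    Status.REVISED: "student",
    Status.UPDATE: "student",
    Status.FINAL: None,
}


def advance(current: Status, new: Status) -> Status:
    """Move a chapter to a new status, rejecting transitions not listed above."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {new.value}")
    return new


def thesis_completed(statuses: list[Status]) -> bool:
    """The thesis is completed once every chapter is FINAL."""
    return bool(statuses) and all(status is Status.FINAL for status in statuses)


# One chapter going through a full review round before being closed.
status = Status.WRITING
for step in (Status.REVIEW, Status.REVISED, Status.UPDATE, Status.REVIEW, Status.FINAL):
    status = advance(status, step)
    print(status.value, "->", RESPONSIBLE[status])
```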
The workflow and the simple tools involved in its implementation are not only for tech-savvy students: they suit anyone with a basic knowledge of the tools.
git is probably the trickiest tool to use. However, on the one hand only its most basic functionalities are used, and on the other hand there are plenty of tutorials, even for beginners, on how to use it.
LaTeX requires more effort than MS Word or LibreOffice Writer to get started, but I am convinced that the initial effort pays off quite a lot during the writing and the cooperation on the thesis.