The first Workshop on Reproducible Software Environments for Research
and High-Performance Computing (HPC) will take place in Montpellier,
France, November 8–10.
Attendance is free but we kindly invite you to register.
The program features talks by scientists, engineers, and system
administrators from different backgrounds who will share their
experience with GNU Guix, as well as tutorials on GitLab, Guix, and other
tools that support scientific workflows, from bioinformatics analyses to HPC and
source code archival.
We wish to make this workshop a hub for scientists and practitioners,
newcomers and experts.
See you in Montpellier!
For those who couldn’t make it to Montpellier, check out the live stream starting later today!
I must admit I first dismissed this workshop since I don’t have anything to do with HPC, so it was a very nice surprise that:
- the workshop is live-streamed (OK, I’m not a big fan of YouTube, and apparently no one is reading the chat, so it’s not really interactive, but this allowed me to watch Wednesday’s talks, which happened while I was teaching, so, pretty nice! The camera/streaming work is also great, thanks to the local organizers);
- the workshop is not so much about HPC but much more general;
- there are great discussions about vocabulary, and not everyone agrees;
- there is a lot about Guix but also about other ways to obtain/enforce reproducibility.
However, on that last topic, I’m a bit puzzled that a version control system, a forge, a continuous integration toolchain, Jupyter notebooks, Binder, etc. are presented as essential tools for reproducibility.
Don’t get me wrong, I love and use these tools on a daily basis. I also liked most of what I’ve seen from the talks/tutorials.
But a shell script with a controlled environment is much more reproducible than a Jupyter notebook running on Binder in a Docker image built by GitLab CI if, for instance, anything inside that stack is not properly version-pinned…
I feel that we’re mixing good software engineering practices (which are good, of course) with what really makes up the "reproducibility" part. That part is indeed what Guix helps solve (among other things), as was demonstrated in several talks. But it can also be done, at least as far as possible, in other environments, and I would have expected many more detailed practical sessions on what to do and what to avoid when you have to work with, say, Conda/Docker/Nix-darwin/etc. (e.g. always pin the Docker image you use in FROM, never RUN <package manager> update inside that Docker image, etc.).
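To make those two Docker rules a bit more concrete, here is a minimal sketch in Python of a check that flags them in a Dockerfile. Everything in it is an assumption for illustration: the function name `check_dockerfile`, the list of package managers, and the choice to treat only digest-pinned images (`name@sha256:...`) as pinned are mine, not something shown at the workshop. It is also deliberately naive: it does not follow line continuations, and it will wrongly flag a `FROM` that refers to an earlier build stage.

```python
import re

# Hypothetical pattern: a package manager followed by "update" or "upgrade",
# optionally with flags in between (e.g. "apt-get -y upgrade").
UPDATE_RE = re.compile(
    r"\b(apt-get|apt|apk|yum|dnf)\s+(-\S+\s+)*(update|upgrade)\b"
)


def check_dockerfile(text):
    """Return a list of (line_number, message) warnings for Dockerfile text."""
    warnings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        instruction = parts[0].upper()
        if instruction == "FROM":
            # Ignore options such as --platform=... and take the image name.
            args = [p for p in parts[1:] if not p.startswith("--")]
            image = args[0] if args else ""
            # Assumption: only a content digest counts as pinned; a tag such
            # as "python:3.11" can be re-pointed to a different image later.
            if image != "scratch" and "@sha256:" not in image:
                warnings.append((number, f"base image not pinned by digest: {image}"))
        elif instruction == "RUN" and UPDATE_RE.search(line):
            warnings.append(
                (number, "package index update makes the build depend on the build date")
            )
    return warnings


if __name__ == "__main__":
    sample = "FROM python:3.11\nRUN apt-get update && apt-get install -y git\n"
    for number, message in check_dockerfile(sample):
        print(f"line {number}: {message}")
```

Passing such a check does not make an image reproducible by itself (a pinned base image can still download unpinned things at build time), but it catches the two mistakes mentioned above.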
Finally, I think some talks disappeared from the schedule (this morning, Thursday), and from the YouTube channel it was quite difficult to understand what was happening ^^
Still, great workshop, wish I were with you all!