University of Sussex

Ten simple rules for writing Dockerfiles for reproducible data science

journal contribution
posted on 2024-10-02, 10:50, authored by D Nust, V Sochat, B Marwick, SJ Eglen, T Head, T Hirst, Benjamin Evans
Computational science has been greatly improved by the use of containers for packaging software and data dependencies. In a scholarly context, the main drivers for using these containers are transparency and support of reproducibility; in turn, a workflow's reproducibility can be greatly affected by the choices that are made with respect to building containers. In many cases, the build process for the container's image is created from instructions provided in a Dockerfile format. In support of this approach, we present a set of rules to help researchers write understandable Dockerfiles for typical data science workflows. By following the rules in this article, researchers can create containers suitable for sharing with fellow scientists, for including in scholarly communication such as education or scientific papers, and for effective and sustainable personal workflows.

Funding

PE 1632/17-1

History

Publication status

  • Published

File Version

  • Published version

Journal

PLoS Computational Biology

ISSN

1553-734X

Publisher

Public Library of Science (PLoS)

Issue

11

Volume

16

Page range

e1008316-

Article number

ARTN e1008316

Department affiliated with

  • Informatics Publications

Institution

University of Sussex

Full text available

  • Yes

Peer reviewed?

  • Yes

Editors

Markel S