
Next: Paper organization Up: Introduction Previous: Reproducible research philosophy

Tools for reproducible research

The reproducible research system developed at Stanford is based on ``make'' (Stallman et al., 2004), a Unix software construction utility. Originally, SEP used ``cake'', a dialect of ``make'' (Claerbout and Karrenbach, 1993; Nichols and Cole, 1989; Claerbout, 1992b; Claerbout and Nichols, 1990). The system was converted to ``GNU make'', a more standard dialect, by Schwab and Schroeder (1995). The ``make'' program keeps track of dependencies between different components of the system and the software construction targets, which, in the case of a reproducible research system, turn into figures and manuscripts. The targets and the commands for their construction are specified by the author in ``makefiles'', which serve as databases for defining source and target dependencies. A dependency-based system leads to rapid development, because when one of the sources changes, only the parts that depend on this source get recomputed. Buckheit and Donoho (1995) based their system on MATLAB, a popular integrated development environment produced by MathWorks (Sigmon and Davis, 2001). While MATLAB is an adequate tool for prototyping numerical algorithms, it may not be sufficient for the large-scale computations typical of many applications in computational geophysics.
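The dependency-based rebuilding described above can be illustrated with a minimal Python sketch. It is not the SEP system or ``make'' itself: the class name MiniMake, the file names, and the use of a logical counter in place of file modification times are all assumptions made for illustration. The sketch only shows the core idea that a target is rebuilt when it is missing or when one of its sources is newer.

```python
from typing import Callable, Dict, List, Tuple


class MiniMake:
    """Toy dependency tracker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # target -> (list of sources, action that would construct the target)
        self.rules: Dict[str, Tuple[List[str], Callable[[], None]]] = {}
        # Logical modification times; real make compares file timestamps.
        self.stamp: Dict[str, int] = {}
        self.clock = 0
        # Names of the targets that were rebuilt, in order.
        self.log: List[str] = []

    def touch(self, name: str) -> None:
        """Mark a file as modified now."""
        self.clock += 1
        self.stamp[name] = self.clock

    def rule(self, target: str, sources: List[str],
             action: Callable[[], None] = lambda: None) -> None:
        """Declare that target is constructed from sources by action."""
        self.rules[target] = (list(sources), action)

    def build(self, target: str) -> None:
        """Bring target up to date, rebuilding only what is out of date."""
        if target not in self.rules:
            # A plain source file: it must already exist.
            if target not in self.stamp:
                raise FileNotFoundError(target)
            return
        sources, action = self.rules[target]
        for source in sources:
            self.build(source)  # update intermediate targets first
        out_of_date = target not in self.stamp or any(
            self.stamp[source] > self.stamp[target] for source in sources
        )
        if out_of_date:
            action()
            self.touch(target)
            self.log.append(target)


# Hypothetical project: a figure computed from data, included in a manuscript.
mk = MiniMake()
for source_file in ("data.txt", "plot.py", "paper.tex"):
    mk.touch(source_file)
mk.rule("figure.pdf", ["data.txt", "plot.py"])
mk.rule("paper.pdf", ["paper.tex", "figure.pdf"])

mk.build("paper.pdf")   # first build: figure.pdf, then paper.pdf
mk.touch("paper.tex")   # edit only the manuscript text
mk.build("paper.pdf")   # rebuilds paper.pdf, leaves figure.pdf alone
print(mk.log)
```

After the manuscript text is edited, the second call to build reconstructs only the manuscript, because the figure is still newer than its own sources. Editing data.txt instead would trigger both the figure and the manuscript.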

``Make'' is an extremely useful utility employed by thousands of software development projects. Unfortunately, it is not well designed from the user-experience perspective. ``Make'' employs an obscure and limited special language (a mixture of Unix shell commands and special-purpose commands), which often appears confusing to inexperienced users. According to Peter van der Linden, a software expert from Sun Microsystems (van der Linden, 1994),

``Sendmail'' and ``make'' are two well known programs that are pretty widely regarded as originally being debugged into existence. That's why their command languages are so poorly thought out and difficult to learn. It's not just you - everyone finds them troublesome.

The ``make'' command language is inconvenient not only because of its syntax but also because of its limited capabilities. The reproducible research system developed by Schwab et al. (2000) includes not only custom ``make'' rules but also an obscure and hardly portable agglomeration of shell and Perl scripts that extend ``make'' (Fomel et al., 1997).

Several alternative systems for dependency-checking software construction have been developed in recent years. One of the most promising new tools is SCons, enthusiastically endorsed by Dubois (2003). The initial design of SCons won the Software Carpentry competition sponsored by Los Alamos National Laboratory in 2000 in the category of ``a dependency management tool to replace make''. Some of the main advantages of SCons are:

In this paper, we propose to adopt SCons as a new platform for reproducible research in scientific computing.



2012-07-19