Reproducibility of speedup tests

June 10, 2010 Links No comments

Most of the time, when we talk about reproducibility in computational sciences, we mean reproducibility of numerical results: we expect computational experiments to produce identical results in different execution environments for the same input data.
But that is not always the case. Quite often, the goal of a research endeavor is to design a faster algorithm. Then the result of the experiment is performance information: a demonstration of a speedup over existing algorithms that solve the same problem. Speedup, just like numerical results, should be reproducible in different execution environments.
Sid-Ahmed-Ali Touati, Julien Worms, and Sebastien Briais of INRIA have published an excellent paper on the methodology of reproducible speedup tests.
A part of the introduction from their paper is worth quoting on this blog:

Known hints for making a research result non reproducible
Hard natural sciences such as physics, chemistry and biology impose strict experimental methodologies and rigorous statistical measures in order to guarantee the reproducibility of the results with a measured confidence (probability of error/success). The reproducibility of the experimental results in our community of program optimisation is a weak point. Given a research article, it is in practice impossible or too difficult to reproduce the published performance. If the results are not reproducible, the benefit of publishing becomes limited. We note below some hints that make a research article non-reproducible:

  • Non using of precise scientific languages such as mathematics. Ideally, mathematics must always be preferred to describe ideas, if possible, with an accessible difficulty.
  • Non available software, non released software, non communicated precise data.
  • Not providing formal algorithms or protocols make impossible to reproduce exactly the ideas.
  • Hide many experimental details.
  • Usage of deprecated machines, deprecated OS, exotic environment, etc.
  • Doing wrong statistics with the collected data.

Part of the non-reproducibility (and not all) of the published experiments is explained by the fact that the observed speedups are sometimes rare events. It means that they are far from what we could observe if we redo the experiments multiple times. Even if we take an ideal situation where we use exactly the original experimental machines and software, it is sometimes difficult to reproduce exactly the same performance numbers again and again, experience after experience. Since some published performances numbers represent exceptional events, we believe that if a computer scientist succeeds in reproducing the performance numbers of his colleagues (with a reasonable error ratio), it would be equivalent to what rigorous probabilists and statisticians call a surprise. We argue that it is better to have a lower speedup that can be reproduced in practice, than a rare speedup that can be remarked by accident.

Read the full document for a thorough explanation of how to avoid creating non-reproducible and erroneous speedup tests by using proper scientific techniques.
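The last point in the quoted list, doing proper statistics on the collected data, can be illustrated with a small sketch. The example below is hypothetical (not taken from the paper): instead of reporting the speedup from a single timing run, which may capture a rare event, it repeats each measurement and compares medians, which are robust to outliers.

```python
import statistics
import timeit

def measure_speedup(baseline, candidate, runs=30):
    """Speedup of candidate over baseline, using medians of repeated
    runs rather than a single (possibly exceptional) measurement."""
    t_base = [timeit.timeit(baseline, number=1) for _ in range(runs)]
    t_cand = [timeit.timeit(candidate, number=1) for _ in range(runs)]
    return statistics.median(t_base) / statistics.median(t_cand)

# Toy comparison: an explicit Python loop versus the builtin sum().
def loop_sum():
    total = 0
    for i in range(100000):
        total += i
    return total

def builtin_sum():
    return sum(range(100000))

speedup = measure_speedup(loop_sum, builtin_sum, runs=10)
```

A more rigorous protocol, as the paper discusses, would also attach a confidence level to the reported speedup rather than a bare ratio.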

Event of the year

June 9, 2010 Celebration No comments

Please reserve the date for the Madagascar “event of the year”: the School and Workshop in Houston on July 23-24, 2010. The program details and registration information will follow soon.

Raising scientific standards

June 6, 2010 Links No comments

After returning from the NSF Workshop Archiving Experiments to Raise Scientific Standards, here are some thoughts on reproducible research. Thanks to Dan Gezelter, Dennis Shasha, and others for inspiring discussions.

First of all, it is important to point out that reproducibility is not a goal in itself. There are many situations in which strict computational reproducibility is not achievable. The goal is to expose the scientific communication to skeptical inquiry. A mathematical proof is an example of a scientific communication constructed as a dialogue with a skeptic: someone who might say, “What if your conclusions are not true?” Step by step, a mathematical proof is designed to convince the skeptic that the conclusion (a theorem) has to be true. As for computational results, even the simplest skeptical question, “What if there is a bug in your code?”, cannot be answered unless the software code and every computational step that led to the published result are open for inspection.

If you attend a mathematical conference, you may notice that mathematicians do not usually go through every step of a proof when presenting a theorem; sketching the main idea of the proof is enough. However, the audience understands that the detailed proof must be available in the published work, otherwise the theorem cannot be accepted. Similarly, in a presentation of a computational work, one can simply show the results of a computational experiment. However, such results cannot be accepted as scientific unless the full computation is disclosed for skeptical inquiry. As stated by Dave Donoho (paraphrasing Jon Claerbout),

An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.

If you don’t want to disclose the details of your computation, then the work that you do is not science. As for reproducibility, there seem to be different degrees of it:

  1. Replicability: the ability to reproduce the computation as published
  2. Reusability: the ability to apply a similar algorithm to a different problem or different input data
  3. Reconfigurability: the ability to obtain a similar result when the parameters of the experiment are perturbed deliberately

Some algorithms are perfectly replicable but of limited use, because they are too sensitive to the choice of parameters to be reusable or reconfigurable. Nevertheless, such algorithms deserve a place in the scientific body of knowledge, because they may lead to a discovery or invention of more robust algorithms.
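These degrees can be made concrete with a toy check. In the hypothetical sketch below, a Leibniz-series estimate of pi stands in for a real computational experiment: rerunning the computation unchanged tests replicability, while perturbing a parameter and checking that the result stays close tests reconfigurability.

```python
def estimate_pi(n_terms=100000):
    """A stand-in for a published computational experiment:
    estimate pi by the alternating Leibniz series."""
    return 4.0 * sum((-1.0) ** k / (2 * k + 1) for k in range(n_terms))

# Replicability: the computation as published gives the identical result.
assert estimate_pi() == estimate_pi()

# Reconfigurability: a deliberate perturbation of the experiment's
# parameter should give a similar result, not a wildly different one.
baseline = estimate_pi(100000)
perturbed = estimate_pi(90000)
assert abs(baseline - perturbed) < 1e-4
```

An algorithm that fails the second check while passing the first is exactly the replicable-but-fragile case described above.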

Those who read Italian may enjoy the philosophical article on open software and reproducible research by Alessandro Frigeri and Gisella Speranza: “Eppur si muove”: Software libero e ricerca riproducibile. “Eppur si muove” (“and yet it moves”) are the words attributed to Galileo Galilei, the father of modern science.

PLplot

May 18, 2010 Programs No comments

Vladimir Bashkardin contributes a Vplot plugin for PLplot, an open-source scientific plotting library. Here is an example of generating Vplot files with PLplot using sfplsurf.

Seislet transform

May 14, 2010 Documentation 1 comment

A new paper is added to the collection of reproducible documents:
Seislet transform and seislet frame

Top ranked programs and projects

April 28, 2010 Programs No comments

A previous entry ranked the most popular Madagascar programs by the number of projects they are used in. A different approach to ranking is to use network analysis algorithms. If we declare that two programs are linked when they are used in the same project, then all such links define a network, and we can use the PageRank algorithm devised by Google to find the largest “hubs” in the network. Similarly, two projects are linked if they use the same program, which defines a network and a ranking among projects. The admin/rank.py script does the job. In reverse order, the 10 top ranked programs in Madagascar are:

10. sftransp Transpose two axes in a dataset.

9. sfput Input parameters into a header.

8. sfgraph Graph plot.

7. sfcat Concatenate datasets.

6. sfspike Generate simple data: spikes, boxes, planes, constants.

5. sfadd Add, multiply, or divide RSF datasets.

4. sfdd Convert between different formats.

3. sfmath Mathematical operations on data files.

2. sfwindow Window a portion of a dataset.

1. sfgrey Generate raster plot.

More documentation on these and other programs can be found in the Guide to Madagascar programs. The three top ranked programs in the “generic” category (signal-processing programs applicable to any kind of data) are sfsmooth (Multi-dimensional triangle smoothing), sfnoise (Add random noise to the data), and sfbandpass (Bandpass filtering). The three top ranked programs in the “seismic” category (signal-processing programs applicable to seismic data) are sfricker1 (Convolution with a Ricker wavelet), sfmutter (Muting), and sfsegyread (Convert a SEG-Y or SU dataset to RSF).
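The ranking idea can be sketched in a few lines of Python. This is a simplified illustration with made-up co-usage data, not the actual admin/rank.py script: two programs are linked when they appear in the same project, and PageRank is iterated on the resulting network.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical co-usage data: each project lists the programs it uses.
projects = {
    "proj_a": ["sfgrey", "sfwindow", "sfmath"],
    "proj_b": ["sfgrey", "sfspike", "sfmath"],
    "proj_c": ["sfwindow", "sfcat"],
}

# Build the program network: two programs are linked if they
# are used in the same project.
links = defaultdict(set)
for progs in projects.values():
    for a, b in combinations(set(progs), 2):
        links[a].add(b)
        links[b].add(a)

def pagerank(graph, damping=0.85, iters=50):
    """Plain power-iteration PageRank on an undirected graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] / len(graph[m])
                            for m in nodes if n in graph[m])
            for n in nodes
        }
    return rank

ranks = pagerank(links)
top = sorted(ranks, key=ranks.get, reverse=True)
```

Programs that co-occur with many well-connected programs accumulate rank, which is why heavily reused utilities such as sfgrey rise to the top of the real list.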

The following projects share the top rank in terms of being hubs for programs:

Generalized moveout approximation

March 25, 2010 Documentation No comments

A new paper is added to the collection of reproducible documents:
Generalized nonhyperbolic moveout approximation

Putting life into the blog

March 15, 2010 Uncategorized 2 comments

Adapting slightly from a good article, in order to make it pertinent to the Madagascar blog:

“To really work, a blog has to be about something bigger than his or her company and his or her product. This sounds simple, but it isn’t. It takes real discipline to not talk about yourself and your company. Blogging as a medium seems so personal, and often it is. But when you’re using a blog to promote a project, that blog can’t be about you, Sierra said. It has to be about your readers, who will, it’s hoped, become involved in your project. It has to be about making them awesome. So, for example, if you’re selling a clever attachment to a camera that diffuses harsh flash light, don’t talk about the technical features or about your holiday sale (10 percent off!). Make a list of 10 tips for being a better photographer. If you’re opening a restaurant, don’t blog about your menu. Blog about great food. You’ll attract foodies who don’t care about your restaurant yet.”

Further elaboration on this can be found in Let’s Take This Offline by Joel Spolsky (although the article is not about how to make a good blog; it is a farewell summary by one of the most widely read software bloggers, who is now taking a break from blogging).

I propose expanding the content of the blog to more topics of interest to the target developer and user base of Madagascar: computational sciences in general and geophysics in particular, HPC software architecture, coding tips, life in the industry and academia, and whatever is cool and interesting for the sort of people that we would want as developers and users.
Serendipitously, Sergey’s post on reproducibility is a good start 🙂 What does the community think?

madagascar-0.9.9

February 18, 2010 Celebration No comments

A new stable release of madagascar is another step toward the first non-beta version (madagascar-1.0) anticipated to be released later this year. This release features new reproducible papers and many other improvements. One new feature is the nonseismic package, which contains a subset of the full package for people who do not work with seismic data. The cumulative number of downloads for all stable versions has reached 8,000.

If you’re going to do good science, release the computer code too

February 14, 2010 Links No comments

An article published in The Guardian argues for software openness in science.

.…if you are publishing research articles that use computer programs, if you want to claim that you are engaging in science, the programs are in your possession and you will not release them then I would not regard you as a scientist; I would also regard any papers based on the software as null and void.

For an empirical survey on why scientists do and do not share their software, see The Scientific Method in Practice: Reproducibility in the Computational Sciences by Victoria Stodden.