Madagascar Development Blog

Program of the month: sfgraph

August 9, 2011 Programs 4 comments

sfgraph belongs to the family of plotting programs and is used for plotting explicitly defined 2-D curves.

Here are 10 basic facts about this program:

sfgraph shares most of its parameters with other 2-D plotting programs (sfgrey, sfcontour, sfwiggle). These common parameters are documented by running sfdoc stdplot. The plot in rsf/rsf/sfgraph uses the parameters grid=y gridcol=5 pad=n and some creative font changes in the title.
If the input to sfgraph is real, it is understood as representing a regularly sampled 1-D function $Y(X)$, where $X$ is sampled according to the n1=, o1=, and d1= parameters in the input file.
If the input is complex, its real part is taken as X, and the imaginary part is taken as Y. If the input is real initially, it is easy to turn it into complex by using sfcmplx or sfdd.
If the n2 parameter in the input is greater than 1, multiple curves are plotted. The plot in rsf/rsf/sfmath shows closed curves defined by a complex-valued input.
If n3 or any higher dimension is greater than 1, the plot becomes a movie.
By default, the graphs are plotted with lines. One can control the line appearance with the generic parameters dash=, plotcol=, and plotfat=. The plot in jsg/seislet/sin2 contains dashed lines created with dash=1,2,0.
If symbol= is specified, the graph is plotted with the given symbols. The size of the symbol is controlled with symbolsz=. The plot in sep/precon/oned is created with symbol="md" symbolsz=7.
The displayed function can be changed from $Y(X)$ to $X(Y)$ by using the transp= parameter. The plot in jsg/nmo3/azimuthtest is created with transp=y yreverse=y symbol=+ symbolsz=4.
The ranges of $X$ and $Y$ are selected automatically but can be controlled with min1=, max1=, min2=, max2=. If you want automatic ranges, but no padding around minimum and maximum values, use pad=n.
sfgraph avoids plotting infinite or NaN (not a number) values.
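
The input conventions above can be summarized in a few lines of Python. This is a minimal sketch of the rules as stated, not sfgraph's actual code: the function name curve_points and its arguments are hypothetical, with o1 and d1 standing in for the header parameters of the same names.

```python
import math


def curve_points(samples, o1=0.0, d1=1.0):
    """Return the finite (x, y) points that a curve plot would draw.

    Real samples are read as Y values on the regular grid x = o1 + i*d1.
    Complex samples are read as (real part, imaginary part) = (x, y).
    Points containing NaN or infinity are skipped.
    """
    points = []
    for i, value in enumerate(samples):
        if isinstance(value, complex):
            x, y = value.real, value.imag
        else:
            x, y = o1 + i * d1, float(value)
        if math.isfinite(x) and math.isfinite(y):
            points.append((x, y))
    return points


# Real input: Y(X) on a regular grid, with one NaN sample dropped.
real_curve = curve_points([0.0, 1.0, float("nan"), 9.0], o1=2.0, d1=0.5)

# Complex input: each sample carries its own X and Y.
closed_curve = curve_points([complex(1, 0), complex(0, 1), complex(-1, 0)])
```

How sfgraph treats the gap left by a skipped value (for example, whether the line is broken there) is not covered by this sketch.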

The Madagascar school in Beijing was a blast! Many thanks to Yibo Wang and Yang Liu (x 2) for the excellent organization and tremendous hospitality. Nearly 230 people attended, representing 12 Chinese universities, 5 companies, and the Chinese Academy of Sciences. The school materials are available now on the website.

Running Madagascar in an interactive console

July 18, 2011 Systems 3 comments

The latest versions of IPython, a rich interactive shell for Python, provide a Qt-based console with enhanced GUI controls, such as embedded images, multiline editing, and session sharing. It works nicely with the Python interface to Madagascar. This allows for a Matlab-like or Mathematica-like experience with running Madagascar interactively. This style of computing is not recommended, however, because it may lead to computations that are hard to reproduce.

madagascar-1.2

July 14, 2011 Celebration No comments

The 1.2 release features 7 new reproducible papers, multiple bug fixes, and structural changes in the installation directories. In the first half of this year, according to the SourceForge statistics, stable versions have been downloaded 1,859 times. During the same period, there have been 2,681 read transactions and 491 write transactions in the Subversion repository. According to Ohloh, the estimated project cost has exceeded $5 million.

Program of the month: sfnoise

July 3, 2011 Documentation No comments

Starting with this post, once a month we will blog about one of the popular Madagascar programs. The 30 highest-ranking programs, according to admin/rank.py, are

grey window math dd spike put graph add cat transp scale pad spray smooth stack noise grey3 real dip ricker1 bandpass dots wiggle mask fft1 segyread contour pick mutter fft3

This month’s randomly selected feature is sfnoise.

As stated in the self-documentation, sfnoise is used for generating random noise and adding it to the data. This is useful for generating synthetic datasets or for multiple realizations in inversion. Computer-generated random numbers are often called pseudo-random, because algorithmically generated sequences of numbers are not truly random. This is actually a useful feature if you want to reproduce previous calculations. For reproducibility, it is imperative to use the seed= parameter to set the initial “seed” for the random-number generator. If seed= is not specified, the seed is taken from the computer clock, so different runs of sfnoise will produce different sets of numbers.

Try the following on the command line:

bash$ sfspike n1=5 mag=0 | sfnoise | sfdisfil 
0: -2.455 1.197 -0.145 0.2394 0.7676 
bash$ sfspike n1=5 mag=0 | sfnoise | sfdisfil 
0: 2.203 -0.1106 -0.07494 -0.4916 0.2163

Your sets of numbers will be (most likely) different from the ones above and between the two runs. However, if you run

bash$ sfspike n1=5 mag=0 | sfnoise seed=2011 | sfdisfil 
0: 0.1917 0.3379 -0.9459 0.5841 -0.02078

you should see exactly the same set of numbers as above. You should also get the same numbers if using

bash$ sfspike n1=5 | sfnoise seed=2011 rep=y | sfdisfil 
0: 0.1917 0.3379 -0.9459 0.5841 -0.02078

The rep= parameter controls whether the noise replaces the data or is added to it. The default operation is addition.

bash$ sfspike n1=5 | sfnoise seed=2011 | sfdisfil 
0: 1.192 1.338 0.05412 1.584 0.9792
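
The same seeding logic can be sketched in plain Python, whose standard random module is also built on the Mersenne twister. The function add_noise and its seed and rep arguments are hypothetical stand-ins that mimic the sfnoise parameters; the generated numbers will not match the sfnoise output above.

```python
import random


def add_noise(data, seed=None, rep=False):
    """Add unit-variance normal noise to data, or replace data with it.

    seed=None lets Python pick a seed from the operating system or clock,
    so repeated calls differ; a fixed seed makes the result reproducible.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in data]
    if rep:
        return noise
    return [d + n for d, n in zip(data, noise)]


zeros = [0.0] * 5
ones = [1.0] * 5

run_a = add_noise(zeros, seed=2011)            # noise added to zeros
run_b = add_noise(ones, seed=2011, rep=True)   # noise replaces ones
run_c = add_noise(ones, seed=2011)             # noise added to ones

print(run_a == run_b)  # same seed, same noise values
print([c - b for c, b in zip(run_c, run_b)])   # approximately all ones
```

Fixing the seed is what makes the first two runs identical, mirroring the seed=2011 examples above.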

The pseudo-random number generation algorithm that sfnoise uses is known as the Mersenne twister. It is a powerful algorithm that generates a nearly uniformly distributed sequence with a very long period of $2^{19937}-1$.

Matsumoto, M.; Nishimura, T. (1998). “Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator”. ACM Transactions on Modeling and Computer Simulation 8 (1): 3-30.
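
The exponent in the period follows from the size of the generator state: 624 words of 32 bits each, of which 31 bits are unused. A quick check in Python, whose random module implements the same MT19937 generator (the variable names are illustrative only):

```python
import random

# getstate() returns (version, internal_state, gauss_next); the internal
# state holds the 32-bit words followed by one position index.
_version, internal_state, _gauss_next = random.Random(2011).getstate()
state_words = len(internal_state) - 1

# 624 words of 32 bits, minus the 31 bits the recurrence never uses.
state_bits = state_words * 32 - 31

# The period is a Mersenne number with exactly state_bits binary digits.
period = 2 ** state_bits - 1

print(state_words, state_bits, period.bit_length())
```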

Uniformly distributed numbers can be used to generate random numbers with other distributions. By default, sfnoise uses a normal distribution. It can be changed to a uniform distribution by setting type=n.
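
One standard way to turn uniform numbers into normal ones is the Box-Muller transform. The sketch below illustrates that general technique in Python and makes no claim about which method sfnoise itself uses; the function name box_muller is hypothetical.

```python
import math
import random


def box_muller(rng):
    """Turn two uniform numbers into two independent standard normals."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(2011)  # fixed seed for reproducibility
samples = [z for _ in range(50000) for z in box_muller(rng)]

# Sample statistics should be close to mean 0 and variance 1.
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)
```

Scaling the samples by a standard deviation and adding an offset would then give a normal distribution with any desired mean and variance.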

See rsf/rsf/sfnoise for simple examples of setting the noise distribution parameters (type=, mean=, var=, range=).

See geostats/simulate/rfield and Jim Jennings’s presentation at the Houston-2010 school for more sophisticated examples of using sfnoise, together with FFT-based variogram computation, in geostatistical simulations.

Wide-azimuth angle gathers

June 30, 2011 Documentation No comments

A new paper is added to the collection of reproducible documents:
Wide-azimuth angle gathers for wave-equation migration

Traveltime approximations for TI media

June 28, 2011 Documentation No comments

A new paper is added to the collection of reproducible documents:
Traveltime approximations for transversely isotropic media with an inhomogeneous background

Executable papers

June 26, 2011 Links No comments

In addition to six different workshops and special sessions devoted to reproducible research, an important event of this year is the Executable Paper Grand Challenge organized by Elsevier.

The Grand Challenge was a “contest created to improve the way scientific information is communicated and used”. Many of the participants focused on implementing reproducible research. The winners were announced this month at the International Conference on Computational Science in Singapore, with winning entries, as well as other solutions, published in Procedia Computer Science.

The Madagascar approach to reproducible papers works but is starting to show its age. Perhaps we could learn from others how to make it more modern.

Velocity-independent tau-p moveout

June 26, 2011 Documentation No comments

A new paper is added to the collection of reproducible documents:
Velocity-independent tau-p moveout in a horizontally-layered VTI medium

This paper is the first contribution to Madagascar from Politecnico di Milano, Italy.

What are the design principles of Madagascar?

June 19, 2011 FAQ No comments

The Madagascar code is designed around several fundamental principles.

Modularity. This principle comes from Unix. Doug McIlroy, the inventor of Unix pipes, formulates it as

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Madagascar makes one exception: RSF files are text streams but they point to binary data. This is the simplest way to handle large datasets while preserving the Unix approach.
KISS (Keep it simple, Stupid!). This principle is closely related to modularity. We try to make our tools and formats as simple as possible to achieve the given functionality.
Test-driven development. This principle does not apply literally to scientific programming, because scientific computing is often exploratory: the result of the computational experiment is not always known beforehand. However, once the experiment is completed, it immediately becomes a test for future development, because we expect the results of the experiment to be reproducible. A Madagascar module is not included in the official distribution until there is an example of its usage in reproducible documents.
YAGNI (You ain’t gonna need it!). This principle comes from XP (Extreme Programming). Ron Jeffries, one of the founders of XP, states it as

Always implement things when you actually need them, never when you just foresee that you need them.

Madagascar is not developed for imaginary users. It is developed by people who use it and who add functionality as they need it. This is also known as “scratching a developer’s personal itch”, a feeling familiar to the creators of Unix. As Dennis Ritchie admits in a recent interview,

Apart from doing new and cool stuff, what guided us was really kind of selfish – to write tools we could use ourselves to make our lives easier: “I’d like such-and-such to do such-and-such, and that’s hard to do now. What kind of tool can I write to make that easier?”
