Extending Python interface

August 3, 2008 Systems No comments

Work is under way on extending the Python interface to Madagascar. With new tools, you should be able to use an interactive Python session rather than a Unix shell to run Madagascar modules. Here are some examples:

import m8r as sf
import numpy, pylab

f = sf.spike(n1=1000,k1=300)[0]

# sf.spike is an operator
# f is an RSF file object

f.attr()

# Inspect the file with sfattr

b = sf.bandpass(fhi=2,phase=1)[f]

# b is f filtered through sfbandpass

c = sf.spike(n1=1000,k1=300).bandpass(fhi=2,phase=1)[0]

# c is equivalent to b but created with a pipe

g = c.wiggle(clip=0.02,title='Welcome to Madagascar')

# g is a Vplot file object

g.show()

# Display it on the screen

d = b - c

# Elementary arithmetic operations on files are defined

g = g + d.wiggle(wanttitle=False)

# So are operations on plots

g.show()

# This shows a movie

pylab.show(pylab.plot(b))
c = numpy.clip(b,0,0.01)

# RSF file objects can be passed to pylab or numpy

c.attr()
s = c[300:310]

print(s)

# Taking a slice outputs a numpy array

c = sf.clip(clip=0.01)[b]
c.attr()

# Alternatively, apply sfclip
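To give a feeling for how such syntax can be built, here is a toy sketch in plain Python. It is not the m8r implementation: the Operator class and its command method are hypothetical names made up for illustration, and nothing is executed. Attribute access adds a pipeline stage, keyword arguments become parameters, and square brackets choose the input, with 0 assumed to mean "no input" as in the examples above.

```python
class Operator:
    """Toy stand-in for a Madagascar operator (hypothetical, builds strings only)."""

    def __init__(self, stages=()):
        self.stages = tuple(stages)

    def __getattr__(self, name):
        # Called only for unknown attributes, e.g. op.bandpass
        if name.startswith('_'):
            raise AttributeError(name)

        def make_stage(**params):
            # Turn keyword arguments into "key=value" parameters
            args = ' '.join('%s=%s' % (k, v) for k, v in params.items())
            stage = ('sf' + name + ' ' + args).strip()
            return Operator(self.stages + (stage,))

        return make_stage

    def command(self):
        # Chained stages are joined with a Unix pipe
        return ' | '.join(self.stages)

    def __getitem__(self, source):
        # op[0] means "no input"; op[name] means "read from name"
        if source == 0:
            return self.command()
        return '< %s %s' % (source, self.command())


sf = Operator()
pipe = sf.spike(n1=1000, k1=300).bandpass(fhi=2, phase=1)
print(pipe[0])
# sfspike n1=1000 k1=300 | sfbandpass fhi=2 phase=1
print(sf.bandpass(fhi=2, phase=1)['f.rsf'])
# < f.rsf sfbandpass fhi=2 phase=1
```

The real interface returns file objects rather than strings, but the same three hooks (attribute lookup, call, and indexing) are enough to express a pipe in one Python expression.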

For doing reproducible research, one can also use the new syntax inside SConstruct files, as follows:

from rsf.proj import *
import m8r as sf

Flow('filter',None,sf.spike(n1=1000,k1=300).bandpass(fhi=2,phase=1))
Result('filter',sf.wiggle(clip=0.02,title='Welcome to Madagascar'))

End()

See also a 4-line dot-product test and 20-line conjugate-gradient algorithm.
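The linked scripts are not reproduced here. As a reminder of what a dot-product test checks, below is a generic sketch in plain Python that does not use the m8r interface. It takes causal integration as the operator A and anticausal integration as its adjoint, both chosen only for illustration, and verifies that the dot product of Ax with y equals the dot product of x with A'y for random x and y.

```python
import random


def forward(x):
    """Causal integration: y[i] = x[0] + ... + x[i]."""
    y, total = [], 0.0
    for value in x:
        total += value
        y.append(total)
    return y


def adjoint(y):
    """Anticausal integration: x[i] = y[i] + ... + y[n-1]."""
    x, total = [], 0.0
    for value in reversed(y):
        total += value
        x.append(total)
    return x[::-1]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def dot_test(n=100, seed=2008):
    """Return both sides of the identity <A x, y> = <x, A' y>."""
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    y = [rng.uniform(-1, 1) for _ in range(n)]
    return dot(forward(x), y), dot(x, adjoint(y))


lhs, rhs = dot_test()
# The two numbers should agree to within floating-point round-off
print(abs(lhs - rhs) <= 1e-9 * max(abs(lhs), abs(rhs), 1.0))
```

If the two sides disagree by more than round-off, the pair of routines is not a true operator and adjoint pair, which is exactly what the test is meant to catch before the pair is used in a conjugate-gradient solver.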

The picture shows a screenshot of an interactive session in a SAGE web-based notebook.

Reproducible Research

July 25, 2008 Celebration No comments

A newly created website reproducibleresearch.org collects experiences about reproducible computational research from different groups and scientific fields. Check out the associated blog for updates. The site is maintained by John Cook from M.D. Anderson Cancer Center.

OpendTect

July 15, 2008 Systems No comments

The latest version of OpendTect includes a Madagascar processing flow builder. The screenshot is contributed by Hesam Kazemeini.

Vplotdiff

June 7, 2008 Uncategorized No comments

Joe Dellinger contributes vplotdiff – a program for architecture-independent comparison of Vplot files. This is a major milestone in implementing an automatic testing system and in meeting Greg Wilson’s Grand Challenge for the open-source community.

When converting from SEG-Y to RSF, how do I deal with non-standard trace headers?

May 24, 2008 FAQ No comments

sfsegyread currently limits the number of trace header keys to 71. However, you can remap a non-standard header. For example,

sfsegyread < file.segy tfile=tfile.rsf swdep=188 > file.rsf

will map the header value at 189-193 bytes to the “swdep” standard header. Note that numeration starts at 0 and that the number of bytes in the headers (2 or 4) should match. You can test the output by running

sfheaderattr < tfile.rsf
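To see why the byte offset and the word length both matter, here is a small sketch in plain Python. It is not how sfsegyread is implemented. It assumes a 240-byte trace header with big-endian integers, and read_header_word is a hypothetical helper name. A fake header is filled with a 4-byte value at offset 188 (counting from 0) and then read back, first with the matching length and then with the wrong one.

```python
import struct

TRACE_HEADER_BYTES = 240  # size of a standard SEG-Y trace header


def read_header_word(header, offset, nbytes=4):
    """Read a signed big-endian integer starting at a 0-based byte offset."""
    if len(header) != TRACE_HEADER_BYTES:
        raise ValueError('expected a 240-byte trace header')
    if nbytes not in (2, 4):
        raise ValueError('header words are 2 or 4 bytes long')
    if offset < 0 or offset + nbytes > TRACE_HEADER_BYTES:
        raise ValueError('header word runs past the end of the trace header')
    fmt = '>h' if nbytes == 2 else '>i'
    return struct.unpack_from(fmt, header, offset)[0]


# Fake header: the value 1500 stored in 4 bytes starting at offset 188
header = bytearray(TRACE_HEADER_BYTES)
struct.pack_into('>i', header, 188, 1500)

print(read_header_word(header, 188))            # 1500
print(read_header_word(header, 188, nbytes=2))  # 0, the length does not match
```

Reading only two bytes picks up the high-order half of the 4-byte word, so a small value silently comes back as zero. A result like that in the sfheaderattr output is a hint that the word length of the chosen standard key does not match the non-standard header.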

GPL vs. research confidentiality agreements

May 23, 2008 FAQ No comments

Some of the academic researchers that use Madagascar are sponsored by companies who require a period of confidentiality before deliverables can be made public. During discussions at the workshop, the issue of the interaction between the confidentiality agreement and the GPL was raised.
I am not a lawyer and what follows is not legal advice, but just describes my understanding of the issue.
Let us assume a researcher (called R) who wrote a program P linking with a GPL-ed library G (in our case, G=rsflib). If researcher R distributes the program P to anyone, then he will have to license P under the GPL because it is linking with a GPL-ed library. R is free to use G in any way he wants for his own purposes, and he is not required at all by the GPL to distribute program P. If, however, he chooses to distribute program P, he simply has no choice but to release it under the GPL, regardless of what contracts he may have with others. If he restricts the rights of the users by breaking the GPL, then he loses the right to use library G. It’s that simple. Notice again the starting assumption that researcher R received library G under the GPL. If he negotiated a private agreement with the author of G, and received G under another license (paying for it or not, it does not matter), then that is a different case, not treated by this analysis.
Researcher R, as the copyright holder, has the right to release the code to anyone he pleases, when he pleases, under whatever license he pleases. R is therefore free to release program P only to his sponsor companies Ci (where i is a number between 1 and the number of sponsors) for the first Y years as agreed with the sponsors, then to release it to the general public. In all releases P will be licensed under the GPL.
Now comes the tricky part: Researcher R cannot break the GPL and restrict the freedom of whomever he releases the code to, including companies Ci. This means that any of the companies Ci is free to redistribute the code under the GPL to whomever it sees fit, whenever it sees fit. Should there be any stipulations in the sponsorship contract stating that Ci cannot make public the code of program P during the first Y years, then such stipulations are illegal, and thus null and void. As expected, the GPL is all about maximizing the freedom of the users!
Bottom line, there are no contradictions between the GPL and sponsorship contract stipulations stating that researcher R has to release program P only to his sponsors for the first Y years. Contradictions can appear only if the contract stipulates that the sponsor Ci is restricted from redistributing the program. In reality, once sponsor Ci has received program P under the GPL, it is free to redistribute the program to anybody, as long as it does so under the GPL too.
This situation may get sponsors worried that one of them will “betray” the confidentiality pact. However, such cases are very unlikely in practice, since sponsor Ci, by entering the sponsorship contract, in effect stated that it is in its best interest that the number of companies that get access to P is restricted for the first Y years. If Ci releases program P to a non-sponsor before the lapse of the Y-year term, then it is acting against its own interest. It is conceivable that Ci would release P to a non-competitor N (such as a contractor for optimizing codes or doing associated research). Then N would have the legal right to redistribute program P under the GPL, and would apparently not be constrained by self-interest, since it is not competing with the competitors of Ci. However, N will in practice not do so, since Ci has the power to publicize that N acted against the interests of its patron (even if in a manner that is not punishable by law). Ci can thus ensure that N will lose other clients if it releases P any further. It is hard to conceive what benefit N would have from releasing P that would counterbalance the loss of clients. The chain of plausible transmissions before the end of the first Y years stops here. The only marginal case remaining is the one in which N has a one-off relationship with Ci, or is closing the business and does not want any more clients, and still has a reason to bother to release P. However, this is highly improbable.
Therefore, in practice researcher R can release program P under the GPL to his sponsors Ci only for the first Y years, and to the general public after that, and both R and Ci can rest assured that the program will not be made public before the lapse of the Y-year term. The only assumptions made were that everybody obeys the law and acts in his own self-interest. The only case that allows P to be released before the term is entirely improbable. For all practical purposes, researchers sponsored by industry through typical agreements can use GPL-ed libraries without any problems.

madagascar-0.9.6

May 14, 2008 Celebration No comments

A new stable version is out. In comparison with the previous stable release, madagascar-0.9.6 features multiple structural changes and new reproducible papers.

New insights into one-norm solvers from the Pareto curve

March 31, 2008 Documentation No comments

A new paper has been added to the collection of reproducible papers:

Abstract:
Geophysical inverse problems typically involve a trade off between data misfit and some prior. Pareto curves trace the optimal trade off between these two competing aims. These curves are commonly used in problems with two-norm priors where they are plotted on a log-log scale and are known as L-curves. For other priors, such as the sparsity-promoting one norm, Pareto curves remain relatively unexplored. We show how these curves lead to new insights into one-norm regularization. First, we confirm the theoretical properties of smoothness and convexity of these curves from a stylized and a geophysical example. Second, we exploit these crucial properties to approximate the Pareto curve for a large-scale problem. Third, we show how Pareto curves provide an objective criterion to gauge how different one-norm solvers advance towards the solution.

A journal requires tick labels on my plots to be oriented vertically and aligned on the left. How do I achieve that?

March 26, 2008 FAQ No comments

  • To place tick labels perpendicular to an axis (rather than parallel to it), use parallel#=n, where # is 1, 2, or 3.
  • To control the tick selection manually, use n#tic=, o#num=, and d#num=.
  • To control the label format, use format#= (the argument is a printf-style string).
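For readers less familiar with printf-style strings, the sketch below shows in plain Python how a format string shapes tick labels. It does not call any Madagascar program. The tick_labels helper is hypothetical, and it assumes that ticks are evenly spaced, starting at an origin and advancing by a fixed step, which is how the o#num and d#num parameters are read here.

```python
def tick_labels(origin, step, count, fmt):
    """Format evenly spaced tick values with a printf-style string."""
    return [fmt % (origin + i * step) for i in range(count)]


# Fixed number of decimals: every label has the same width
print(tick_labels(0.0, 0.25, 5, '%4.2f'))
# ['0.00', '0.25', '0.50', '0.75', '1.00']

# Shortest representation: trailing zeros are dropped
print(tick_labels(0.0, 0.25, 5, '%g'))
# ['0', '0.25', '0.5', '0.75', '1']
```

A fixed-width format such as %4.2f is usually the safer choice when labels have to line up along a vertical axis.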

The following example is provided in rsf/rsf/sfgraph:

Can I use Pylab figures in reproducible documents?

March 21, 2008 FAQ No comments

Now you can. See an example, where a figure originally generated with Matlab is replicated with Pylab.

To prepare your figures, follow the rules similar to those for Matlab and Mathematica figures:

  1. Create a directory called Pylab.
  2. Put figure-generating python scripts in this directory.
  3. Each script should have a .py suffix.
  4. Each script should end with a command like
savefig('junk_py.eps');

(the name junk_py.eps is important.)