Confusing terminology for data covariance



Confusion often stems from the mean of the data, $E(\bold d)$.

An experimentalist would naturally believe that the expectation of the data is solely a function of the data and that it can be estimated by averaging data.

A theoretician, on the other hand, takes the expectation of the observational data $E(\bold d)$ to be the theoretical data $\bold F\bold m$, so that $E(\bold d)=\bold F\bold m$ is a function of the model. The theoretician thinks this way because the noise $\bold n=\bold F\bold m-\bold d$ is taken to have zero mean.

Seismological data is highly complex but also highly reproducible. In fields like seismology, the world is deterministic but more complicated than our ability to model. Thus, as a practical matter, the discrepancy between observational data and theoretical data is more realistically attributed to the theoretical data, which is not adequately modeled and computed.

This superficial difference in viewpoint is pushed to a more subtle level by statistics textbooks, which usually define weighting functions in terms of variances instead of spectra. This is particularly confusing with the noise spectrum $(\bold A_n\T \bold A_n)^{-1}$, which is often referred to as the ``data covariance'', defined as $E[(\bold d-E(\bold d))(\bold d-E(\bold d))\T]$. Clearly, the noise spectrum is the same as the data covariance only if we accept the theoretician's definition that $E(\bold d)=\bold F\bold m$.
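
To make the last sentence concrete, here is a short worked identity. It is a sketch under stated assumptions: $\boldsymbol\mu$ is a symbol introduced here (not in the original text) for whatever one takes $E(\bold d)$ to be, and $\bold F\bold m$ is treated as deterministic. Writing $\bold d-\bold F\bold m=(\bold d-\boldsymbol\mu)+(\boldsymbol\mu-\bold F\bold m)$ and noting that the cross terms vanish because $E(\bold d-\boldsymbol\mu)=\bold 0$, we get

$$
E[\bold n\bold n\T]
\;=\; E[(\bold d-\bold F\bold m)(\bold d-\bold F\bold m)\T]
\;=\; E[(\bold d-\boldsymbol\mu)(\bold d-\boldsymbol\mu)\T]
\;+\; (\boldsymbol\mu-\bold F\bold m)(\boldsymbol\mu-\bold F\bold m)\T .
$$

The left side is the second moment of the noise $\bold n=\bold F\bold m-\bold d$; the first term on the right is the textbook ``data covariance''. The two coincide exactly when $\boldsymbol\mu=\bold F\bold m$, which is the theoretician's definition. Otherwise they differ by the outer product of the modeling discrepancy with itself.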

There is no ambiguity and no argument if we drop the word ``variance'' and use the word ``spectrum''. Thus, (1) the ``inverse noise spectrum'' is the appropriate weighting for data-space residuals; and (2) the ``inverse model spectrum'' is the appropriate model-space weighting. Theoretical expositions generally require these spectra to be given as ``prior information.'' In this book we see how, when the model space is a map, we can solve for the ``prior information'' along with everything else.
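
The two weightings named above can be illustrated numerically. The following is a minimal sketch, not the book's implementation. It assumes the fit is posed as minimizing $|\bold A_n(\bold F\bold m-\bold d)|^2+\epsilon^2|\bold A_m\bold m|^2$, where $\bold A_m$ (a model whitener, named by analogy with $\bold A_n$), the scale `eps`, the particular filters, and the function names are all hypothetical choices made for illustration. It checks that stacking the two whitened goals gives the same answer as normal equations weighted by the inverse noise spectrum $\bold A_n\T\bold A_n$ and the inverse model spectrum $\bold A_m\T\bold A_m$.

```python
import numpy as np

def weighted_fit(F, d, A_n, A_m, eps=1.0):
    """Minimize |A_n (F m - d)|^2 + eps^2 |A_m m|^2 by stacking both goals."""
    G = np.vstack([A_n @ F, eps * A_m])
    rhs = np.concatenate([A_n @ d, np.zeros(A_m.shape[0])])
    m, *_ = np.linalg.lstsq(G, rhs, rcond=None)
    return m

def normal_equations_fit(F, d, A_n, A_m, eps=1.0):
    """Same problem, written with the inverse spectra as explicit weights."""
    W_d = A_n.T @ A_n  # inverse noise spectrum: data-space weighting
    W_m = A_m.T @ A_m  # inverse model spectrum: model-space weighting
    return np.linalg.solve(F.T @ W_d @ F + eps**2 * W_m, F.T @ W_d @ d)

# Synthetic example; every number below is an assumption for illustration.
rng = np.random.default_rng(0)
nd, nm = 40, 10
F = rng.standard_normal((nd, nm))
A_n = np.eye(nd) - 0.5 * np.eye(nd, k=-1)  # hypothetical noise whitener
A_m = np.eye(nm) - np.eye(nm, k=-1)        # hypothetical model roughener
m_true = np.cumsum(rng.standard_normal(nm))
d = F @ m_true + 0.1 * rng.standard_normal(nd)

m_a = weighted_fit(F, d, A_n, A_m, eps=0.1)
m_b = normal_equations_fit(F, d, A_n, A_m, eps=0.1)
```

Neither function needs a mean of the data or of the model: only the whiteners $\bold A_n$ and $\bold A_m$ enter, which is why the spectral wording sidesteps the question of what $E(\bold d)$ means.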

The statistical words ``covariance matrix'' are suggestive and appealing, but I propose not to use them because of the ambiguity of $E(\bold d)$. For example, we understand that people who say ``data covariance'' mean the ``multivariate noise spectrum'', but it is unclear what they mean by ``model covariance''. They should mean the ``multivariate model spectrum'', but that implies that $E(\bold m)=\bold 0$, which seems wrong. Avoiding the word ``covariance'' avoids the problem.


2013-07-26