
Nonstationary regression

Nonstationary regression uses a definition similar to equation 1 but allows the coefficients $a_k$ to vary with $\mathbf{x}$. The prediction error becomes
\begin{displaymath}
e(\mathbf{x}) = m(\mathbf{x}) -
\sum_{k=1}^{N} b_k(\mathbf{x})\,s_k(\mathbf{x})\;,
\end{displaymath} (2)

and the problem of its minimization becomes ill-posed, because there can be more unknown variables than constraints. The remedy is to include additional constraints that limit the allowed variability of the estimated coefficients.
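For concreteness, here is a minimal sketch of evaluating the error of equation 2 on a discrete one-dimensional grid, assuming the data $m(\mathbf{x})$, the regressors $s_k(\mathbf{x})$, and trial coefficients $b_k(\mathbf{x})$ are stored as NumPy arrays; the function name and array layout are illustrative, not taken from an existing implementation.

\begin{verbatim}
import numpy as np

def nonstationary_error(m, s, b):
    """Prediction error of equation 2: e(x) = m(x) - sum_k b_k(x) s_k(x).

    m : data, array of shape (n,)
    s : regressors s_k(x), array of shape (N, n)
    b : trial coefficients b_k(x), array of shape (N, n)
    """
    return m - np.sum(b * s, axis=0)
\end{verbatim}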

The classic regularization method is Tikhonov's regularization (Tikhonov, 1963; Engl et al., 1996), which amounts to minimization of the following functional:

\begin{displaymath}
F[\mathbf{b}] = \Vert e(\mathbf{x})\Vert^2 +
\epsilon^2\,\sum_{k=1}^{N} \Vert\mathbf{D}\left[b_k(\mathbf{x})\right]\Vert^2\;,
\end{displaymath} (3)

where $\mathbf{D}$ is the regularization operator (such as the gradient or Laplacian filter) and $\epsilon$ is a scalar regularization parameter. If $\mathbf{D}$ is a linear operator, least-squares estimation reduces to linear inversion
\begin{displaymath}
\mathbf{b} = \mathbf{A}^{-1}\,\mathbf{d}\;,
\end{displaymath} (4)

where $\mathbf{b} = \left[\begin{array}{cccc}b_1(\mathbf{x}) & b_2(\mathbf{x}) & \cdots & b_N(\mathbf{x})\end{array}\right]^T$,
$\mathbf{d} = \left[\begin{array}{cccc}s_1(\mathbf{x})\,m(\mathbf{x}) & s_2(\mathbf{x})\,m(\mathbf{x}) & \cdots & s_N(\mathbf{x})\,m(\mathbf{x})\end{array}\right]^T$, and the elements of matrix $\mathbf{A}$ are
\begin{displaymath}
A_{ij}(\mathbf{x}) = s_i(\mathbf{x})\,s_j(\mathbf{x}) + \epsilon^2\,\delta_{ij}\,\mathbf{D}^T\,\mathbf{D}\;.
\end{displaymath} (5)
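The following is a minimal sketch of this Tikhonov-regularized estimation on a one-dimensional grid, assuming the array layout of the previous sketch, taking $\mathbf{D}$ to be a first-difference (roughening) operator, and solving the stacked least-squares system with SciPy's sparse LSQR rather than forming $\mathbf{A}^{-1}$ explicitly; all names are illustrative.

\begin{verbatim}
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def tikhonov_nonstationary(m, s, eps):
    """Minimize ||m - sum_k b_k s_k||^2 + eps^2 sum_k ||D b_k||^2 (equation 3)."""
    N, n = s.shape
    # Forward operator: a row of diagonal blocks diag(s_k(x)).
    G = sp.hstack([sp.diags(s[k]) for k in range(N)])
    # Roughening operator D: first difference applied to each b_k.
    d1 = sp.diags([np.ones(n - 1), -np.ones(n - 1)], [1, 0], shape=(n - 1, n))
    D = sp.block_diag([d1] * N)
    # Stack the fitting and regularization goals and solve in the
    # least-squares sense.
    A = sp.vstack([G, eps * D]).tocsr()
    rhs = np.concatenate([m, np.zeros(D.shape[0])])
    b = spla.lsqr(A, rhs)[0]
    return b.reshape(N, n)
\end{verbatim}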

Shaping regularization (Fomel, 2007b) formulates the problem differently. Instead of specifying a penalty (roughening) operator $\mathbf{D}$, one specifies a shaping (smoothing) operator $\mathbf{S}$. The regularized inversion takes the form

\begin{displaymath}
\mathbf{b} = \widehat{\mathbf{A}}^{-1}\,\widehat{\mathbf{d}}\;,
\end{displaymath} (6)

where

\begin{displaymath}
\widehat{\mathbf{d}} = \left[\begin{array}{cccc}\mathbf{S}\left[s_1(\mathbf{x})\,m(\mathbf{x})\right] &
\mathbf{S}\left[s_2(\mathbf{x})\,m(\mathbf{x})\right] & \cdots &
\mathbf{S}\left[s_N(\mathbf{x})\,m(\mathbf{x})\right]\end{array}\right]^T\;,
\end{displaymath}

the elements of matrix $\widehat{\mathbf{A}}$ are
\begin{displaymath}
\widehat{A}_{ij}(\mathbf{x}) = \lambda^2\,\delta_{ij} +
\mathbf{S}\left[s_i(\mathbf{x})\,s_j(\mathbf{x}) -
\lambda^2\,\delta_{ij}\right]\;,
\end{displaymath} (7)

and $\lambda$ is a scaling coefficient. The main advantage of this approach is the relative ease of controlling the selection of $\lambda$ and $\mathbf{S}$ in comparison with $\epsilon$ and $\mathbf{D}$. In all examples of this paper, I define $\mathbf{S}$ as Gaussian smoothing with an adjustable radius and choose $\lambda$ to be the median value of $s_i(\mathbf{x})$. As demonstrated in the next section, matrix $\widehat{\mathbf{A}}$ is typically much better conditioned than matrix $\mathbf{A}$, which leads to fast inversion with iterative algorithms.
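A minimal matrix-free sketch of equations 6 and 7 in one dimension follows, assuming Gaussian smoothing for $\mathbf{S}$, the median absolute value of the regressors for $\lambda$ (a slight variation on the median value mentioned above), and SciPy's GMRES as the iterative solver; the function name, the solver choice, and the array layout are assumptions for illustration, not the implementation used in this paper.

\begin{verbatim}
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.sparse.linalg import LinearOperator, gmres

def shaping_nonstationary(m, s, radius=20.0):
    """Shaping-regularized estimate of b_k(x) (equations 6-7), 1-D sketch."""
    N, n = s.shape
    lam = np.median(np.abs(s))

    def smooth(f):
        # Shaping operator S: Gaussian smoothing with an adjustable radius.
        return gaussian_filter1d(f, sigma=radius)

    def apply_Ahat(bvec):
        # (A-hat b)_i = lam^2 b_i + S[s_i sum_j s_j b_j - lam^2 b_i]
        b = bvec.reshape(N, n)
        fit = np.sum(s * b, axis=0)
        out = np.empty_like(b)
        for i in range(N):
            out[i] = lam**2 * b[i] + smooth(s[i] * fit - lam**2 * b[i])
        return out.ravel()

    Ahat = LinearOperator((N * n, N * n), matvec=apply_Ahat)
    dhat = np.array([smooth(s[i] * m) for i in range(N)]).ravel()
    b, _ = gmres(Ahat, dhat, atol=1e-8)
    return b.reshape(N, n)
\end{verbatim}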

In the case of $N=1$ (regularized division of two signals), a similar construction was applied before to define local seismic attributes (Fomel, 2007a).
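As a usage illustration under the same assumptions, the $N=1$ case of the sketch above amounts to a smooth division of two signals:

\begin{verbatim}
import numpy as np

# Synthetic test: a slowly varying amplitude applied to an oscillatory signal.
x = np.linspace(0.0, 1.0, 1000)
ratio = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)   # smooth "attribute" to recover
s = np.cos(40.0 * np.pi * x)[np.newaxis, :]   # single regressor s_1(x)
m = ratio * s[0]                              # data m(x) = ratio(x) * s_1(x)

# b[0] approximates ratio(x); shaping_nonstationary is the sketch above.
b = shaping_nonstationary(m, s, radius=50.0)
\end{verbatim}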

