Optimal weighting for rank reduction

The observed data $\mathbf{M}$ can be expressed as:

$\displaystyle \mathbf{M}=\mathbf{S}+\mathbf{N},$ (9)

where $\mathbf{S}$ and $\mathbf{N}$ denote the signal and noise components. It is worth noting that for the derivation convenience, we assume the noise component $\mathbf{N}$ to be white. Although not exactly correct for seismic data, the denoising model also works for seismic data with band-limited noise. In the following analysis, we assume $\mathbf{M}$ and $\mathbf{N}$ have full rank and $\mathbf{S}$ has deficient rank. $\mathbf{M}$, $\mathbf{S}$ and $\mathbf{N}$ are all of size $I\times J$.

The singular value decomposition (SVD) of $\mathbf{S}$ can be represented as:

$\displaystyle \mathbf{S} = [\mathbf{U}_1^{S}\quad \mathbf{U}_2^{S}]\left[\begin...
...egin{array}{c}
(\mathbf{V}_1^{S})^H\\
(\mathbf{V}_2^{S})^H
\end{array}\right].$ (10)

Because of the deficient rank, the matrix $\mathbf{S}$ can be written as:

$\displaystyle \mathbf{S}=\mathbf{U}_1^S\boldsymbol{\Sigma}_1^S(\mathbf{V}_1^S)^H.$ (11)

The singular value decomposition (SVD) of $\mathbf{M}$ can be represented as:

$\displaystyle \mathbf{M} = [\mathbf{U}_1^{M}\quad \mathbf{U}_2^{M}]\left[\begin...
...egin{array}{c}
(\mathbf{V}_1^{M})^H\\
(\mathbf{V}_2^{M})^H
\end{array}\right].$ (12)

The rank-reduction method by TSVD refers to

$\displaystyle \tilde{\mathbf{M}}=\mathbf{U}_1^M\boldsymbol{\Sigma}_1^M(\mathbf{V}_1^M)^H.$ (13)

Here, $\tilde{\mathbf{M}}$ is the estimated signal component via TSVD. We assume the rank of $\mathbf{M}$ is $N$, and thus the size of $\mathbf{U}_1^M$ is $I\times N$. $\boldsymbol{\Sigma}_1^M$ is of size $N\times N$. $\mathbf{V}_1^M$ is of size $J\times N$.

However, the $\tilde{\mathbf{M}}$ is still a mixture of the signal and noise subspaces. Combining equations 9, 10, and 11, it is easy to derive that (Chen et al., 2016c)

$\displaystyle \tilde{\mathbf{M}}=\mathbf{S}+\mathbf{U}_1^S(\mathbf{U}_1^S)^H\mathbf{N},$ (14)

where we can see that the estimated signal component via TSVD is still corrupted by the noise component, which is the projection of the noise component to the signal component.

This problem can be partially alleviated by a singular value thresholding step:

$\displaystyle \tilde{\tilde{\mathbf{M}}}=\mathbf{U}_1^M\tilde{\boldsymbol{\Sigma}}_1^M(\mathbf{V}_1^M)^H,$ (15)

where $\tilde{\boldsymbol{\Sigma}}_1^M$ is the thresholded singular values such that:

$\displaystyle \tilde{\boldsymbol{\Sigma}}_1^M = \mathbf{T}(\boldsymbol{\Sigma}_1^M,\tau).$ (16)

where $\mathbf{T}$ denotes a singular value thresholding operator and $\tau$ denotes the threshold. However, defining an optimal threshold is inconvenient and sometimes even difficult. It is because the noise component contributes differently to each singular value and a constant threshold is not plausible to deal with the inhomogeneous noise distribution.

Thus, in this paper, we propose an adaptive weighting algorithm to optimally define the singular values in order to best reconstruct the signal component. We introduce a weighting operator $\mathbf{W}$ to adjust the $\boldsymbol{\Sigma}_1^M$ after applying SVD to the observed noisy signal. To calculate the optimal weighting operator is equivalent to solving the following optimization problem:

$\displaystyle \hat{\mathbf{W}} = \arg \min_{\mathbf{W}} \parallel \mathbf{U}_1^...
... \mathbf{U}_1^M\mathbf{W}\boldsymbol{\Sigma}_1^M(\mathbf{V}_1^M)^H \parallel_F.$ (17)

By an optimally weighted combination of estimated left and right singular vectors, the optimization problem 17 can yield the optimally adjusted singular values to obtain the closest low-rank estimates of signal component.

The optimal solution for the optimization problem was given in Benaych-Georges and Nadakuditi (2012) and Nadakuditi (2013):

$\displaystyle \hat{\mathbf{W}} =$   diag$\displaystyle (\hat{w}_1,\hat{w}_2,\cdots,\hat{w}_N),$ (18)

where

$\displaystyle \hat{w}_i=\left(-\frac{2}{\sigma_i^M}\frac{D(\sigma_i^M;\boldsymbol{\Sigma})}{D'(\sigma_i^M;\boldsymbol{\Sigma})}\right).$ (19)

$\sigma_i^M$ denotes the $i$th diagonal entry of $\boldsymbol{\Sigma}_1^M$ ( $\boldsymbol{\Sigma}_1^M\in\mathbb{R}^{N\times N}$). $D$ denotes the D-transform:

\begin{displaymath}\begin{split}
D(\sigma;\boldsymbol{\Sigma})&= \frac{1}{N} \te...
...hbf{I}-\boldsymbol{\Sigma}^2)^{-1}\right)\right]^2
\end{split},\end{displaymath} (20)

where $D'$ is the derivative of $D$ with respect to $\sigma$, and Tr$(\cdot)$ is the trace of the input:

\begin{displaymath}\begin{split}
D'(\sigma;\boldsymbol{\Sigma})&= 2\left[\frac{1...
...athbf{I}-\boldsymbol{\Sigma}^2)^{-2}\right)\right].
\end{split}\end{displaymath} (21)

Using the optimally estimated weighting operator expressed in equation 18, we can expressed the optimally estimated signal component as:

$\displaystyle \hat{\mathbf{M}}=\mathbf{U}_1^M\hat{\mathbf{W}}\boldsymbol{\Sigma}_1^M(\mathbf{V}_1^M)^H.$ (22)

The optimal weighting strategy of the singular values is a substitute to directly truncating the singular values as used in the traditional rank-reduction method. An early investigation of the strategy to improve the rank-reduction performance in seismic data denoising and reconstruction is presented in Aharchaou et al. (2017). Besides, there are also a number of alternatives to these two approaches (weighting and truncating) such as automatic rank determination (Trickett, 2015; Gavish and Donoho, 2014) and randomized approaches such as the randomized SVD and randomized QR factorizations (Cheng and Sacchi, 2014).


2020-12-06