Optimally damped rank-reduction method

A shortcoming of the traditional rank-eduction and damped rank-reduction methods is that the output perfomance is quite sensitive to the input parameter

. The rank parameter varies greatly for different datasets, especially for field datasets. In order to alleviate the influence of the rank parameter, we take advantage of an adaptive singular-value weighting algorithm developed by Benaych-Georges and Nadakuditi (2012) and Nadakuditi (2013). An adaptive singular-value weighting matrix can be obtained by solving the following problem:

$\displaystyle \hat{\boldsymbol{\omega}}=\arg\min_{\boldsymbol{\omega}} \paralle... ...i^S(\mathbf{v}_i^S)^H - \omega_i\sigma_i\mathbf{u}_i\mathbf{v}_i^H \parallel_F,$

(11)

where $\mathbf{u}_i^S$ and $\mathbf{v}_i^S$ denote

th left and right singular vectors corresponding to the estimated signal component, $\sigma_i$ denotes the

th singular value in $\boldsymbol{\Sigma}$ . $\boldsymbol{\omega}$ is the adaptive weight vector, which can be used to construct the adaptive weighting matrix $\hat{\mathbf{W}}=$ diag $(\hat{\boldsymbol{\omega}})$ . The solution to the optimization problem can be obtained as:

$\displaystyle \hat{w}_i=\left(-\frac{2}{\sigma_i}\frac{\mathcal{D}(\sigma_i;\boldsymbol{\Sigma})}{\mathcal{D}'(\sigma_i;\boldsymbol{\Sigma})}\right).$

(12)

where $\mathcal{D}$ represents a transform expressed as:

$\displaystyle \mathcal{D}(\sigma;\boldsymbol{\Sigma})= \frac{1}{N^2}$ Tr $\displaystyle \left(\sigma(\sigma^2\mathbf{I}-\boldsymbol{\Sigma}\boldsymbol{\Sigma}^H)^{-1}\right)$ Tr $\displaystyle \left(\sigma(\sigma^2\mathbf{I}-\boldsymbol{\Sigma}^H\boldsymbol{\Sigma})^{-1}\right),$

(13)

and $\mathcal{D}'$ denotes its derivative. The expression of $\mathcal{D}'$ can be expressed as:

$\begin{displaymath}\begin{split} D'(\sigma;\boldsymbol{\Sigma})&= 2\left[\frac{1... ...athbf{I}-\boldsymbol{\Sigma}^2)^{-2}\right)\right]. \end{split}\end{displaymath}$

(14)

The symbol Tr $(\cdot)$ denotes the trace of the input. The trace of a square matrix $\mathbf{X}$ is defined to be the sum of elements on the main diagonal of $\mathbf{X}$ .

Tr $\displaystyle (\mathbf{X}) = \sum_{i=1}^{N}X_{i,i},$

(15)

where $X_{i,i}$ denotes the main diagonal elements of $\mathbf{X}$ .

The adaptive weighting operator can be applied to make the traditional rank-reduction method adaptive:

$\displaystyle \hat{\mathbf{S}}=\mathbf{U}_N\hat{\mathbf{W}}\boldsymbol{\Sigma}_N\mathbf{V}_N^H.$

(16)

To incorporate the adaptive weighting operator into the damped rank-reduction framework, we can introduce intermediate variables as

$\displaystyle \mathbf{U}_N^{Q} = \mathbf{U}_N,$	(17)
$\displaystyle \boldsymbol{\Sigma}_N^{Q} = \hat{\mathbf{W}}\boldsymbol{\Sigma}_N,$	(18)
$\displaystyle \mathbf{V}_N^{Q} = \mathbf{V}_N,$	(19)

then equation 16 turns into

$\displaystyle \hat{\mathbf{S}}=\mathbf{U}_N^{Q}\boldsymbol{\Sigma}_N^{Q}\mathbf{V}_N^{Q}.$

(20)

It can be derived that the damping formula (equation 8) also holds for equation 20 but with the damping threshold matrix expressed as

$\displaystyle \boldsymbol{\Gamma} = \hat{\sigma}^K\left(\boldsymbol{\Sigma_N^Q}\right)^{-K},$

(21)

where the subscript

denotes a sufficiently large rank parameter, as required by the derivations detailed in Chen et al. (2019b). The resulted final form of the optimally damped rank-reduction method then can be expressed as

$\displaystyle \hat{\mathbf{S}} = \mathbf{U}_N\mathbf{P}\hat{\mathbf{W}}\boldsymbol{\Sigma}_N\mathbf{V}_N^H.$

(22)

The algorithm workflow for the optimally damped rank-reduction method (ODRR) is outlined in Algorithm 3. Due to the existence of a weighting matrix in the proposed method, it is more convenient to choose the input rank parameter

empirically in practice. As mentioned previously, the rank parameter

is in practice set to a relatively large value to forestall damaging the signal. But in the proposed algorithm, even if a large

is used, the algorithm can adaptively shrink the singular-values. Thus, its performance is not sensitive to the input rank parameter, in contrast to other related approaches (Oropeza and Sacchi, 2011). Because of the insensitivity, one can use a sufficiently large

in processing complicated datasets without leaving strong residual noise in the result. The convenience in tuning parameters makes the rank-reduction related methods more computationally feasible in large-scale data processing, e.g., the 5D reconstruction problem (Chen et al., 2019b), since one no longer needs to tune the parameters many times while one trial is already computationally demanding. A glossary describing the main mathematical notations are presented in Table 1. The three methods are all variations of how the singular-values are thresholded. Simply speaking, the DRR method improves the RR method by introducing a damping operation and the ODRR method improves DRR method by further introducing a weighting operation.

**Table 1:** Glossary of main mathematical notations used in this paper.
Symbols	Meanings
$\mathbf{D}(t/w,x,y)$	3D noisy seismic data
$\mathbf{D}(w_0)$ or $\mathbf{D}$	abbreviated notation of 2D frequency slice
$\mathbf{R}_i$	th Hankel matrix
$\mathbf{M}$	block Hankel matrix
$\mathbf{S}$	signal component
$\hat{\mathbf{S}}$	estimated signal component
$\mathbf{U}$	left singular vector matrix
$\boldsymbol{\Sigma}$	singular-value matrix
$\mathbf{V}$	right singular vector matrix
$\hat{\mathbf{W}}$	weighting matrix
$\mathbf{P}$	damping matrix
$\boldsymbol{\Gamma}$	damping threshold matrix
$\mathbf{Q}$	temporary variable
$\mathcal{D}$	D-transform
$\mathcal{H}$	Hankelization operator
$\mathcal{T}$	thresholding operator
$\mathcal{A}$	averaging operator
	rank
	damping factor

**Table 2:** Comparison of SNRs in dB for different rank-reduction based methods. RR denotes the rank-reduction method. DRR denotes the damped rank-reduction method. ODRR denotes the optimally damped rank-reduction method.
Tests	Noisy (dB)	RR (dB)	DRR (dB)	ODRR (dB)
Linear synthetic (N=3)	-8.39	6.57	10.29	11.27
Linear synthetic (N=6)	-8.39	3.45	8.35	10.86
Hyperbolic synthetic (N=10)	-2.17	8.27	9.58	9.65
Hyperbolic synthetic (N=20)	-2.17	7.04	10.08	11.00

**Table 3:** Comparison of computational costs in seconds for different rank-reduction based methods.
Tests	FK (s)	RR (s)	DRR (s)	ODRR (s)
Linear synthetic (N=3)	0.13	2.27	2.45	2.43
Linear synthetic (N=6)	0.17	3.02	3.04	3.17
Hyperbolic synthetic (N=10)	0.43	1.69	1.71	1.83
Hyperbolic synthetic (N=20)	0.49	2.75	2.69	2.83

2020-12-06