Random noise can seriously degrade the stability and precision of seismic data processing and imaging steps, including inversion-based migration, full waveform inversion, AVO inversion, and post-stack seismic interpretation. Its removal is therefore an important processing step (Li et al., 2020; Amani et al., 2017; Galbraith, 1991; Zhao et al., 2018).

Sparse-transform-based denoising methods assume that seismic data are sparse in a transform domain, whereas random noise spreads across that domain and can therefore be suppressed by a thresholding operation. These methods include those based on the Fourier transform (Bracewell and Bracewell, 1986), Radon transform (Beylkin, 1987), seislet transform (Chen and Fomel, 2018; Fomel and Liu, 2010), curvelet transform (Candès et al., 2006), wavelet transform (Mousavi et al., 2016; Gilles, 2013), dreamlet transform (Huang et al., 2018), and dictionary-learning-based sparse transforms (Siahsar et al., 2017b,a; Zu et al., 2019; Zhou et al., 2016). Prediction-based methods are another popular group of denoising methods, including the time-space domain prediction method (Abma and Claerbout, 1995), frequency-space predictive filtering (Canales, 1984), the regularized non-stationary prediction method (Liu et al., 2012; Liu and Chen, 2013), and the polynomial fitting method (Liu et al., 2011). Decomposition-based denoising methods exploit the separability of seismic signal and random noise and attempt to extract useful information from the principal components of the noisy data. Typical methods include the empirical-mode decomposition (EMD) related methods (Huang et al., 1998), e.g., the ensemble EMD (Wu and Huang, 2009), complete ensemble empirical-mode decomposition (Colominas et al., 2012), and improved complete ensemble empirical-mode decomposition (Colominas et al., 2012), as well as singular-value decomposition (SVD) related methods (Bekara and van der Baan, 2007) and non-stationary decomposition with regularization (Li et al., 2018).

In this paper, we aim to improve on the multi-dimensional Cadzow filter applied to constant-frequency slices (Cadzow, 1988; Trickett, 2008), also referred to as multichannel singular spectrum analysis (MSSA) (Oropeza and Sacchi, 2011; Qiao et al., 2016; Chiu, 2013). This filter has been widely adopted for seismic data processing due to its good performance (Gao et al., 2017; Ginolhac et al., 2013; Wang et al., 2018). The algorithm is based on low-rank (LR) matrix approximation. The main requirement of LR methods is that the frequency-domain Hankel matrix has low rank. For data composed of linear events, the rank of the Hankel matrix equals the number of distinct dips (Wang et al., 2020; Chen et al., 2016; Oropeza and Sacchi, 2011). However, real seismic data are complicated and generally do not satisfy the linear-events assumption. To apply LR methods, one therefore needs to divide the field data into small time-space windows that are processed separately (Zu et al., 2017; Zhang et al., 2017). Local windowing, however, introduces another problem: if a fixed rank is used for all local windows, the rank may be too large for some windows (so that the LR approximation retains too much noise) and too small for others (so that the method loses useful signal). Thus, to optimize the denoising performance, it is desirable to find an appropriate rank for each local window. The rank can, for example, be selected adaptively according to the ratio of two consecutive singular values (Wu and Bai, 2018). However, such strategies work only when the data structure is not complex and may not be applicable when the noise is extremely strong.
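
The relation between linear events and the rank of the Hankel matrix can be illustrated with a minimal NumPy sketch. It is not the implementation used in this paper: the test signal, the frequency, trace spacing, slowness values, noise level, and the near-square Hankel shape are all illustrative assumptions, and `estimate_rank` is only a simplified stand-in for a consecutive-singular-value ratio criterion, with a hypothetical `max_rank` search limit.

```python
import numpy as np

def frequency_slice(n_traces, slownesses, amplitudes, freq=20.0, dx=10.0):
    """One constant-frequency slice of linear events (hypothetical test signal)."""
    x = np.arange(n_traces) * dx
    d = np.zeros(n_traces, dtype=complex)
    for a, p in zip(amplitudes, slownesses):
        # A linear event is a phase shift that grows linearly with offset.
        d += a * np.exp(-2j * np.pi * freq * p * x)
    return d

def hankel_matrix(d):
    """Embed a 1-D slice of length n into a near-square L x (n - L + 1) Hankel matrix."""
    n = d.size
    L = n // 2 + 1
    idx = np.arange(L)[:, None] + np.arange(n - L + 1)[None, :]
    return d[idx]

def average_antidiagonals(H):
    """Map a matrix back to a 1-D slice by averaging along its anti-diagonals."""
    L, M = H.shape
    idx = np.arange(L)[:, None] + np.arange(M)[None, :]
    sums = np.zeros(L + M - 1, dtype=H.dtype)
    np.add.at(sums, idx, H)
    counts = np.bincount(idx.ravel(), minlength=L + M - 1)
    return sums / counts

def estimate_rank(s, max_rank):
    """Pick the rank with the largest ratio of consecutive singular values."""
    ratios = s[:max_rank] / np.maximum(s[1:max_rank + 1], np.finfo(float).tiny)
    return int(np.argmax(ratios)) + 1

def cadzow_filter(d, rank):
    """Truncated-SVD (rank-reduction) filter for a single frequency slice."""
    U, s, Vh = np.linalg.svd(hankel_matrix(d), full_matrices=False)
    H_lr = (U[:, :rank] * s[:rank]) @ Vh[:rank]
    return average_antidiagonals(H_lr)

rng = np.random.default_rng(0)
clean = frequency_slice(40, slownesses=[0.0, 2e-4, -3e-4], amplitudes=[1.0, 0.8, 0.6])
noisy = clean + 0.05 * (rng.standard_normal(40) + 1j * rng.standard_normal(40))

s = np.linalg.svd(hankel_matrix(noisy), compute_uv=False)
rank = estimate_rank(s, max_rank=10)
denoised = cadzow_filter(noisy, rank)
print("estimated rank:", rank)
print("error before:", np.linalg.norm(noisy - clean))
print("error after: ", np.linalg.norm(denoised - clean))
```

In a windowed workflow, a filter of this kind would be applied to every frequency slice of every local window, which is where the choice of a single fixed rank becomes problematic.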

In practice, the predefined rank is usually set large enough to preserve the useful signal without damaging weak and curved events. Selecting a large rank, however, can leave significant residual noise in the filtered data. One possible solution is to use a threshold to further suppress the residual noise after the rank-reduction step. This second-step thresholding can be understood conveniently in the framework of nuclear norm minimization (Zhou and Zhang, 2017). This strategy raises another challenging question: how to optimally choose the threshold for damping the residual noise. Considering that the thresholding step can be interpreted as a re-weighting of the singular values, we apply an adaptive singular-value weighting method following Nadakuditi (2013). This weighting method shrinks the singular values adaptively, in contrast to the direct truncation strategy. Aharchaou et al. (2017) introduced this adaptive weighting method to seismic data reconstruction problems. Alternatives to the approach presented in this paper include the automated methods of Gavish and Donoho (2014) and Trickett (2015). One can further improve the optimal-weighting-based rank-reduction method by cascading the weighting strategy into the damped rank-reduction framework. The resulting algorithm is referred to as the optimal damped rank-reduction (ODRR) method. It has the potential to make the damped rank-reduction method effective over a wide range of rank selections in an adaptive way. We use several synthetic and field seismic datasets to show the advantages of the presented algorithm.
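
The interpretation of thresholding as a re-weighting of the singular values can be sketched as follows. This is a generic illustration only, neither the adaptive weighting of Nadakuditi (2013) nor the ODRR method: it uses plain soft thresholding (the proximal operator of the nuclear norm) after truncation, which is equivalent to weights w_k = max(1 - tau / s_k, 0). The function name `truncate_and_shrink`, the synthetic matrix, the over-estimated rank, and the choice of `tau` as the first discarded singular value are all assumptions made for the example.

```python
import numpy as np

def truncate_and_shrink(H, rank, tau):
    """Rank truncation followed by soft thresholding of the kept singular values.

    Returns the filtered matrix and the equivalent per-singular-value weights.
    With tau = 0 this reduces to plain truncated SVD (all weights equal to 1).
    """
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    s_kept = s[:rank]
    # Soft thresholding max(s - tau, 0), written as a weight on each singular value.
    weights = np.maximum(1.0 - tau / np.maximum(s_kept, np.finfo(float).tiny), 0.0)
    H_hat = (U[:, :rank] * (weights * s_kept)) @ Vh[:rank]
    return H_hat, weights

rng = np.random.default_rng(1)
# Hypothetical rank-3 "signal" matrix plus random noise.
signal = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 30))
noisy = signal + 0.3 * rng.standard_normal((30, 30))

rank = 8                                  # deliberately larger than the true rank
s = np.linalg.svd(noisy, compute_uv=False)
tau = s[rank]                             # assumed noise-level proxy, not from the paper

truncated, _ = truncate_and_shrink(noisy, rank, tau=0.0)
shrunk, weights = truncate_and_shrink(noisy, rank, tau=tau)

print("weights:", np.round(weights, 3))
print("error, truncation only:     ", np.linalg.norm(truncated - signal))
print("error, truncation + shrink: ", np.linalg.norm(shrunk - signal))
```

The weights stay close to 1 for the dominant singular values and fall toward 0 for the components just above the threshold, which is the sense in which shrinkage is a softer alternative to direct truncation. How well a fixed `tau` performs depends on the data, which is the motivation for choosing the weights adaptively.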