
Double sparsity dictionary

Sparse representation for a given signal $ \mathbf{d}$ can be formulated as the following optimization problem (Ophir et al., 2011; Rubinstein et al., 2010):

$\displaystyle \hat{\mathbf{m}} = \arg\min_{\mathbf{m}} \parallel \mathbf{m} \parallel_0, \quad \textrm{s.t.} \quad \parallel \mathbf{d}-\mathbf{A}\mathbf{m} \parallel_2 \le \epsilon.$ (1)

Here, $ \mathbf{m}$ is the sparse representation of the observed signal $ \mathbf{d}$ , $ \mathbf{A}$ is the sparsity-promoting transform (or dictionary), and $ \hat{\mathbf{m}}$ is the estimated sparse representation. $ \parallel \cdot \parallel_0$ and $ \parallel \cdot \parallel_2$ denote the $ L_0$ norm and $ L_2$ norm of a vector, respectively, and $ \epsilon$ is the error tolerance. The double sparsity model states that the sparsity-promoting transform $ \mathbf{A}$ can be implemented as a cascaded combination of two dictionaries:
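Problem 1 is commonly attacked with a greedy pursuit. As an illustration only (the paper does not prescribe a particular solver), the sketch below implements a basic orthogonal matching pursuit in NumPy: atoms of $ \mathbf{A}$ are selected one at a time by correlation with the residual, and the coefficients are re-fit by least squares until the residual norm drops below $ \epsilon$ .

```python
import numpy as np

def omp(A, d, eps, max_atoms=None):
    """Greedy orthogonal matching pursuit for the sparse-coding problem
    min ||m||_0  s.t.  ||d - A m||_2 <= eps  (equation 1)."""
    n_atoms = A.shape[1]
    if max_atoms is None:
        max_atoms = n_atoms
    support = []                      # indices of selected atoms
    coeffs = np.zeros(0)
    residual = d.copy()
    m = np.zeros(n_atoms)
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        # pick the atom most correlated with the current residual
        correlations = np.abs(A.T @ residual)
        correlations[support] = 0.0   # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # re-fit all selected coefficients jointly by least squares
        coeffs, *_ = np.linalg.lstsq(A[:, support], d, rcond=None)
        residual = d - A[:, support] @ coeffs
    m[support] = coeffs
    return m
```

For an orthonormal dictionary this recovers a sparse signal exactly; for an overcomplete $ \mathbf{A}$ it is only a heuristic for the NP-hard $ L_0$ problem, which is the standard trade-off made by pursuit algorithms.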

$\displaystyle \mathbf{A} = \mathbf{BW},$ (2)

where $ \mathbf{B}$ is a base analytic transform and $ \mathbf{W}$ is an adaptive learning-based dictionary. The base dictionary can be any fixed-basis sparsity-promoting dictionary, while $ \mathbf{W}$ is learned adaptively over the compact representation provided by $ \mathbf{B}$ .
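A minimal sketch of equation 2, under illustrative assumptions not taken from the paper: $ \mathbf{B}$ is chosen here as an orthonormal DCT-II basis (a typical fixed analytic transform), and $ \mathbf{W}$ as a random matrix with sparse, unit-norm columns standing in for a learned dictionary. The effective dictionary is simply their product.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix; columns are atoms.
    A stand-in for the fixed base transform B."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] *= 1.0 / np.sqrt(2.0)
    return np.sqrt(2.0 / n) * C.T

n = 16
B = dct_matrix(n)                 # fixed analytic base dictionary
rng = np.random.default_rng(1)

# Hypothetical adaptive dictionary W: sparse unit-norm columns,
# mimicking the structure that dictionary learning would produce.
W = np.zeros((n, n))
for j in range(n):
    idx = rng.choice(n, size=3, replace=False)
    W[idx, j] = rng.standard_normal(3)
    W[:, j] /= np.linalg.norm(W[:, j])

A = B @ W                         # cascaded dictionary, equation 2
```

Because $ \mathbf{B}$ is orthonormal, the atoms of $ \mathbf{A}$ inherit the unit norm of the columns of $ \mathbf{W}$ , and keeping $ \mathbf{W}$ sparse is what makes storing and applying the learned part cheap.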

Compared with a fixed-basis dictionary, the double sparsity model defined in equation 2 provides more adaptability by modifying $ \mathbf{W}$ , which acts as an extension of the initial base dictionary $ \mathbf{B}$ . Compared with a purely learning-based dictionary, the double sparsity model can be more efficient and stable because the data to be learned already have a relatively compact representation; the initial base transform thus acts as a structural regularizer on the learning process. The double-sparsity transform-domain coefficients $ \mathbf{m}$ exhibit two levels of sparsity: the base transform provides the first level and the learned dictionary provides the second.

In this respect, the philosophy of our work is also similar to that of bandlets (LePennec and Mallat, 2005): rather than seeking a direct transform that achieves the ultimate sparsification of the input signals, one can first use an existing transform that does reasonably well, and then add another layer of processing that squeezes out more sparsity from the already simplified signals (Ophir et al., 2011). In the next two sections, we introduce two models to learn the double sparsity dictionary (DSD) for seismic data: synthesis-based DSD and analysis-based DSD.


2016-02-27