
IN SEARCH OF THE MINIMUM

We are looking for the solution of the linear operator equation

\begin{displaymath}
{\bf d = A m}\;,
\end{displaymath} (1)

where ${\bf m}$ is the unknown model in the linear model space, ${\bf d}$ stands for the given data, and ${\bf A}$ is the forward modeling operator. The data vector ${\bf d}$ belongs to a Hilbert space with a defined norm and dot product. The solution is constructed by iterative steps in the model space, starting from an initial guess ${\bf m}_0$. Thus, at the $n$-th iteration, the current model ${\bf m}_n$ is found by the recursive relation
\begin{displaymath}
{\bf m}_n = {\bf m}_{n-1} + \alpha_n {\bf s}_n\;,
\end{displaymath} (2)

where ${\bf s}_n$ denotes the step direction, and $\alpha_n$ stands for the scaling coefficient. The residual at the $n$-th iteration is defined by
\begin{displaymath}
{\bf r}_n = {\bf d - A m}_{n}\;.
\end{displaymath} (3)

Substituting (2) into (3) leads to the equation
\begin{displaymath}
{\bf r}_n = {\bf r}_{n-1} - \alpha_n {\bf A s}_n\;.
\end{displaymath} (4)
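
To make the framework concrete, here is a minimal NumPy sketch of the recursion (2)-(4); the names A, d, m0, niter, and the next_step routine that supplies ${\bf s}_n$ and $\alpha_n$ are illustrative assumptions, not part of the original text.

\begin{verbatim}
import numpy as np

def iterate(A, d, m0, next_step, niter):
    """Generic iteration of equations (2)-(4).

    next_step(A, r) is a hypothetical routine returning a step
    direction s_n and a scaling alpha_n; how they are chosen is
    method-specific.
    """
    m = m0.copy()
    r = d - A @ m                    # initial residual, equation (3)
    for _ in range(niter):
        s, alpha = next_step(A, r)   # method-specific step and scaling
        m = m + alpha * s            # model update, equation (2)
        r = r - alpha * (A @ s)      # residual update, equation (4)
    return m, r
\end{verbatim}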

For a given step ${\bf s}_n$, we can choose $\alpha_n$ to minimize the squared norm of the residual
\begin{displaymath}
\Vert{\bf r}_n\Vert^2 = \Vert{\bf r}_{n-1}\Vert^2 -
2 \alpha_n \left({\bf r}_{n-1},{\bf A s}_n\right) +
\alpha_n^2 \Vert{\bf A s}_n\Vert^2\;.
\end{displaymath} (5)

The parentheses denote the dot product, and $\Vert{\bf x}\Vert=\sqrt{({\bf x, x})}$ denotes the norm of ${\bf x}$ in the corresponding Hilbert space. The optimal value of $\alpha_n$ follows from equation (5) by setting the derivative with respect to $\alpha_n$ to zero:
\begin{displaymath}
\alpha_n = {{\left({\bf r}_{n-1}, {\bf A s}_n\right)} \over
{\Vert{\bf A s}_n\Vert^2}}\;.
\end{displaymath} (6)
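
Assuming real-valued vectors and the Euclidean dot product in place of the abstract Hilbert-space inner product, formula (6) reduces to a one-line computation. The sketch below is illustrative; the function name and arguments are hypothetical.

\begin{verbatim}
import numpy as np

def optimal_alpha(A, r_prev, s):
    """Equation (6): alpha_n = (r_{n-1}, A s_n) / ||A s_n||^2."""
    As = A @ s
    return np.dot(r_prev, As) / np.dot(As, As)
\end{verbatim}

In the earlier sketch, this is the scaling that the next_step routine would return together with its chosen direction ${\bf s}_n$.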

Two important conclusions immediately follow from this fact. First, substituting the value of $\alpha_n$ from formula (6) into equation (4) and taking the dot product of both sides with ${\bf A s}_n$, we can conclude that
\begin{displaymath}
\left({\bf r}_n, {\bf A s}_n\right) = 0\;,
\end{displaymath} (7)

which means that the new residual is orthogonal to the corresponding step in the residual space. This situation is schematically shown in Figure 1. Second, substituting formula (6) into (5), we can conclude that the squared norm of the residual decreases according to
\begin{displaymath}
\Vert{\bf r}_n\Vert^2 = \Vert{\bf r}_{n-1}\Vert^2 -
{{\left({\bf r}_{n-1}, {\bf A s}_n\right)^2} \over
{\Vert{\bf A s}_n\Vert^2}}\;,
\end{displaymath} (8)

(``Pythagoras's theorem''), unless ${\bf r}_{n-1}$ and ${\bf A s}_n$ are orthogonal. These two conclusions are the basic features of optimization by the method of steepest descent. They will help us define an improved search direction at each iteration.
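
Both conclusions can be checked numerically. The toy example below (a random matrix and vectors; all names are hypothetical and not from the original) verifies the orthogonality (7) and the decrease (8) up to rounding error.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # toy forward operator
r_prev = rng.standard_normal(5)          # residual r_{n-1}
s = rng.standard_normal(3)               # some step direction s_n

As = A @ s
alpha = np.dot(r_prev, As) / np.dot(As, As)   # equation (6)
r_new = r_prev - alpha * As                   # equation (4)

# Equation (7): the new residual is orthogonal to A s_n.
assert abs(np.dot(r_new, As)) < 1e-9
# Equation (8): the squared norm drops by (r_{n-1}, A s_n)^2 / ||A s_n||^2.
drop = np.dot(r_prev, As) ** 2 / np.dot(As, As)
assert np.isclose(np.dot(r_new, r_new), np.dot(r_prev, r_prev) - drop)
\end{verbatim}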

Figure 1. Geometry of the residual in the data space (a scheme).

