Published as Geophysics, 84(1), F1-F15, (2019)

CuQ-RTM: A CUDA-based code package for stable and efficient Q-compensated reverse time migration

Yufeng Wang, Hui Zhou, Xuebin Zhao, Qingchen Zhang, Poru Zhao, Xiance Yu, and Yangkang Chen

State Key Laboratory of Petroleum Resources and Prospecting
China University of Petroleum
Fuxue Road 18th
Beijing, China, 102200
Research Center for Computational and Exploration Geophysics
State Key Laboratory of Geodesy and Earth’s Dynamics
Institute of Geodesy and Geophysics
Chinese Academy of Sciences
Wuhan, Hubei Province, China, 430077
China National Petroleum Corporation
Beijing, China.
School of Earth Sciences
Zhejiang University
Hangzhou, Zhejiang Province, China, 310027


Reverse time migration (RTM) in attenuating media should take absorption and dispersion effects into account. The recently proposed viscoacoustic wave equation with decoupled fractional Laplacians (DFLs) allows amplitude compensation and phase correction to be handled separately in $Q$-compensated RTM ($Q$-RTM). However, the intensive computation and enormous storage requirements of $Q$-RTM have kept it from practical application, especially in large-scale 2D or 3D cases. Graphics processing unit (GPU) computing, built around a scalable array of multithreaded Streaming Multiprocessors (SMs), presents an opportunity to greatly accelerate $Q$-RTM by appropriately exploiting the GPU's architectural characteristics. We present cu$Q$-RTM, a CUDA-based code package that implements $Q$-RTM with a set of stable and efficient strategies, such as streamed CUFFT, checkpointing-assisted time-reversal reconstruction (CATRC), and adaptive stabilization. cu$Q$-RTM can run in a multi-level parallelism (MLP) fashion, either synchronously or asynchronously, to take advantage of all available CPUs and GPUs while maintaining stability and flexibility. We outline the architecture of the cu$Q$-RTM code package and some of its program optimization schemes. The speedup ratio on a single GeForce GTX760 GPU card relative to a single core of an Intel Core i5-4460 CPU can exceed 80 in large-scale simulations. The strong scaling property of multi-GPU parallelism is demonstrated by performing $Q$-RTM on a Marmousi model using one to six GPUs. Finally, we verify the feasibility and efficiency of cu$Q$-RTM on a field data set. The “living” package is available from GitHub at, and peer-reviewed code related to this article can be found at
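To illustrate the general idea behind checkpointing-assisted time-reversal reconstruction, the following minimal Python sketch rebuilds a forward wavefield backward in time from its last two snapshots, resetting to stored checkpoints along the way. It is not the cu$Q$-RTM implementation. It assumes a 1D damped wave equation $u_{tt} + \sigma u_t = c^2 u_{xx}$ as a simple stand-in for the DFL viscoacoustic equation, fixed end points rather than absorbing boundaries, and hypothetical names (`step`, `forward`, `reconstruct`, `ckpt_every`) chosen only for this example.

```python
import numpy as np

def laplacian(u, dx):
    """Second-order 1D Laplacian; the two end points stay fixed at zero."""
    lap = np.zeros_like(u)
    lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    return lap

def step(u_mid, u_far, direction, c, dt, dx, sigma):
    """One leapfrog step of u_tt + sigma*u_t = c^2 u_xx (assumed toy model).

    direction=+1: u_mid=u^n, u_far=u^(n-1), returns u^(n+1).
    direction=-1: u_mid=u^n, u_far=u^(n+1), returns u^(n-1).
    """
    a = 0.5 * sigma * dt * direction
    rhs = 2.0 * u_mid - (1.0 - a) * u_far + (c * dt) ** 2 * laplacian(u_mid, dx)
    return rhs / (1.0 + a)

def forward(u0, u1, nt, ckpt_every, **kw):
    """Propagate to step nt, saving a sparse set of checkpoints.

    `history` holds every state and exists only to verify the reconstruction.
    """
    history, checkpoints = [u0, u1], {}
    prev, cur = u0, u1
    for n in range(1, nt):
        if n % ckpt_every == 0:
            checkpoints[n] = (prev.copy(), cur.copy())  # (u^(n-1), u^n)
        prev, cur = cur, step(cur, prev, +1, **kw)
        history.append(cur)
    return history, checkpoints

def reconstruct(u_last, u_second_last, nt, checkpoints, **kw):
    """Rebuild states nt-2, ..., 0 in reverse time order."""
    nxt, cur = u_last, u_second_last  # u^nt and u^(nt-1)
    rebuilt = {nt: nxt, nt - 1: cur}
    for n in range(nt - 1, 0, -1):
        if n in checkpoints:
            # Reset to stored states so accumulated error does not carry over.
            prev, cur = checkpoints[n]
            rebuilt[n] = cur
        else:
            prev = step(cur, nxt, -1, **kw)
        rebuilt[n - 1] = prev
        nxt, cur = cur, prev
    return rebuilt

# Toy setup (all values are illustrative assumptions).
params = dict(c=1.0, dt=0.5, dx=1.0, sigma=0.02)
x = np.arange(201, dtype=float)
u0 = np.exp(-((x - 100.0) / 5.0) ** 2)
u0[0] = u0[-1] = 0.0
u1 = u0.copy()  # zero initial velocity, first-order start
nt = 400

history, checkpoints = forward(u0, u1, nt, ckpt_every=50, **params)
rebuilt = reconstruct(history[nt], history[nt - 1], nt, checkpoints, **params)
err_ckpt = max(np.abs(rebuilt[n] - history[n]).max() for n in range(nt + 1))
print("max reconstruction error:", err_ckpt)
```

In this toy setup the reconstruction needs 16 stored snapshots (7 checkpoint pairs plus the final two states) instead of all 401 forward states. A practical RTM code with absorbing boundaries would also have to save boundary values, which this sketch omits.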