Introduction

Efficiently calculating 3D elastic wavefields and data with algorithms capable of handling large-scale acquisition geometries (e.g., wide-azimuth surveys) and/or complex anisotropic media [e.g., horizontal transverse isotropy (HTI) or orthorhombic symmetries] remains a significant computational challenge. While a number of commercially available modeling packages satisfy these requirements, they are generally aimed at users of high-performance computing (HPC) facilities and require dedicated cluster computing resources that remain too expensive for smaller-scale research and development groups. For these reasons, there is a strong impetus to develop freely available, open-source 3D elastic modeling solutions that run efficiently without the need for significant computing infrastructure.

Within the last half decade there has been a significant increase of interest within the exploration geophysics community in using general-purpose graphics processing units (GPUs) as accelerators for key seismic modeling, imaging and inversion kernels [e.g., Kuzma et al. (2007); Morton et al. (2008); Foltinek et al. (2009); Ohmer et al. (2005)]. Owing to their wider memory bandwidth and their far greater number of processing cores (often two orders of magnitude more than a central processing unit (CPU), albeit individually slower and lighter-weight), GPUs have emerged as an excellent parallel computing platform for problems characterized by a single-instruction multiple-data (SIMD) pattern. Because many thousands of GPU threads can run concurrently, significant speedups of SIMD-type problems on GPUs relative to CPUs have been documented in numerous studies across many branches of applied computer science (Nguyen, 2007; Pharr and Fernando, 2005).
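
To illustrate the SIMD execution model referred to above, the following minimal CUDA sketch (not taken from the released codes) assigns one GPU thread to each sample of a wavefield array and launches many thousands of threads in a single kernel call; the array size and the scaling operation are arbitrary choices made purely for illustration.

    // Minimal CUDA sketch of the SIMD pattern: one thread per wavefield sample.
    // Illustrative only; not part of the released ewefd codes.
    #include <cuda_runtime.h>

    __global__ void scale_wavefield(float *u, float alpha, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   /* global thread index */
        if (i < n) u[i] *= alpha;                        /* one sample per thread */
    }

    int main(void)
    {
        const int n = 1 << 24;                 /* ~16.8 million samples (arbitrary) */
        float *d_u;
        cudaMalloc(&d_u, n * sizeof(float));
        cudaMemset(d_u, 0, n * sizeof(float));

        int block = 256;                       /* threads per block */
        int grid  = (n + block - 1) / block;   /* enough blocks to cover the array */
        scale_wavefield<<<grid, block>>>(d_u, 0.5f, n);
        cudaDeviceSynchronize();

        cudaFree(d_u);
        return 0;
    }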

For finite-difference time-domain (FDTD) solutions of the wave equations (WEs) that form the basis of many seismic modeling, migration and velocity inversion applications, a number of studies have developed compact wave-equation FD stencils and algorithmic strategies well suited to GPU implementation. Micikevicius (2009) and Abdelkhalek et al. (2009) discuss GPU implementations of the 3D acoustic WE. Komatitsch et al. (2010) discuss a GPU-based finite-element formulation of 3D anisotropic elastic wave propagation. Nakata et al. (2011) present results for solving the 3D isotropic elastic WE on multiple GPUs. These studies report GPU runtimes of roughly one-tenth to one-twentieth of those of the corresponding multi-core CPU-based implementations.
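
As a hedged example of the kind of compact FD stencil these studies exploit, the kernel below advances a 3D constant-density acoustic wavefield with a second-order-in-time, second-order-in-space scheme on a uniform grid (equal spacing in all directions is assumed). It is a simplified illustration only; the elastic ewefd kernels use higher-order stencils and further optimizations such as shared-memory tiling (Micikevicius, 2009). Host-side allocation and kernel launch follow the same pattern as the previous sketch.

    /* Illustrative 3D acoustic FDTD update: u2 = 2*u1 - u0 + (c*dt)^2 * Laplacian(u1).
       Sketch only; the array layout and argument names are assumptions. */
    __global__ void acoustic_step(const float *u0, const float *u1, float *u2,
                                  const float *vel, float dt, float dx,
                                  int nx, int ny, int nz)
    {
        int ix = blockIdx.x * blockDim.x + threadIdx.x;
        int iy = blockIdx.y * blockDim.y + threadIdx.y;
        int iz = blockIdx.z * blockDim.z + threadIdx.z;
        if (ix < 1 || iy < 1 || iz < 1 || ix >= nx-1 || iy >= ny-1 || iz >= nz-1)
            return;                                   /* skip the outermost layer */

        int idx = (iz * ny + iy) * nx + ix;           /* flattened 3D index */
        float lap = (u1[idx-1]     + u1[idx+1]        /* d2u/dx2 */
                   + u1[idx-nx]    + u1[idx+nx]       /* d2u/dy2 */
                   + u1[idx-nx*ny] + u1[idx+nx*ny]    /* d2u/dz2 */
                   - 6.0f * u1[idx]) / (dx * dx);

        float c = vel[idx];
        u2[idx] = 2.0f * u1[idx] - u0[idx] + c * c * dt * dt * lap;
    }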

When aiming to compute seismic data and/or wavefields for realistic 3D model sizes (e.g., $N^3=1000^3$, where $N$ is the number of samples in each dimension), though, the relatively small global memory of an individual GPU card compared with that addressable by a multi-core CPU ($\le6$ GBytes versus $\gg6$ GBytes, respectively) makes single-device GPU solutions of the 3D elastic WE intractable for realistic industry-sized models. This issue is compounded for 3D anisotropic media, because the additional stiffness components (or, equivalently, anisotropy parameters) must also be held in memory. Fortunately, it can be addressed by parallel computing strategies that use domain decomposition to divide the computation across multiple GPU devices working in concert through a communication protocol (Nakata et al., 2011; Micikevicius, 2009).
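
A back-of-envelope estimate makes the memory constraint concrete. The snippet below assumes single precision and, purely for illustration, twelve persistent 3D arrays (wavefield components at two time levels plus stiffness coefficients; the exact count depends on the formulation): at $1000^3$ samples, one array alone occupies roughly 3.7 GBytes, so the full set far exceeds the global memory of any single GPU card.

    /* Back-of-envelope GPU memory estimate for an N^3 single-precision grid.
       The number of 3D arrays (narray = 12) is an assumption for illustration. */
    #include <stdio.h>

    int main(void)
    {
        const double n      = 1000.0;   /* samples per dimension */
        const double bytes  = 4.0;      /* single precision */
        const int    narray = 12;       /* assumed number of persistent 3D arrays */

        double one_array_GB = n * n * n * bytes / (1024.0 * 1024.0 * 1024.0);
        printf("one array : %6.2f GBytes\n", one_array_GB);                   /* ~3.7  */
        printf("%2d arrays : %6.2f GBytes\n", narray, narray * one_array_GB); /* ~44.7 */
        return 0;
    }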

Starting with the ewefd2d and ewefd3d modeling codes that are freely available in the archives of the Madagascar project (Fomel, 2012), we develop 2D/3D GPU-based elastic wavefield modeling codes using NVIDIA's CUDA application programming interface (API). We adopt a domain-decomposition strategy and present two different protocols for communication between GPU devices. For individual nodes containing multiple GPUs (herein termed a consolidated node), we use direct peer-to-peer (P2P) communication, which allows GPU devices situated on the same PCIe bus to communicate without intermediate data staging in system memory. In a distributed computing environment, where GPUs do not share a common PCIe bus, the P2P strategy does not work and we must turn to the comparatively slower message-passing interface (MPI) to handle inter-device communication.
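
The sketch below outlines the two communication paths for exchanging halo (ghost-zone) slices between two subdomains; the buffer names and halo size are hypothetical, and the released codes may structure the exchange differently. On a consolidated node, cudaDeviceEnablePeerAccess and cudaMemcpyPeer move the slices directly across the shared PCIe bus; in a distributed setting the slices are first staged in host memory and then exchanged with MPI (the MPI calls are omitted here for brevity).

    /* Hedged sketch of a two-GPU halo exchange. The d_send/d_recv buffer names
       and nhalo are hypothetical, not identifiers from the released ewefd codes. */
    #include <cuda_runtime.h>
    #include <stdlib.h>

    void exchange_halos(float *d_send0, float *d_recv0,   /* buffers on device 0 */
                        float *d_send1, float *d_recv1,   /* buffers on device 1 */
                        size_t nhalo)
    {
        int can01 = 0, can10 = 0;
        cudaDeviceCanAccessPeer(&can01, 0, 1);
        cudaDeviceCanAccessPeer(&can10, 1, 0);

        if (can01 && can10) {
            /* Consolidated node: direct P2P copies over the shared PCIe bus. */
            cudaSetDevice(0); cudaDeviceEnablePeerAccess(1, 0);
            cudaSetDevice(1); cudaDeviceEnablePeerAccess(0, 0);

            cudaMemcpyPeer(d_recv1, 1, d_send0, 0, nhalo * sizeof(float));
            cudaMemcpyPeer(d_recv0, 0, d_send1, 1, nhalo * sizeof(float));
        } else {
            /* Distributed setting: stage through host memory; across nodes the
               staged buffer would be exchanged via MPI (e.g., MPI_Sendrecv, omitted). */
            float *h_tmp = (float*) malloc(nhalo * sizeof(float));
            cudaSetDevice(0);
            cudaMemcpy(h_tmp, d_send0, nhalo * sizeof(float), cudaMemcpyDeviceToHost);
            cudaSetDevice(1);
            cudaMemcpy(d_recv1, h_tmp, nhalo * sizeof(float), cudaMemcpyHostToDevice);
            /* ...and similarly in the opposite direction. */
            free(h_tmp);
        }
    }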

Our goals in communicating the results of our modeling efforts, and the code itself, are twofold. The first is to present a 3D FDTD elastic modeling code capable of handling various transversely isotropic (TI) symmetries that scales to computational domains large enough to model realistic 3D acquisition geometries, without requiring massive CPU clusters to complete modeling runs in a ``reasonable'' amount of time. The second is to release a set of open-source GPU-based modeling codes and reproducible examples through the Madagascar project, both for educational purposes and to facilitate innovation and collaboration throughout the geophysics community.

We begin by discussing the governing equations for 3D elastic-wave propagation in the stress-stiffness formulation and presenting the discretization approach adopted for a regular computational mesh. We then describe the numerical algorithm and discuss a number of issues regarding the GPU implementation strategy, including domain decomposition and how we target multiple devices within consolidated and distributed computing environments. Finally, we provide a number of reproducible 2D/3D modeling examples for different TI media and present GPU-versus-CPU runtime and speedup metrics that demonstrate the utility of the GPU-based modeling approach.

