k-Wave
A MATLAB toolbox for the time-domain
simulation of acoustic wave fields
- Getting Started
- Examples
- Initial Value Problems
- Example: Homogenous Propagation Medium
- Example: Using A Binary Sensor Mask
- Example: Defining A Sensor Mask By Opposing Corners
- Example: Loading External Image Maps
- Example: Heterogeneous Propagation Medium
- Example: Saving Movie Files
- Example: Recording The Particle Velocity
- Example: Defining A Gaussian Sensor Frequency Response
- Example: Comparison Of Modelling Functions
- Example: Setting An Initial Pressure Gradient
- Example: Simulations In One Dimension
- Example: Simulations In Three Dimensions
- Example: Photoacoustic Waveforms in 1D, 2D and 3D
- Time Varying Source Problems
- Example: Monopole Point Source In A Homogeneous Propagation Medium
- Example: Dipole Point Source In A Homogeneous Propagation Medium
- Example: Simulating Transducer Field Patterns
- Example: Steering A Linear Array
- Example: Snell's Law And Critical Angle Reflection
- Example: The Doppler Effect
- Example: Diffraction Through A Slit
- Example: Simulations In Three-Dimensions
- Sensor Directivity
- Example: Focussed Detector in 2D
- Example: Focussed Detector in 3D
- Example: Modelling Sensor Directivity in 2D
- Example: Modelling Sensor Directivity in 3D
- Example: Sensor Element Directivity in 2D
- Example: Focussed 2D Array with Directional Elements
- Photoacoustic Image Reconstruction
- Example: 2D FFT Reconstruction For A Line Sensor
- Example: 3D FFT Reconstruction For A Planar Sensor
- Example: 2D Time Reversal For A Line Sensor
- Example: 2D Time Reversal For A Circular Sensor
- Example: 3D Time Reversal For A Planar Sensor
- Example: 3D Time Reversal For A Spherical Sensor
- Example: Image Reconstruction With Directional Sensors
- Example: Image Reconstruction With Bandlimited Sensors
- Example: Iterative Image Improvement Using Time Reversal
- Example: Attenuation Compensation Using Time Reversal
- Example: Attenuation Compensation Using Time Variant Filtering
- Example: Automatic Sound Speed Selection
- Diagnostic Ultrasound Simulation
- Example: Defining An Ultrasound Transducer
- Example: Simulating Ultrasound Beam Patterns
- Example: Using An Ultrasound Transducer As A Sensor
- Example: Simulating B-mode Ultrasound Images
- Example: Simulating B-mode Images Using A Phased Array
- Numerical Analysis
- Example: Controlling The Absorbing Boundary Layer
- Example: Source Smoothing
- Example: Filtering A Delta Function Input Signal
- Example: Modelling Power Law Absorption
- Example: Modelling Nonlinear Wave Propagation
- Example: Optimising k-Wave Performance
- Using The C++ Code
- Elastic Wave Propagation
- Example: Explosive Source In A Layered Medium
- Example: Plane Wave Absorption
- Example: Shear Waves And Critical Angle Reflection
- Example: Simulations In Three Dimensions
- Functions - By Category
- Functions - Alphabetical List
- Release Notes
- License
k-Wave Toolbox |
Optimising k-Wave Performance Example
On this page… |
---|
Overview
This example demonstrates how to increase the computational performance of k-Wave using optional input parameters and data casting. A separate standardised benchmarking script benchmark
is also included within the k-Wave toolbox to allow computational times to be compared across different computers and GPUs.
Controlling input options
To investigate where the computational effort is spent during a k-Wave simulation, it is useful to use the inbuilt MATLAB profiler which examines the execution times for the various k-Wave and inbuilt functions. Running the profiler on a typical forward simulation using kspaceFirstOrder2D
with a Cartesian sensor mask and no optional inputs gives the following command line output (set example_number = 1
within the example m-file):
Running k-Wave simulation... start time: 25-Aug-2014 17:28:20 reference sound speed: 1500m/s dt: 3.9062ns, t_end: 9.4258us, time steps: 2414 input grid size: 512 by 512 grid points (10 by 10mm) maximum supported frequency: 38.4MHz smoothing p0 distribution... calculating Delaunay triangulation... precomputation completed in 2.7405s starting time loop... estimated simulation time 2min 2.4576s... simulation completed in 1min 55.4231s total computation time 1min 58.2665s
The corresponding profiler output is given below.
Aside from computations within the parent functions, it is clear the majority of the time is spent running ifft2
and fft2
. These are the fast Fourier transforms (FFTs) used to calculate spatial gradients as part of the k-space pseudospectral method used in k-Wave. Several seconds are also spent computing the Delaunay triangulation used for calculating the pressure over the Cartesian sensor mask using interpolation. The triangulation is calculated once during the precomputations and this time is encapsulated within the precomputation time printed to the command line. The Delaunay triangulation can be avoided by setting the optional input 'CartInterp'
to 'nearest'
, or by using a binary sensor mask. A Cartesian sensor mask can be converted to a binary mask using cart2grid
as shown below.
% convert Cartesian sensor mask to binary mask sensor.mask = cart2grid(kgrid, sensor.mask);
Several seconds are also spent running the various functions associated with the animated visualisation (imagesc
, newplot
, cla
, etc). The visualisation can be switched off by setting the optional input 'PlotSim'
to false
. Re-running the profile with these two changes gives the following command line output (set example_number = 2
within the example m-file):
Running k-Wave simulation... start time: 25-Aug-2014 17:45:15 reference sound speed: 1500m/s dt: 3.9062ns, t_end: 9.4258us, time steps: 2414 input grid size: 512 by 512 grid points (10 by 10mm) maximum supported frequency: 38.4MHz smoothing p0 distribution... precomputation completed in 0.75205s starting time loop... estimated simulation time 1min 44.8867s... simulation completed in 1min 41.0669s total computation time 1min 41.8285s
The precomputation time has been reduced, and the loop computation time has also been reduced by several seconds. The corresponding profiler output is given below.
Data casting
Even after the modifications above, the majority of the computational time is still spent computing the FFT and the point-wise multiplication of large matrices (within the function kspaceFirstOrder2D
). It is possible to decrease this burden by capitalising on MATLAB's use of overloaded functions for different data types. For example, computing an FFT of a matrix of single
type takes less time than for double
(the standard data format used within MATLAB). For almost all simulations, the loss in precision as a result of performing calculations in single
type is negligible, especially as the perfectly matched layer (PML) is only accurate to a few decimal points at best. Within kspaceFirstOrder1D
, kspaceFirstOrder2D
, and kspaceFirstOrder3D
, the data type used for the variables within the time loop can be controlled via the optional input parameter 'DataCast'
. Re-running the profile with 'DataCast'
set to 'single'
gives the following command line output (set example_number = 3
within the example m-file):
Running k-Wave simulation... start time: 25-Aug-2014 17:48:00 reference sound speed: 1500m/s dt: 3.9062ns, t_end: 9.4258us, time steps: 2414 input grid size: 512 by 512 grid points (10 by 10mm) maximum supported frequency: 38.4MHz smoothing p0 distribution... casting variables to single type... precomputation completed in 0.73794s starting time loop... estimated simulation time 1min 19.7407s... simulation completed in 1min 23.9156s total computation time 1min 24.6641s
The overall computational speed has been noticeably reduced. The corresponding profiler output is given below.
Running k-Wave on the GPU
The computational time can be further improved by using other data types, in particular those which force program execution on a graphics processing unit (GPU). In particular, the MATLAB parallel computing toolbox contains overloaded MATLAB functions (such as the FFT) that work with any NVIDIA CUDA-enabled GPU. These toolboxes utilise an interface developed by NVIDIA called the CUDA SDK which allows programs written in C to run on the GPU, and then a MEX interface to allow the C programs to be run from MATLAB. Within MATLAB, the execution is as simple as casting the variables to the required data type. For example, to use the parallel computing toolbox within k-Wave, the optional input parameter 'DataCast'
is set to 'gpuArray-single'
or 'gpuArray-double'
.
To illustrate, the command line output obtained by setting 'DataCast'
to 'gpuArray-single'
is given below (set example_number = 4
within the example m-file). For this particular hardware configuration, the computational speed is increased by more than 5 times compared to setting 'DataCast'
to 'single'
.
Running k-Wave simulation... start time: 25-Aug-2014 17:38:26 reference sound speed: 1500m/s dt: 3.9062ns, t_end: 9.4258us, time steps: 2414 input grid size: 512 by 512 grid points (10 by 10mm) maximum supported frequency: 38.4MHz smoothing p0 distribution... casting variables to gpuArray type... precomputation completed in 0.80705s starting time loop... estimated simulation time 20.205s... GPU memory used: 0.12216 GB (of 5.9997 GB) simulation completed in 14.8845s total computation time 15.696s
The corresponding profiler output is given below. The majority of time is now spent on computing matrix operations and the FFT on the GPU (within the function kspaceFirstOrder2D
). Further details on the speed-up obtained when using different GPUs is given in benchmark
.
Multicore support
The command line and profile outputs shown here were generated using MATLAB R2013a. Some earlier MATLAB versions do not include multicore support for parallelisable functions such as the FFT. If using a very old version of MATLAB, it is possible to get a noticeable increase in computational speed simply by changing MATLAB versions.
Modelling Nonlinear Wave Propagation | Using the C++ Code |
© 2009-2014 Bradley Treeby and Ben Cox.