<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="bbPress/1.0.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>k-Wave User Forum &#187; Forum: Graphics Processing Unit (GPU) - Recent Posts</title>
		<link>http://www.k-wave.org/forum/forum/graphics-processing-unit-gpu</link>
		<description>Support for the k-Wave MATLAB toolbox</description>
		<language>en-US</language>
		<pubDate>Mon, 18 May 2026 08:47:20 +0000</pubDate>
		<generator>http://bbpress.org/?v=1.0.2</generator>
		<textInput>
			<title><![CDATA[Search]]></title>
			<description><![CDATA[Search all topics from these forums.]]></description>
			<name>q</name>
			<link>http://www.k-wave.org/forum/search.php</link>
		</textInput>
		<atom:link href="http://www.k-wave.org/forum/rss/forum/graphics-processing-unit-gpu" rel="self" type="application/rss+xml" />

		<item>
			<title>ejoa1234561 on "Linux vs windows GPU memory allocation"</title>
			<link>http://www.k-wave.org/forum/topic/linux-vs-windows-gpu-memory-allocation#post-9218</link>
			<pubDate>Wed, 28 May 2025 04:03:52 +0000</pubDate>
			<dc:creator>ejoa1234561</dc:creator>
			<guid isPermaLink="false">9218@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hello,&#60;/p&#62;
&#60;p&#62;I am running this code on 2 workstations.&#60;br /&#62;
The first workstation runs Windows 10 with an ADA A4000 GPU with 12 GB of VRAM.&#60;br /&#62;
The second is a Linux cluster with 2 A100 GPUs with 48 GB of VRAM each.&#60;/p&#62;
&#60;p&#62;On the Windows 10 machine the code can allocate up to t_end = 0.12 (11.48 GB of 12 GB) without problems. However, when I try the same on the cluster, the VRAM only accommodates t_end = 0.1, using 39.8 GB of 40 GB. What is happening? Does Linux manage VRAM differently? Is there a way to fix this? I installed the binaries in both cases, and the problem occurs in the pstdElastic2D(kgrid, medium, source, sensor, input_args{:}) call.&#60;/p&#62;
&#60;p&#62;clc; clear; close all;&#60;br /&#62;
gpuDevice(1)&#60;/p&#62;
&#60;p&#62;format compact;&#60;br /&#62;
set(0,'defaultAxesFontSize',18);&#60;br /&#62;
set(groot, 'defaultAxesTickLabelInterpreter','tex');&#60;br /&#62;
set(groot, 'defaultLegendInterpreter','tex');&#60;br /&#62;
set(0,'defaulttextInterpreter','tex');&#60;/p&#62;
&#60;p&#62;DATA_CAST = 'gpuArray-single'; &#60;/p&#62;
&#60;p&#62;%% Create the computational grid&#60;br /&#62;
Nx = 60;           % number of grid points in the x direction&#60;br /&#62;
Ny = 100;           % number of grid points in the y direction&#60;br /&#62;
dx = 2e-3;          % grid spacing in x [m]&#60;br /&#62;
dy = 2e-3;          % grid spacing in y [m]&#60;br /&#62;
kgrid = kWaveGrid(Nx, dx, Ny, dy);&#60;/p&#62;
&#60;p&#62;%% Define medium properties&#60;br /&#62;
medium.sound_speed_compression = 1500 * ones(Nx, Ny);   % [m/s] realistic compressional speed&#60;br /&#62;
medium.sound_speed_shear = 2.5 * ones(Nx, Ny);            % [m/s] for gelatin-like shear&#60;br /&#62;
medium.density = 1000 * ones(Nx, Ny);                   % [kg/m^3]&#60;/p&#62;
&#60;p&#62;% Add boundary&#60;br /&#62;
rigid_thickness = 5;&#60;/p&#62;
&#60;p&#62;%AIR&#60;br /&#62;
medium.density(:,1:rigid_thickness) = 12;&#60;br /&#62;
% medium.density(:,end-rigid_thickness:end) = 1.2;&#60;br /&#62;
medium.density(1:rigid_thickness,:) = 12;&#60;br /&#62;
medium.density(end-rigid_thickness:end,:) = 12;&#60;/p&#62;
&#60;p&#62;medium.sound_speed_shear(:, 1:rigid_thickness) = 1;&#60;br /&#62;
% medium.sound_speed_shear(:,end-rigid_thickness:end) = 0.1;&#60;br /&#62;
medium.sound_speed_shear(1:rigid_thickness,:) = 1;&#60;br /&#62;
medium.sound_speed_shear(end-rigid_thickness:end,:) = 1;&#60;/p&#62;
&#60;p&#62;medium.sound_speed_compression(:, 1:rigid_thickness) = 1500;&#60;br /&#62;
% medium.sound_speed_compression(:,end-rigid_thickness:end) = 340;&#60;br /&#62;
medium.sound_speed_compression(1:rigid_thickness,:) = 1500;&#60;br /&#62;
medium.sound_speed_compression(end-rigid_thickness:end,:) = 1500;&#60;/p&#62;
&#60;p&#62;%TX&#60;br /&#62;
medium.density(1:rigid_thickness,40:60) = 12;&#60;br /&#62;
medium.sound_speed_shear(1:rigid_thickness,40:60) = 1000;&#60;br /&#62;
medium.sound_speed_compression(1:rigid_thickness,40:60) = 1500;&#60;/p&#62;
&#60;p&#62;% Attenuation&#60;br /&#62;
medium.alpha_coeff_compression = 0.05;    % [dB/(MHz^2 cm)]&#60;br /&#62;
medium.alpha_coeff_shear = 1000;           % [dB/(MHz^2 cm)]&#60;br /&#62;
% medium.alpha_power_shear = 1.3;&#60;/p&#62;
&#60;p&#62;%% Time array (safe dt based on c_max)&#60;br /&#62;
t_end = 0.1; %0.12;  % [s] to allow steady-state observation&#60;br /&#62;
c_max = max([max(medium.sound_speed_compression(:)), max(medium.sound_speed_shear(:))]);&#60;br /&#62;
cfl = 0.3;&#60;br /&#62;
kgrid.makeTime(c_max, cfl, t_end);&#60;/p&#62;
&#60;p&#62;%% Define source&#60;br /&#62;
cx1 = Nx/2;&#60;br /&#62;
cy1 = Ny-14;&#60;br /&#62;
radius = 10;&#60;br /&#62;
plot_disc = true;&#60;br /&#62;
disc1 = makeDisc(Nx, Ny, cx1, cy1, radius, plot_disc);&#60;br /&#62;
source.u_mask = disc1;&#60;/p&#62;
&#60;p&#62;source_freq = 200; % [Hz]&#60;br /&#62;
source_mag = 1e-6;&#60;br /&#62;
source_signal = source_mag * sin(2 * pi * source_freq * kgrid.t_array);&#60;br /&#62;
source.uy= source_signal;&#60;/p&#62;
&#60;p&#62;% source_points = find(source.u_mask);&#60;br /&#62;
% num_sources = length(source_points);&#60;br /&#62;
% source.ux = zeros(length(kgrid.t_array), num_sources);&#60;br /&#62;
% for i = 1:num_sources&#60;br /&#62;
%     source.ux(:, i) = source_signal;&#60;br /&#62;
% end&#60;/p&#62;
&#60;p&#62;%% Define sensor&#60;br /&#62;
mask = zeros(Nx, Ny);&#60;br /&#62;
mask(:, 10:end) = 1;&#60;br /&#62;
sensor.mask = mask;&#60;br /&#62;
sensor.record = {'u', 'u_split_field'};&#60;/p&#62;
&#60;p&#62;%% Run simulation&#60;br /&#62;
input_args = {'PMLAlpha', 2, 'PlotPML', false, 'PMLInside', false, ...&#60;br /&#62;
              'DisplayMask', 'off', 'DataCast', DATA_CAST};&#60;/p&#62;
&#60;p&#62;sensor_data = pstdElastic2D(kgrid, medium, source, sensor, input_args{:});&#60;/p&#62;
&#60;p&#62;ux_s = gather(reshape(sensor_data.ux_split_s, Nx, Ny-9, []));&#60;br /&#62;
ux_p = gather(reshape(sensor_data.ux_split_p, Nx, Ny-9, []));&#60;br /&#62;
ux_total = gather(reshape(sensor_data.ux, Nx, Ny-9, []));&#60;/p&#62;
&#60;p&#62;dinf.dx = dx;&#60;br /&#62;
dinf.dy = dy;&#60;br /&#62;
dinf.dt = kgrid.dt;&#60;/p&#62;
&#60;p&#62;%% Visualization&#60;br /&#62;
figure&#60;br /&#62;
xx = dinf.dx * (0:size(ux_total,1)-1);&#60;br /&#62;
yy = dinf.dy * (0:size(ux_total,2)-1);&#60;br /&#62;
for ii = 1:500:20000&#60;br /&#62;
    im = real(squeeze(ux_total(:,:,ii)));  % amplitude of shear wave&#60;br /&#62;
    imagesc(yy(10:end),xx, im);&#60;br /&#62;
    xlabel('y (m)'); ylabel('x (m)');&#60;br /&#62;
    axis image;&#60;br /&#62;
    clim([-1 1]);&#60;br /&#62;
    title(['Frame ' num2str(ii)]);&#60;br /&#62;
    grid on;&#60;br /&#62;
    colormap('jet');&#60;br /&#62;
    colorbar;&#60;br /&#62;
    drawnow&#60;br /&#62;
    pause(0.1)&#60;br /&#62;
end&#60;/p&#62;
&#60;p&#62;% Optional: save&#60;br /&#62;
filename = ['ShearReflection_' num2str(source_freq) 'Hz'];&#60;br /&#62;
save(filename, &#34;ux_total&#34;, &#34;ux_p&#34;, &#34;ux_s&#34;, &#34;dinf&#34;, &#34;source_freq&#34;, '-v7.3');&#60;/p&#62;
&#60;p&#62;energy_t = zeros(1, size(ux_s,3));&#60;br /&#62;
for ii = 1:size(ux_s,3)&#60;br /&#62;
    frame = ux_s(:,:,ii);&#60;br /&#62;
    energy_t(ii) = sum(frame(:).^2);  % proportional to elastic energy&#60;br /&#62;
end&#60;/p&#62;
&#60;p&#62;% 2. Difference between consecutive frames&#60;br /&#62;
diff_energy = zeros(1, size(ux_s,3)-1);&#60;br /&#62;
for ii = 1:(size(ux_s,3)-1)&#60;br /&#62;
    diff = ux_s(:,:,ii+1) - ux_s(:,:,ii);&#60;br /&#62;
    diff_energy(ii) = sum(diff(:).^2);&#60;br /&#62;
end&#60;/p&#62;
&#60;p&#62;% 3. Plot energy over time&#60;br /&#62;
figure;&#60;br /&#62;
subplot(2,1,1);&#60;br /&#62;
plot(kgrid.t_array, energy_t, 'b-', 'LineWidth', 2);&#60;br /&#62;
xlabel('Time [s]');&#60;br /&#62;
ylabel('Total Shear Energy');&#60;br /&#62;
title('Shear Field Energy vs Time');&#60;br /&#62;
grid on;&#60;/p&#62;
&#60;p&#62;% 4. Plot frame-to-frame difference&#60;br /&#62;
subplot(2,1,2);&#60;br /&#62;
plot(kgrid.t_array(2:end), diff_energy, 'r-', 'LineWidth', 2);&#60;br /&#62;
xlabel('Time [s]');&#60;br /&#62;
ylabel('Frame-to-Frame Energy Change');&#60;br /&#62;
title('Difference Between Consecutive Shear Frames');&#60;br /&#62;
grid on;&#60;/p&#62;
&#60;p&#62;Error output:&#60;br /&#62;
Running k-Wave elastic simulation...&#60;br /&#62;
  start time: 27-May-2025 22:34:42&#60;br /&#62;
  reference sound speed: 1500m/s&#60;br /&#62;
[Warning: Support for ver('distcomp') will be removed in a future release.  Use&#60;br /&#62;
ver('parallel') instead.]&#60;br /&#62;
[&#38;gt; In ver&#38;gt;locGetSingleToolboxInfo (line 283)&#60;br /&#62;
In ver (line 56)&#60;br /&#62;
In verLessThan (line 39)&#60;br /&#62;
In kspaceFirstOrder_inputChecking (line 1306)&#60;br /&#62;
In pstdElastic2D (line 385)&#60;br /&#62;
In sphere (line 104)]&#60;br /&#62;
  dt: 400ns, t_end: 150ms, time steps: 375001&#60;br /&#62;
  input grid size: 60 by 100 grid points (120 by 200mm)&#60;br /&#62;
  maximum supported compressional frequency: 375kHz&#60;br /&#62;
  maximum supported shear frequency: 250Hz&#60;br /&#62;
  expanding computational grid...&#60;br /&#62;
  computational grid size: 100 by 140 grid points&#60;br /&#62;
{Error using gpuArray.zeros&#60;br /&#62;
Out of memory on device. To view more detail about available memory on the GPU,&#60;br /&#62;
use 'gpuDevice()'. If the problem persists, reset the GPU by calling&#60;br /&#62;
'gpuDevice(1)'.&#60;br /&#62;
Error in kspaceFirstOrder_inputChecking&#38;gt;@(sz)gpuArray.zeros(sz,'single') (line 1314)&#60;br /&#62;
                castZeros = @(sz) gpuArray.zeros(sz, 'single');&#60;br /&#62;
Error in kspaceFirstOrder_createStorageVariables (line 336)&#60;br /&#62;
                sensor_data.uy_split_s = castZeros([num_sensor_points, num_recorded_time_points]);&#60;br /&#62;
Error in kspaceFirstOrder_inputChecking (line 1663)&#60;br /&#62;
    kspaceFirstOrder_createStorageVariables;&#60;br /&#62;
Error in pstdElastic2D (line 385)&#60;br /&#62;
kspaceFirstOrder_inputChecking;&#60;br /&#62;
Error in sphere (line 104)&#60;br /&#62;
sensor_data = pstdElastic2D(kgrid, medium, source, sensor, input_args{:});}
&#60;/p&#62;</description>
		</item>
		<item>
			<title>Kangyi Feng on "How to solve using GPU increases the usage space of the C drive"</title>
			<link>http://www.k-wave.org/forum/topic/how-to-solve-using-gpu-increases-the-usage-space-of-the-c-drive#post-9094</link>
			<pubDate>Fri, 17 May 2024 04:49:13 +0000</pubDate>
			<dc:creator>Kangyi Feng</dc:creator>
			<guid isPermaLink="false">9094@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hi, I am a beginner with k-Wave. When I use kspaceFirstOrder3DG to accelerate calculations, I find that every time I run the code in MATLAB, the used space on the C drive increases. How can I change the code so that running kspaceFirstOrder3DG does not increase the used space on the C drive? Thanks a lot.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>zak morgan on "GPU slower than CPU due to FFT preprocessing"</title>
			<link>http://www.k-wave.org/forum/topic/gpu-slower-than-cpu-due-to-fft-preprocessing#post-9090</link>
			<pubDate>Thu, 09 May 2024 00:06:47 +0000</pubDate>
			<dc:creator>zak morgan</dc:creator>
			<guid isPermaLink="false">9090@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hi, I'm new to k-Wave and noticed that when running the CPU C++ code, pre-processing is instant and my simulation then takes about 47 seconds. However, when running the CUDA code, the simulation takes about 10 seconds, but 80 seconds are spent pre-processing the FFT. Is this expected?
&#60;/p&#62;</description>
		</item>
		<item>
			<title>shahwaiz on "Problem about use kspaceFirstOrder2DG"</title>
			<link>http://www.k-wave.org/forum/topic/problem-about-use-kspacefirstorder2dg#post-8991</link>
			<pubDate>Mon, 18 Dec 2023 04:06:22 +0000</pubDate>
			<dc:creator>shahwaiz</dc:creator>
			<guid isPermaLink="false">8991@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;I have resolved this problem by installing MATLAB 2016a.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>shahwaiz on "Problem about use kspaceFirstOrder2DG"</title>
			<link>http://www.k-wave.org/forum/topic/problem-about-use-kspacefirstorder2dg#post-8990</link>
			<pubDate>Sun, 17 Dec 2023 10:33:47 +0000</pubDate>
			<dc:creator>shahwaiz</dc:creator>
			<guid isPermaLink="false">8990@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hello&#60;/p&#62;
&#60;p&#62;I get the following error when I use kspaceFirstOrder2DG:&#60;/p&#62;
&#60;p&#62;Running k-Wave simulation...&#60;br /&#62;
  start time: 17-Dec-2023 17:28:54&#60;br /&#62;
  reference sound speed: 2783.7252m/s&#60;br /&#62;
  dt: 10.7769ns, t_end: 48.2699us, time steps: 4480&#60;br /&#62;
  input grid size: 512 by 512 grid points (51.2 by 51.2mm)&#60;br /&#62;
  maximum supported frequency: 7.5MHz&#60;br /&#62;
  expanding computational grid...&#60;br /&#62;
  computational grid size: 552 by 552 grid points&#60;br /&#62;
  WARNING: Highest prime factors in each dimension are 23  23&#60;br /&#62;
           Use dimension sizes with lower prime factors to improve speed&#60;br /&#62;
  precomputation completed in 0.083601s&#60;br /&#62;
  saving input files to disk...&#60;br /&#62;
Unrecognized function or variable 'attvalue_size'.&#60;/p&#62;
&#60;p&#62;Error in h5writeatt&#38;gt;createDatatypeId (line 186)&#60;br /&#62;
        elseif attvalue_size &#38;gt; 0&#60;/p&#62;
&#60;p&#62;Error in h5writeatt (line 86)&#60;br /&#62;
datatypeId  = createDatatypeId(Attvalue, useUtf8);&#60;/p&#62;
&#60;p&#62;Error in writeMatrix (line 211)&#60;br /&#62;
    h5writeatt(filename, ['/' matrix_name], DOMAIN_TYPE_ATT_NAME, domain_type, 'TextEncoding', 'system');&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder_saveToDisk (line 462)&#60;br /&#62;
        writeMatrix(flags.save_to_disk, eval(variable_list{cast_index}), variable_list{cast_index}, hdf_compression_level);&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder2D (line 620)&#60;br /&#62;
    kspaceFirstOrder_saveToDisk;&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder3DC (line 532)&#60;br /&#62;
eval(run_string);&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder2DG (line 76)&#60;br /&#62;
sensor_data = kspaceFirstOrder3DC(varargin{:});&#60;/p&#62;
&#60;p&#62;Error in mydataset (line 69)&#60;br /&#62;
    sensor_data = kspaceFirstOrder2DG(kgrid, medium, source, sensor, input_args{:});
&#60;/p&#62;</description>
		</item>
		<item>
			<title>Bradley Treeby on "The optional input &#039;SaveToDisk&#039; is currently only compatible with forward simula"</title>
			<link>http://www.k-wave.org/forum/topic/the-optional-input-savetodisk-is-currently-only-compatible-with-forward-simula#post-8955</link>
			<pubDate>Fri, 24 Nov 2023 17:31:02 +0000</pubDate>
			<dc:creator>Bradley Treeby</dc:creator>
			<guid isPermaLink="false">8955@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;As the error message suggests, you can't run the C++ code with a Cartesian sensor mask. You have to use a binary sensor mask. You could also use a sensor mask created using &#60;code&#62;kWaveArray&#60;/code&#62; if you don't want to discretise your sensor points to the grid.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>AoTsing on "The optional input &#039;SaveToDisk&#039; is currently only compatible with forward simula"</title>
			<link>http://www.k-wave.org/forum/topic/the-optional-input-savetodisk-is-currently-only-compatible-with-forward-simula#post-8937</link>
			<pubDate>Mon, 06 Nov 2023 06:48:59 +0000</pubDate>
			<dc:creator>AoTsing</dc:creator>
			<guid isPermaLink="false">8937@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hello, I have defined a 512*512*512 kgrid and my initial pressure distribution (obtained from MCmatlab, also on a 512*512*512 grid). I then define a CartBowl sensor using makeCartBowl, but when I finally run the program with kspaceFirstOrder3DG, I get an error: &#34;The optional input 'SaveToDisk' is currently only compatible with forward simulations using a non-zero binary sensor mask.&#34;&#60;/p&#62;
&#60;p&#62;My full code follows:&#60;br /&#62;
clc; clear all; close;&#60;br /&#62;
Nx = 512; % number of grid points in the x (row) direction&#60;br /&#62;
Ny = 512; % number of grid points in the y (column) direction&#60;br /&#62;
Nz = 512; % number of grid points in the z direction&#60;/p&#62;
&#60;p&#62;x = 4e-2;  % [m]&#60;br /&#62;
dx = x/Nx; % grid point spacing in the x direction [m]&#60;br /&#62;
dy = dx;   % grid point spacing in the y direction [m]&#60;br /&#62;
dz = dx;   % grid point spacing in the z direction [m]&#60;/p&#62;
&#60;p&#62;kgrid = makeGrid(Nx, dx, Ny, dy, Nz, dz);&#60;br /&#62;
load(&#34;P0.mat&#34;);&#60;br /&#62;
source.p0 = p0;&#60;/p&#62;
&#60;p&#62;medium.sound_speed = 1500*ones(Nx, Ny,Nz); % [m/s]&#60;br /&#62;
medium.density = 1040; % [kg/m^3]&#60;/p&#62;
&#60;p&#62;% define sensor&#60;br /&#62;
bowl_pos  = [-1.5e-2, 0e-2, 0e-2];&#60;br /&#62;
radius    = Inf;&#60;br /&#62;
diameter  = 4e-2;&#60;br /&#62;
focus_pos = [0e-2,0e-2,0e-2];&#60;br /&#62;
num_points= 256;&#60;br /&#62;
sensor.mask = makeCartBowl(bowl_pos, radius, diameter, focus_pos, num_points);&#60;/p&#62;
&#60;p&#62;sensor_data = kspaceFirstOrder3DG(kgrid, medium, source, sensor);
&#60;/p&#62;</description>
		</item>
		<item>
			<title>djps on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8914</link>
			<pubDate>Thu, 21 Sep 2023 11:45:05 +0000</pubDate>
			<dc:creator>djps</dc:creator>
			<guid isPermaLink="false">8914@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Perhaps you could just say how the perfectly matched layers were constructed in the benchmarks?
&#60;/p&#62;</description>
		</item>
		<item>
			<title>djps on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8850</link>
			<pubDate>Thu, 29 Jun 2023 08:32:14 +0000</pubDate>
			<dc:creator>djps</dc:creator>
			<guid isPermaLink="false">8850@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;That would be great, thanks!
&#60;/p&#62;</description>
		</item>
		<item>
			<title>Bradley Treeby on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8786</link>
			<pubDate>Fri, 02 Jun 2023 08:50:26 +0000</pubDate>
			<dc:creator>Bradley Treeby</dc:creator>
			<guid isPermaLink="false">8786@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hi djps,&#60;/p&#62;
&#60;p&#62;I'll tidy up all the k-Wave intercomparison code when I get a chance and move to a public GitHub repo.&#60;/p&#62;
&#60;p&#62;Brad.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>djps on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8785</link>
			<pubDate>Wed, 31 May 2023 14:23:43 +0000</pubDate>
			<dc:creator>djps</dc:creator>
			<guid isPermaLink="false">8785@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Is there an example of how to generate the k-Wave results for the benchmark data, rather than downloading them?
&#60;/p&#62;</description>
		</item>
		<item>
			<title>Bradley Treeby on "Problems about GPU memory used"</title>
			<link>http://www.k-wave.org/forum/topic/problems-about-gpu-memory-used#post-8758</link>
			<pubDate>Wed, 17 May 2023 18:53:42 +0000</pubDate>
			<dc:creator>Bradley Treeby</dc:creator>
			<guid isPermaLink="false">8758@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;I’d advise switching to the C++/CUDA version of the GPU code - it uses much less memory in general, and the memory usage can be calculated reasonably accurately. Unfortunately, when using gpuArray, I’m not exactly sure how the inbuilt memory management of MATLAB works under the hood, and why the memory usage might change in these two cases.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>maheshiitpkd on "kspaceFirstOrder2DG Error on Cluster with A100 GPUs"</title>
			<link>http://www.k-wave.org/forum/topic/kspacefirstorder2dg-error-on-cluster-with-a100-gpus#post-8738</link>
			<pubDate>Fri, 14 Apr 2023 10:00:01 +0000</pubDate>
			<dc:creator>maheshiitpkd</dc:creator>
			<guid isPermaLink="false">8738@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;I was able to solve this by setting the binary path correctly using setrpaths.sh --path path (where the second path is the path to the installed binaries).
&#60;/p&#62;</description>
		</item>
		<item>
			<title>maheshiitpkd on "kspaceFirstOrder2DG Error on Cluster with A100 GPUs"</title>
			<link>http://www.k-wave.org/forum/topic/kspacefirstorder2dg-error-on-cluster-with-a100-gpus#post-8737</link>
			<pubDate>Wed, 12 Apr 2023 09:43:57 +0000</pubDate>
			<dc:creator>maheshiitpkd</dc:creator>
			<guid isPermaLink="false">8737@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Dear All,&#60;/p&#62;
&#60;p&#62;I am trying to run a 2D simulation on a Linux cluster with an A100 GPU. However, I get the following error:&#60;br /&#62;
./kspaceFirstOrder-CUDA: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by ./kspaceFirstOrder-CUDA)&#60;br /&#62;
./kspaceFirstOrder-CUDA: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./kspaceFirstOrder-CUDA)&#60;/p&#62;
&#60;p&#62;Is there an alternative to recompiling from the source files? (I don't have sudo access on the cluster.)&#60;/p&#62;
&#60;p&#62;Regards,&#60;br /&#62;
Mahesh
&#60;/p&#62;</description>
		</item>
		<item>
			<title>lschlieder on "hdf5 problems"</title>
			<link>http://www.k-wave.org/forum/topic/hdf5-problems#post-8736</link>
			<pubDate>Fri, 07 Apr 2023 12:27:33 +0000</pubDate>
			<dc:creator>lschlieder</dc:creator>
			<guid isPermaLink="false">8736@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;After a week of fiddling around, I found out that there is an environment variable on Linux called HDF5_USE_FILE_LOCKING (&#60;a href=&#34;https://docs.hdfgroup.org/hdf5/develop/_h5public_8h.html#a347a82c5915f93e3f03265aac4f7a795&#34; rel=&#34;nofollow&#34;&#62;https://docs.hdfgroup.org/hdf5/develop/_h5public_8h.html#a347a82c5915f93e3f03265aac4f7a795&#60;/a&#62;).&#60;/p&#62;
&#60;p&#62;You can simply run:&#60;br /&#62;
export HDF5_USE_FILE_LOCKING='FALSE'
&#60;/p&#62;</description>
		</item>
		<item>
			<title>lschlieder on "hdf5 problems"</title>
			<link>http://www.k-wave.org/forum/topic/hdf5-problems#post-8735</link>
			<pubDate>Wed, 05 Apr 2023 17:33:18 +0000</pubDate>
			<dc:creator>lschlieder</dc:creator>
			<guid isPermaLink="false">8735@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;I posted a similar topic in the general forum and still have not found a solution. I am running into a very weird situation where kspaceFirstOrder-CUDA fails to open the input h5 file. It seems unable to open or locate the file:&#60;/p&#62;
&#60;p&#62;me@cluster_node:/clusterpath/KWaveSimulationsCorrect/rng806480170$ kspaceFirstOrder-CUDA -i ./rholo3d_kwave_input.h5 -o test.h5&#60;/p&#62;
&#60;p&#62;┌───────────────────────────────────────────────────────────────┐&#60;br /&#62;
│                  kspaceFirstOrder-CUDA v1.3                   │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│ Reading simulation configuration:                      Failed │&#60;br /&#62;
└───────────────────────────────────────────────────────────────┘&#60;br /&#62;
┌───────────────────────────────────────────────────────────────┐&#60;br /&#62;
│            !!! K-Wave experienced a fatal error !!!           │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│ Error: File ./rholo3d_kwave_input.h5 was not found or cannot  │&#60;br /&#62;
│        be opened.: iostream error                             │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│                      Execution terminated                     │&#60;br /&#62;
└───────────────────────────────────────────────────────────────┘&#60;/p&#62;
&#60;p&#62;The weird thing is that the same program works fine with the same input files when I start it on my local PC instead of the cluster I want to use.&#60;/p&#62;
&#60;p&#62;me@mypc:/cluster_remote_smtp_access/KWaveSimulationsCorrect/rng806480170$ kspaceFirstOrder-CUDA -i rholo3d_kwave_input.h5 -o test.h5&#60;/p&#62;
&#60;p&#62;┌───────────────────────────────────────────────────────────────┐&#60;br /&#62;
│                  kspaceFirstOrder-CUDA v1.3                   │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│ Reading simulation configuration:                        Done │&#60;br /&#62;
│ Selected GPU device id:                                     0 │&#60;br /&#62;
│ GPU device name:                   NVIDIA GeForce GTX 1650 Ti │&#60;br /&#62;
│ Number of CPU threads:                                     16 │&#60;br /&#62;
│ Processor name:     Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│                      Simulation details                       │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│ Domain dimensions:                            256 x 256 x 128 │&#60;br /&#62;
│ Medium type:                                               3D │&#60;br /&#62;
│ Simulation time steps:                                    828 │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│                        Initialization                         │&#60;br /&#62;
├───────────────────────────────────────────────────────────────┤&#60;br /&#62;
│ Memory allocation:                                       Done │&#60;/p&#62;
&#60;p&#62;The files are created with MATLAB 2019b, since newer versions crash in various ways, e.g. 2022a:&#60;/p&#62;
&#60;p&#62;Error using hdf5lib2&#60;br /&#62;
Unable to open the file because of HDF5 Library error. Reason:Unknown&#60;/p&#62;
&#60;p&#62;Error in H5F.open (line 130)&#60;br /&#62;
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);&#60;/p&#62;
&#60;p&#62;Error in h5write (line 108)&#60;br /&#62;
    file_id    = H5F.open(Filename,flags,fapl);&#60;/p&#62;
&#60;p&#62;Error in writeMatrix (line 204)&#60;br /&#62;
h5write(filename, ['/' matrix_name], matrix);&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder_saveToDisk (line 462)&#60;br /&#62;
        writeMatrix(flags.save_to_disk, eval(variable_list{cast_index}), variable_list{cast_index}, hdf_compression_level);&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder3D (line 673)&#60;br /&#62;
    kspaceFirstOrder_saveToDisk;&#60;/p&#62;
&#60;p&#62;Error in init_kwave_sim (line 169)&#60;br /&#62;
sensor_data = kspaceFirstOrder3D(kgrid, medium, source, sensor, input_args{:});&#60;/p&#62;
&#60;p&#62;Now I do not even know where to start debugging this.&#60;br /&#62;
The first problem could be that the file isn't there, but I have checked that about 100 times. It could be some odd access issue, but I have never run into problems before, though I obviously do not have sudo rights on the cluster.&#60;/p&#62;
&#60;p&#62;I assume that, since the binaries are statically linked, everything is contained within them and I do not need to worry about CUDA or HDF5 versions. Is that correct? Also, does the card I run it on matter? I assume it just works on all recent NVIDIA cards, such as the A100, Tesla V100, etc. Is that correct?&#60;/p&#62;
&#60;p&#62;Does anyone have any ideas that go beyond &#34;have you checked if the file is really there?&#34;
&#60;/p&#62;</description>
		</item>
		<item>
			<title>CelestineAngla on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8732</link>
			<pubDate>Thu, 16 Mar 2023 10:22:33 +0000</pubDate>
			<dc:creator>CelestineAngla</dc:creator>
			<guid isPermaLink="false">8732@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Makes sense, thank you very much for your answer!
&#60;/p&#62;</description>
		</item>
		<item>
			<title>Bradley Treeby on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8731</link>
			<pubDate>Wed, 15 Mar 2023 19:55:52 +0000</pubDate>
			<dc:creator>Bradley Treeby</dc:creator>
			<guid isPermaLink="false">8731@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;If your source has a single frequency, the frequency dependent power law can be set to anything, as long as the value for alpha_coeff is adjusted accordingly to give the correct absorption in dB/cm. To avoid dispersion, you can set alpha_power to 2. This is the approach I used for the intercomparison simulations.&#60;/p&#62;
&#60;p&#62;The code used is the same one as publicly available, except for the very large 3D simulations which were done using the MPI code.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>CelestineAngla on "Weird results when modeling attenuation with the GPU and CPU versions"</title>
			<link>http://www.k-wave.org/forum/topic/weird-results-when-modeling-attenuation-with-the-gpu-and-cpu-versions#post-8730</link>
			<pubDate>Wed, 15 Mar 2023 14:41:16 +0000</pubDate>
			<dc:creator>CelestineAngla</dc:creator>
			<guid isPermaLink="false">8730@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Does anyone know how absorption was modeled in the benchmark configurations of the paper &#34;Benchmark problems for transcranial ultrasound simulation&#34;?&#60;br /&#62;
Is it possible to model absorption without dispersion with the 3D version of k-Wave optimized for high-performance computing clusters?&#60;br /&#62;
Thank you
&#60;/p&#62;</description>
		</item>
		<item>
			<title>zhaoxiang on "Problems about GPU memory used"</title>
			<link>http://www.k-wave.org/forum/topic/problems-about-gpu-memory-used#post-8713</link>
			<pubDate>Tue, 07 Mar 2023 09:52:22 +0000</pubDate>
			<dc:creator>zhaoxiang</dc:creator>
			<guid isPermaLink="false">8713@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hi, firstly I have to say this toolbox is very useful. However, I have a problem that has troubled me a lot.&#60;/p&#62;
&#60;p&#62;The problem concerns GPU memory usage. The code is run on an A100 (40 GB). It looked normal at first:&#60;/p&#62;
&#60;p&#62;estimated simulation time 2hours 6min 29.7093s...&#60;br /&#62;
  GPU memory used: 21.9727 GB (of 39.5871 GB)&#60;br /&#62;
  simulation completed in 2hours 6min 36.3867s&#60;/p&#62;
&#60;p&#62;It looked like it had finished, but then the following happened:&#60;/p&#62;
&#60;p&#62;{Error using gpuArray/ifft&#60;br /&#62;
Out of memory on device. MATLAB was unable to allocate sufficient resources on&#60;br /&#62;
the GPU to complete this operation. If the problem persists, reset the GPU by&#60;br /&#62;
calling 'gpuDevice(1)'.&#60;/p&#62;
&#60;p&#62;Error in fourierShift (line 112)&#60;br /&#62;
data = real(ifft(bsxfun(@times, ifftshift( exp(1i .* k_vec .* shift) ),&#60;br /&#62;
fft(data, [], dim)), [], dim));&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrder_saveIntensity (line 46)&#60;br /&#62;
    ux_sgt     = fourierShift(sensor_data(sensor_mask_index).ux_non_staggered,&#60;br /&#62;
    1/2);&#60;/p&#62;
&#60;p&#62;Error in kspaceFirstOrderAS (line 1387)&#60;br /&#62;
    kspaceFirstOrder_saveIntensity;&#60;/p&#62;
&#60;p&#62;Error in code1 (line 169)&#60;br /&#62;
        sensor_data = kspaceFirstOrderAS(kgrid, medium, source, sensor, ...&#60;br /&#62;
} &#60;/p&#62;
&#60;p&#62;I also have another question about GPU memory usage.&#60;br /&#62;
There is another case which runs on the A100, where the GPU memory used is 3.5964 GB (of 39.5871 GB). However, this case cannot be run on a GPU with 6 GB.&#60;/p&#62;
&#60;p&#62;Could anybody tell me why?&#60;br /&#62;
Thanks in advance.
&#60;/p&#62;</description>
		</item>

	</channel>
</rss>
