<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="bbPress/1.0.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>k-Wave User Forum &#187; Topic: Linux vs windows GPU memory allocation</title>
		<link>http://www.k-wave.org/forum/topic/linux-vs-windows-gpu-memory-allocation</link>
		<description>Support for the k-Wave MATLAB toolbox</description>
		<language>en-US</language>
		<pubDate>Tue, 12 May 2026 22:28:39 +0000</pubDate>
		<generator>http://bbpress.org/?v=1.0.2</generator>
		<textInput>
			<title><![CDATA[Search]]></title>
			<description><![CDATA[Search all topics from these forums.]]></description>
			<name>q</name>
			<link>http://www.k-wave.org/forum/search.php</link>
		</textInput>
		<atom:link href="http://www.k-wave.org/forum/rss/topic/linux-vs-windows-gpu-memory-allocation" rel="self" type="application/rss+xml" />

		<item>
			<title>ejoa1234561 on "Linux vs windows GPU memory allocation"</title>
			<link>http://www.k-wave.org/forum/topic/linux-vs-windows-gpu-memory-allocation#post-9218</link>
			<pubDate>Wed, 28 May 2025 04:03:52 +0000</pubDate>
			<dc:creator>ejoa1234561</dc:creator>
			<guid isPermaLink="false">9218@http://www.k-wave.org/forum/</guid>
			<description>&#60;p&#62;Hello,&#60;/p&#62;
&#60;p&#62;I am running this code on two workstations.&#60;br /&#62;
The first is a Windows 10 machine with an ADA A4000 GPU with 12 GB of VRAM.&#60;br /&#62;
The second is a Linux cluster with two A100 GPUs with 48 GB of VRAM each.&#60;/p&#62;
&#60;p&#62;On Windows 10 the code runs without problems up to t_end = 0.12 (using 11.48 GB of 12 GB). However, when I try the same on the cluster, the GPU only accepts t_end = 0.1, using 39.8 GB of 40 GB. What is happening? Does Linux manage VRAM differently? Is there a way to fix this? I installed the binaries in both cases, and the problem occurs in the pstdElastic2D(kgrid, medium, source, sensor, input_args{:}) call.&#60;/p&#62;
&#60;p&#62;clc; clear; close all;&#60;br /&#62;
gpuDevice(1)&#60;/p&#62;
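&#60;p&#62;Since the two machines report different amounts of usable VRAM, it may help to print what MATLAB itself sees on each one before the simulation allocates anything. This is a minimal sketch, assuming the Parallel Computing Toolbox is installed and a GPU has already been selected; the variable name g is only illustrative:&#60;/p&#62;

```matlab
% Report the memory of the currently selected GPU as seen by MATLAB.
g = gpuDevice;   % no argument: returns the current device without resetting it
fprintf('%s: %.2f GB available of %.2f GB total\n', ...
    g.Name, g.AvailableMemory / 1e9, g.TotalMemory / 1e9);
```

&#60;p&#62;Running this on both machines right after gpuDevice(1) should show whether part of the memory is already in use before k-Wave starts.&#60;/p&#62;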
&#60;p&#62;format compact;&#60;br /&#62;
set(0,'defaultAxesFontSize',18);&#60;br /&#62;
set(groot, 'defaultAxesTickLabelInterpreter','tex');&#60;br /&#62;
set(groot, 'defaultLegendInterpreter','tex');&#60;br /&#62;
set(0,'defaulttextInterpreter','tex');&#60;/p&#62;
&#60;p&#62;DATA_CAST = 'gpuArray-single'; &#60;/p&#62;
&#60;p&#62;%% Create the computational grid&#60;br /&#62;
Nx = 60;           % number of grid points in the x direction&#60;br /&#62;
Ny = 100;           % number of grid points in the y direction&#60;br /&#62;
dx = 2e-3;          % grid spacing in x [m]&#60;br /&#62;
dy = 2e-3;          % grid spacing in y [m]&#60;br /&#62;
kgrid = kWaveGrid(Nx, dx, Ny, dy);&#60;/p&#62;
&#60;p&#62;%% Define medium properties&#60;br /&#62;
medium.sound_speed_compression = 1500 * ones(Nx, Ny);   % [m/s] realistic compressional speed&#60;br /&#62;
medium.sound_speed_shear = 2.5 * ones(Nx, Ny);            % [m/s] for gelatin-like shear&#60;br /&#62;
medium.density = 1000 * ones(Nx, Ny);                   % [kg/m^3]&#60;/p&#62;
&#60;p&#62;% Add boundary&#60;br /&#62;
rigid_thickness = 5;&#60;/p&#62;
&#60;p&#62;%AIR&#60;br /&#62;
medium.density(:,1:rigid_thickness) = 12;&#60;br /&#62;
% medium.density(:,end-rigid_thickness:end) = 1.2;&#60;br /&#62;
medium.density(1:rigid_thickness,:) = 12;&#60;br /&#62;
medium.density(end-rigid_thickness:end,:) = 12;&#60;/p&#62;
&#60;p&#62;medium.sound_speed_shear(:, 1:rigid_thickness) = 1;&#60;br /&#62;
% medium.sound_speed_shear(:,end-rigid_thickness:end) = 0.1;&#60;br /&#62;
medium.sound_speed_shear(1:rigid_thickness,:) = 1;&#60;br /&#62;
medium.sound_speed_shear(end-rigid_thickness:end,:) = 1;&#60;/p&#62;
&#60;p&#62;medium.sound_speed_compression(:, 1:rigid_thickness) = 1500;&#60;br /&#62;
% medium.sound_speed_compression(:,end-rigid_thickness:end) = 340;&#60;br /&#62;
medium.sound_speed_compression(1:rigid_thickness,:) = 1500;&#60;br /&#62;
medium.sound_speed_compression(end-rigid_thickness:end,:) = 1500;&#60;/p&#62;
&#60;p&#62;%TX&#60;br /&#62;
medium.density(1:rigid_thickness,40:60) = 12;&#60;br /&#62;
medium.sound_speed_shear(1:rigid_thickness,40:60) = 1000;&#60;br /&#62;
medium.sound_speed_compression(1:rigid_thickness,40:60) = 1500;&#60;/p&#62;
&#60;p&#62;% Attenuation&#60;br /&#62;
medium.alpha_coeff_compression = 0.05;    % [dB/(MHz^2 cm)]&#60;br /&#62;
medium.alpha_coeff_shear = 1000;           % [dB/(MHz^2 cm)]&#60;br /&#62;
% medium.alpha_power_shear = 1.3;&#60;/p&#62;
&#60;p&#62;%% Time array (safe dt based on c_max)&#60;br /&#62;
t_end = 0.1; %0.12;  % [s] to allow steady-state observation&#60;br /&#62;
c_max = max([max(medium.sound_speed_compression(:)), max(medium.sound_speed_shear(:))]);&#60;br /&#62;
cfl = 0.3;&#60;br /&#62;
kgrid.makeTime(c_max, cfl, t_end);&#60;/p&#62;
&#60;p&#62;%% Define source&#60;br /&#62;
cx1 = Nx/2;&#60;br /&#62;
cy1 = Ny-14;&#60;br /&#62;
radius = 10;&#60;br /&#62;
plot_disc = true;   % logical flag rather than a char array&#60;br /&#62;
disc1 = makeDisc(Nx, Ny, cx1, cy1, radius, plot_disc);&#60;br /&#62;
source.u_mask = disc1;&#60;/p&#62;
&#60;p&#62;source_freq = 200; % [Hz]&#60;br /&#62;
source_mag = 1e-6;&#60;br /&#62;
source_signal = source_mag * sin(2 * pi * source_freq * kgrid.t_array);&#60;br /&#62;
source.uy = source_signal;&#60;/p&#62;
&#60;p&#62;% source_points = find(source.u_mask);&#60;br /&#62;
% num_sources = length(source_points);&#60;br /&#62;
% source.ux = zeros(length(kgrid.t_array), num_sources);&#60;br /&#62;
% for i = 1:num_sources&#60;br /&#62;
%     source.ux(:, i) = source_signal;&#60;br /&#62;
% end&#60;/p&#62;
&#60;p&#62;%% Define sensor&#60;br /&#62;
mask = zeros(Nx, Ny);&#60;br /&#62;
mask(:, 10:end) = 1;&#60;br /&#62;
sensor.mask = mask;&#60;br /&#62;
sensor.record = {'u', 'u_split_field'};&#60;/p&#62;
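&#60;p&#62;As a rough check of how much GPU memory the recorded data alone might need, the sketch below multiplies the number of sensor points by the number of time steps. It assumes that 'u' and 'u_split_field' together produce six single-precision fields (ux, uy and the p/s split of each), each of size [num_sensor_points, num_time_points] as the allocation of sensor_data.uy_split_s in the error output suggests, and that the number of time steps is about t_end/dt + 1. The variable names are illustrative and not part of k-Wave:&#60;/p&#62;

```matlab
% Back-of-the-envelope estimate of the sensor data storage on the GPU.
Nx = 60; Ny = 100;                 % grid size used in this script
dx = 2e-3;                         % grid spacing [m]
c_max = 1500;                      % maximum sound speed [m/s]
cfl = 0.3;
t_end = 0.1;                       % simulation time [s]

dt = cfl * dx / c_max;                        % time step [s], about 400 ns
num_time_points = round(t_end / dt) + 1;      % assumed step count
num_sensor_points = Nx * (Ny - 9);            % points selected by mask(:, 10:end) = 1
num_fields = 6;                               % assumed: ux, uy and four split fields
bytes_per_value = 4;                          % single precision

total_bytes = num_sensor_points * num_time_points * num_fields * bytes_per_value;
fprintf('Estimated sensor storage: %.1f GB\n', total_bytes / 1e9);
```

&#60;p&#62;Under these assumptions the estimate grows linearly with t_end and with the number of sensor points, so it does not include the field variables or any temporary arrays that the solver also keeps on the GPU.&#60;/p&#62;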
&#60;p&#62;%% Run simulation&#60;br /&#62;
input_args = {'PMLAlpha', 2, 'PlotPML', false, 'PMLInside', false, ...&#60;br /&#62;
              'DisplayMask', 'off', 'DataCast', DATA_CAST};&#60;/p&#62;
&#60;p&#62;sensor_data = pstdElastic2D(kgrid, medium, source, sensor, input_args{:});&#60;/p&#62;
&#60;p&#62;ux_s = gather(reshape(sensor_data.ux_split_s, Nx, Ny-9, []));&#60;br /&#62;
ux_p = gather(reshape(sensor_data.ux_split_p, Nx, Ny-9, []));&#60;br /&#62;
ux_total = gather(reshape(sensor_data.ux, Nx, Ny-9, []));&#60;/p&#62;
&#60;p&#62;dinf.dx = dx;&#60;br /&#62;
dinf.dy = dy;&#60;br /&#62;
dinf.dt = kgrid.dt;&#60;/p&#62;
&#60;p&#62;%% Visualization&#60;br /&#62;
figure&#60;br /&#62;
xx = dinf.dx * (0:size(ux_total,1)-1);&#60;br /&#62;
yy = dinf.dy * (0:size(ux_total,2)-1);&#60;br /&#62;
for ii = 1:500:20000&#60;br /&#62;
    im = real(squeeze(ux_total(:,:,ii)));  % total x-component of the particle velocity&#60;br /&#62;
    imagesc(yy(10:end),xx, im);&#60;br /&#62;
    xlabel('y (m)'); ylabel('x (m)');  % imagesc puts the second array dimension (y) on the horizontal axis&#60;br /&#62;
    axis image;&#60;br /&#62;
    clim([-1 1]);&#60;br /&#62;
    title(['Frame ' num2str(ii)]);&#60;br /&#62;
    grid on;&#60;br /&#62;
    colormap('jet');&#60;br /&#62;
    colorbar;&#60;br /&#62;
    drawnow&#60;br /&#62;
    pause(0.1)&#60;br /&#62;
end&#60;/p&#62;
&#60;p&#62;% Optional: save&#60;br /&#62;
filename = ['ShearReflection_' num2str(source_freq) 'Hz'];&#60;br /&#62;
save(filename, &#34;ux_total&#34;, &#34;ux_p&#34;, &#34;ux_s&#34;, &#34;dinf&#34;, &#34;source_freq&#34;, '-v7.3');&#60;/p&#62;
&#60;p&#62;energy_t = zeros(1, size(ux_s,3));&#60;br /&#62;
for ii = 1:size(ux_s,3)&#60;br /&#62;
    frame = ux_s(:,:,ii);&#60;br /&#62;
    energy_t(ii) = sum(frame(:).^2);  % proportional to elastic energy&#60;br /&#62;
end&#60;/p&#62;
&#60;p&#62;% 2. Difference between consecutive frames&#60;br /&#62;
diff_energy = zeros(1, size(ux_s,3)-1);&#60;br /&#62;
for ii = 1:(size(ux_s,3)-1)&#60;br /&#62;
    frame_diff = ux_s(:,:,ii+1) - ux_s(:,:,ii);  % renamed to avoid shadowing the built-in diff&#60;br /&#62;
    diff_energy(ii) = sum(frame_diff(:).^2);&#60;br /&#62;
end&#60;/p&#62;
&#60;p&#62;% 3. Plot energy over time&#60;br /&#62;
figure;&#60;br /&#62;
subplot(2,1,1);&#60;br /&#62;
plot(kgrid.t_array, energy_t, 'b-', 'LineWidth', 2);&#60;br /&#62;
xlabel('Time [s]');&#60;br /&#62;
ylabel('Total Shear Energy');&#60;br /&#62;
title('Shear Field Energy vs Time');&#60;br /&#62;
grid on;&#60;/p&#62;
&#60;p&#62;% 4. Plot frame-to-frame difference&#60;br /&#62;
subplot(2,1,2);&#60;br /&#62;
plot(kgrid.t_array(2:end), diff_energy, 'r-', 'LineWidth', 2);&#60;br /&#62;
xlabel('Time [s]');&#60;br /&#62;
ylabel('Frame-to-Frame Energy Change');&#60;br /&#62;
title('Difference Between Consecutive Shear Frames');&#60;br /&#62;
grid on;&#60;/p&#62;
&#60;p&#62;Error output:&#60;br /&#62;
Running k-Wave elastic simulation...&#60;br /&#62;
  start time: 27-May-2025 22:34:42&#60;br /&#62;
  reference sound speed: 1500m/s&#60;br /&#62;
[Warning: Support for ver('distcomp') will be removed in a future release.  Use&#60;br /&#62;
ver('parallel') instead.]&#60;br /&#62;
[&#38;gt; In ver&#38;gt;locGetSingleToolboxInfo (line 283)&#60;br /&#62;
In ver (line 56)&#60;br /&#62;
In verLessThan (line 39)&#60;br /&#62;
In kspaceFirstOrder_inputChecking (line 1306)&#60;br /&#62;
In pstdElastic2D (line 385)&#60;br /&#62;
In sphere (line 104)]&#60;br /&#62;
  dt: 400ns, t_end: 150ms, time steps: 375001&#60;br /&#62;
  input grid size: 60 by 100 grid points (120 by 200mm)&#60;br /&#62;
  maximum supported compressional frequency: 375kHz&#60;br /&#62;
  maximum supported shear frequency: 250Hz&#60;br /&#62;
  expanding computational grid...&#60;br /&#62;
  computational grid size: 100 by 140 grid points&#60;br /&#62;
{Error using gpuArray.zeros&#60;br /&#62;
Out of memory on device. To view more detail about available memory on the GPU,&#60;br /&#62;
use 'gpuDevice()'. If the problem persists, reset the GPU by calling&#60;br /&#62;
'gpuDevice(1)'.&#60;br /&#62;
Error in kspaceFirstOrder_inputChecking&#38;gt;@(sz)gpuArray.zeros(sz,'single') (line 1314)&#60;br /&#62;
                castZeros = @(sz) gpuArray.zeros(sz, 'single');&#60;br /&#62;
Error in kspaceFirstOrder_createStorageVariables (line 336)&#60;br /&#62;
                sensor_data.uy_split_s = castZeros([num_sensor_points, num_recorded_time_points]);&#60;br /&#62;
Error in kspaceFirstOrder_inputChecking (line 1663)&#60;br /&#62;
    kspaceFirstOrder_createStorageVariables;&#60;br /&#62;
Error in pstdElastic2D (line 385)&#60;br /&#62;
kspaceFirstOrder_inputChecking;&#60;br /&#62;
Error in sphere (line 104)&#60;br /&#62;
sensor_data = pstdElastic2D(kgrid, medium, source, sensor, input_args{:});}
&#60;/p&#62;</description>
		</item>

	</channel>
</rss>
