Hello,
I am running the code below on two workstations.
The first is a Windows 10 machine with an ADA A4000 GPU (12 GB of VRAM).
The second is a Linux cluster with two A100 GPUs (48 GB of VRAM each).
On the Windows 10 machine the code runs without problems up to t_end = 0.12 (using 11.48 GB of the 12 GB). On the cluster, however, the largest value that fits is t_end = 0.1, which uses 39.8 GB of 40 GB. What is happening? Does Linux manage VRAM differently? Is there a way to fix this? I installed the binaries in both cases, and the failure occurs in the call to pstdElastic2D(kgrid, medium, source, sensor, input_args{:}).
clc; clear; close all;
gpuDevice(1)
format compact;
set(0,'defaultAxesFontSize',18);
set(groot, 'defaultAxesTickLabelInterpreter','tex');
set(groot, 'defaultLegendInterpreter','tex');
set(0,'defaulttextInterpreter','tex');
DATA_CAST = 'gpuArray-single';
%% Create the computational grid
Nx = 60; % number of grid points in the x direction
Ny = 100; % number of grid points in the y direction
dx = 2e-3; % grid spacing in x [m]
dy = 2e-3; % grid spacing in y [m]
kgrid = kWaveGrid(Nx, dx, Ny, dy);
%% Define medium properties
medium.sound_speed_compression = 1500 * ones(Nx, Ny); % [m/s] realistic compressional speed
medium.sound_speed_shear = 2.5 * ones(Nx, Ny); % [m/s] for gelatin-like shear
medium.density = 1000 * ones(Nx, Ny); % [kg/m^3]
% Add boundary
rigid_thickness = 5;
%AIR
medium.density(:,1:rigid_thickness) = 12;
% medium.density(:,end-rigid_thickness:end) = 1.2;
medium.density(1:rigid_thickness,:) = 12;
medium.density(end-rigid_thickness+1:end,:) = 12; % last rigid_thickness rows (end-rigid_thickness:end would cover one extra row)
medium.sound_speed_shear(:, 1:rigid_thickness) = 1;
% medium.sound_speed_shear(:,end-rigid_thickness+1:end) = 0.1;
medium.sound_speed_shear(1:rigid_thickness,:) = 1;
medium.sound_speed_shear(end-rigid_thickness+1:end,:) = 1;
medium.sound_speed_compression(:, 1:rigid_thickness) = 1500;
% medium.sound_speed_compression(:,end-rigid_thickness+1:end) = 340;
medium.sound_speed_compression(1:rigid_thickness,:) = 1500;
medium.sound_speed_compression(end-rigid_thickness+1:end,:) = 1500;
%TX
medium.density(1:rigid_thickness,40:60) = 12;
medium.sound_speed_shear(1:rigid_thickness,40:60) = 1000;
medium.sound_speed_compression(1:rigid_thickness,40:60) = 1500;
% Attenuation
medium.alpha_coeff_compression = 0.05; % [dB/(MHz^2 cm)]
medium.alpha_coeff_shear = 1000; % [dB/(MHz^2 cm)]
% medium.alpha_power_shear = 1.3;
%% Time array (safe dt based on c_max)
t_end = 0.1; %0.12; % [s] to allow steady-state observation
c_max = max([max(medium.sound_speed_compression(:)), max(medium.sound_speed_shear(:))]);
cfl = 0.3;
kgrid.makeTime(c_max, cfl, t_end);
%% Define source
cx1 = Nx/2;
cy1 = Ny-14;
radius = 10;
plot_disc = true; % logical flag: plot the source disc
disc1 = makeDisc(Nx, Ny, cx1, cy1, radius, plot_disc);
source.u_mask = disc1;
source_freq = 200; % [Hz]
source_mag = 1e-6;
source_signal = source_mag * sin(2 * pi * source_freq * kgrid.t_array);
source.uy = source_signal;
% source_points = find(source.u_mask);
% num_sources = length(source_points);
% source.ux = zeros(length(kgrid.t_array), num_sources);
% for i = 1:num_sources
% source.ux(:, i) = source_signal;
% end
%% Define sensor
mask = zeros(Nx, Ny);
mask(:, 10:end) = 1;
sensor.mask = mask;
sensor.record = {'u', 'u_split_field'};
%% Run simulation
input_args = {'PMLAlpha', 2, 'PlotPML', false, 'PMLInside', false, ...
'DisplayMask', 'off', 'DataCast', DATA_CAST};
sensor_data = pstdElastic2D(kgrid, medium, source, sensor, input_args{:});
ux_s = gather(reshape(sensor_data.ux_split_s, Nx, Ny-9, []));
ux_p = gather(reshape(sensor_data.ux_split_p, Nx, Ny-9, []));
ux_total = gather(reshape(sensor_data.ux, Nx, Ny-9, []));
dinf.dx = dx;
dinf.dy = dy;
dinf.dt = kgrid.dt;
%% Visualization
figure
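xx = dinf.dx * (0:size(ux_total,1)-1); % x positions of the recorded rows [m]
yy = dinf.dy * (9 + (0:size(ux_total,2)-1)); % y positions; the sensor mask starts at column 10 [m]
for ii = 1:500:20000
im = real(squeeze(ux_total(:,:,ii))); % total x-velocity at time step ii
imagesc(yy, xx, im); % columns of im run along y, rows along x
xlabel('y (m)'); ylabel('x (m)');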
axis image;
clim([-1 1]);
title(['Frame ' num2str(ii)]);
grid on;
colormap('jet');
colorbar;
drawnow
pause(0.1)
end
% Optional: save
filename = ['ShearReflection_' num2str(source_freq) 'Hz'];
save(filename, "ux_total", "ux_p", "ux_s", "dinf", "source_freq", '-v7.3');
% 1. Energy of the shear component in each frame
energy_t = zeros(1, size(ux_s,3));
for ii = 1:size(ux_s,3)
frame = ux_s(:,:,ii);
energy_t(ii) = sum(frame(:).^2); % proportional to elastic energy
end
% 2. Difference between consecutive frames
diff_energy = zeros(1, size(ux_s,3)-1);
for ii = 1:(size(ux_s,3)-1)
frame_diff = ux_s(:,:,ii+1) - ux_s(:,:,ii); % renamed to avoid shadowing the built-in diff
diff_energy(ii) = sum(frame_diff(:).^2);
end
% 3. Plot energy over time
figure;
subplot(2,1,1);
plot(kgrid.t_array, energy_t, 'b-', 'LineWidth', 2);
xlabel('Time [s]');
ylabel('Total Shear Energy');
title('Shear Field Energy vs Time');
grid on;
% 4. Plot frame-to-frame difference
subplot(2,1,2);
plot(kgrid.t_array(2:end), diff_energy, 'r-', 'LineWidth', 2);
xlabel('Time [s]');
ylabel('Frame-to-Frame Energy Change');
title('Difference Between Consecutive Shear Frames');
grid on;
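To compare what each machine actually exposes, the memory that MATLAB reports for every GPU could be printed before running the script above. This is only a minimal sketch, assuming the Parallel Computing Toolbox is installed; the names n_gpus and g are illustrative. Note that gpuDevice(idx) selects the device (and resets it if it is already selected), so this is best run in a fresh session.

```matlab
% Print total and available memory for every GPU that MATLAB can see.
% Assumption: Parallel Computing Toolbox is installed; otherwise nothing is printed.
n_gpus = 0;
if exist('gpuDeviceCount', 'file')
    n_gpus = gpuDeviceCount;            % number of GPUs visible to MATLAB
end

for idx = 1:n_gpus
    g = gpuDevice(idx);                 % selects device idx (resets it if already selected)
    fprintf('GPU %d: %s, total %.1f GB, available %.1f GB\n', ...
        idx, g.Name, g.TotalMemory / 1e9, g.AvailableMemory / 1e9);
end
```

Comparing TotalMemory on the cluster with the 48 GB and 40 GB figures quoted above would show which of the two MATLAB really has to work with.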
Error output:
Running k-Wave elastic simulation...
start time: 27-May-2025 22:34:42
reference sound speed: 1500m/s
[Warning: Support for ver('distcomp') will be removed in a future release. Use
ver('parallel') instead.]
[> In ver>locGetSingleToolboxInfo (line 283)
In ver (line 56)
In verLessThan (line 39)
In kspaceFirstOrder_inputChecking (line 1306)
In pstdElastic2D (line 385)
In sphere (line 104)]
dt: 400ns, t_end: 150ms, time steps: 375001
input grid size: 60 by 100 grid points (120 by 200mm)
maximum supported compressional frequency: 375kHz
maximum supported shear frequency: 250Hz
expanding computational grid...
computational grid size: 100 by 140 grid points
{Error using gpuArray.zeros
Out of memory on device. To view more detail about available memory on the GPU,
use 'gpuDevice()'. If the problem persists, reset the GPU by calling
'gpuDevice(1)'.
Error in kspaceFirstOrder_inputChecking>@(sz)gpuArray.zeros(sz,'single') (line 1314)
castZeros = @(sz) gpuArray.zeros(sz, 'single');
Error in kspaceFirstOrder_createStorageVariables (line 336)
sensor_data.uy_split_s = castZeros([num_sensor_points, num_recorded_time_points]);
Error in kspaceFirstOrder_inputChecking (line 1663)
kspaceFirstOrder_createStorageVariables;
Error in pstdElastic2D (line 385)
kspaceFirstOrder_inputChecking;
Error in sphere (line 104)
sensor_data = pstdElastic2D(kgrid, medium, source, sensor, input_args{:});}
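For reference, the size of the recorded sensor data can be estimated by hand. The sketch below assumes that each recorded field is stored as a single-precision array of size [num_sensor_points, Nt], as the castZeros([num_sensor_points, num_recorded_time_points]) call in the trace suggests, and that 'u' together with 'u_split_field' produces six such arrays in 2D (ux, uy and four split components). Both points are assumptions and were not checked against the k-Wave source. The t_end values are the two from the question plus the 150 ms that the log reports.

```matlab
% Rough estimate of the GPU memory needed for the recorded sensor data.
% Assumption: six single-precision arrays of size [num_sensor_points, Nt].
Nx = 60; Ny = 100;                   % grid size from the script
dx = 2e-3;                           % grid spacing [m]
c_max = 1500;                        % maximum sound speed [m/s]
cfl = 0.3;                           % CFL number from the script
dt = cfl * dx / c_max;               % 4e-7 s, matching "dt: 400ns" in the log

num_sensor_points = Nx * (Ny - 9);   % mask(:, 10:end) = 1 gives 5460 points
num_fields = 6;                      % assumed: ux, uy and four split components
bytes_per_value = 4;                 % single precision

t_end_values = [0.10, 0.12, 0.15];   % [s]
% round() guards against floating-point error in the division;
% the result for 0.15 s matches the 375001 time steps in the log.
Nt = round(t_end_values / dt) + 1;
bytes_total = num_fields * num_sensor_points * Nt * bytes_per_value;

for k = 1:numel(t_end_values)
    fprintf('t_end = %.2f s: Nt = %d, sensor data ~ %.1f GB\n', ...
        t_end_values(k), Nt(k), bytes_total(k) / 1e9);
end
```

Under these assumptions the six arrays alone come to roughly 32.8 GB, 39.3 GB and 49.1 GB for the three values. That would be consistent with the failure while allocating sensor_data.uy_split_s in the trace above. It would also put the t_end = 0.12 case well beyond 12 GB, so, if the estimate is right, the Windows run may not be held entirely in dedicated VRAM; that part is a guess and not something the logs confirm.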