Hello,
I have just installed GPUSPH and tried to run some of the test problems in the src/problems folder. However, I get warnings, and the results at the end of the simulations are wrong.
For example, this is what I get for the DamBreak3D problem:
Simulation time t=4.523146e-02s, iteration=186, dt=2.659091e-04s, 84,444 parts (15, cum. 14 MIPPS), maxneibs 122+0
Simulation time t=5.006359e-02s, iteration=208, dt=2.017437e-04s, 84,444 parts (15, cum. 14 MIPPS), maxneibs 124+0
Simulation time t=5.512060e-02s, iteration=231, dt=2.659091e-04s, 84,444 parts (15, cum. 14 MIPPS), maxneibs 124+0
Simulation time t=6.015487e-02s, iteration=251, dt=2.424981e-04s, 84,444 parts (14, cum. 14 MIPPS), maxneibs 124+0
Simulation time t=6.505197e-02s, iteration=271, dt=2.659091e-04s, 84,444 parts (13, cum. 14 MIPPS), maxneibs 125+0
WARNING: at iteration 280 the number of particles changed from 84444 to 84443 for no known reason!
WARNING: at iteration 280, time 0.0671853 particle ID 0 is at indices 0 and 1!
WARNING: at iteration 280, time 0.0671853 particle ID 1 was not found!
Recap of devices after roll call:
- device at index 0 has 84,443 particles assigned and offset 0
Simulation time t=7.002765e-02s, iteration=293, dt=2.659091e-04s, 84,443 parts (15, cum. 14 MIPPS), maxneibs 125+0
WARNING: at iteration 310 the number of particles changed from 84443 to 84442 for no known reason!
WARNING: at iteration 310, time 0.0737294 particle ID 67044 was not found!
Recap of devices after roll call:
- device at index 0 has 84,442 particles assigned and offset 0
Simulation time t=7.523180e-02s, iteration=317, dt=1.200993e-04s, 84,442 parts (16, cum. 14 MIPPS), maxneibs 125+0
WARNING: at iteration 330 the number of particles changed from 84442 to 84439 for no known reason!
Simulation time t=8.005969e-02s, iteration=345, dt=1.771645e-04s, 84,439 parts (16, cum. 14 MIPPS), maxneibs 125+0
WARNING: at iteration 350 the number of particles changed from 84439 to 84434 for no known reason!
WARNING: at iteration 1930 the number of particles changed from 71562 to 71512 for no known reason!
WARNING: current max. neighbors numbers (132 | 0) greater than max possible neibs (127 | 0) at iteration 1930
possible culprit: 1216 (neibs: 132 + 0 | 0)
WARNING: at iteration 1940 the number of particles changed from 71512 to 71487 for no known reason!
WARNING: current max. neighbors numbers (131 | 0) greater than max possible neibs (127 | 0) at iteration 1940
possible culprit: 960 (neibs: 131 + 0 | 0)
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 1941, time 0.330071
Simulation time t=3.300714e-01s, iteration=1,941, dt=2.085889e-04s, 71,487 parts (20, cum. 20 MIPPS), maxneibs 180+0
WARNING: at iteration 1950 the number of particles changed from 71487 to 71441 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 1950, time 0.331993
WARNING: at iteration 1960 the number of particles changed from 71441 to 71387 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 1967, time 0.335032
Simulation time t=3.350320e-01s, iteration=1,967, dt=2.200599e-04s, 71,387 parts (22, cum. 20 MIPPS), maxneibs 180+0
WARNING: at iteration 1970 the number of particles changed from 71387 to 71367 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 1970, time 0.335692
WARNING: at iteration 1980 the number of particles changed from 71367 to 71351 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 1990, time 0.340093
Simulation time t=3.400934e-01s, iteration=1,990, dt=2.200599e-04s, 71,351 parts (20, cum. 20 MIPPS), maxneibs 180+0
WARNING: at iteration 1990 the number of particles changed from 71351 to 71332 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 1990, time 0.340093
WARNING: at iteration 2000 the number of particles changed from 71332 to 71313 for no known reason!
WARNING: at iteration 2010 the number of particles changed from 71313 to 71312 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 2014, time 0.345084
Simulation time t=3.450835e-01s, iteration=2,014, dt=2.200599e-04s, 71,312 parts (21, cum. 20 MIPPS), maxneibs 180+0
WARNING: at iteration 2020 the number of particles changed from 71312 to 71273 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 2020, time 0.346147
WARNING: at iteration 2030 the number of particles changed from 71273 to 71202 for no known reason!
WARNING: particle 0 (id 0, type 0) has NAN position! (nan, nan, nan) @ (0, 0, 0) = (nan, nan, nan) at iteration 2038, time 0.350108
Simulation time t=3.501078e-01s, iteration=2,038, dt=2.200599e-04s, 71,202 parts (20, cum. 20 MIPPS), maxneibs 180+0
Basically, the particles slowly leave the domain: the particle count drops from 84,444 at the start of the log to 71,202 by iteration 2,038.
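To put numbers on the loss, the particle-count warnings can be tallied with a short script. This is only a rough sketch: the regular expression assumes the warning wording shown in the log above, and `summarize_particle_loss` is a hypothetical helper name of my own, not part of GPUSPH.

```python
import re

# Matches warnings of the form seen in the log above, e.g.
# "WARNING: at iteration 280 the number of particles changed from 84444 to 84443 ..."
PATTERN = re.compile(
    r"at iteration (\d+) the number of particles changed from (\d+) to (\d+)"
)


def summarize_particle_loss(log_text):
    """Return (iteration, particles_lost) pairs, one per particle-count warning."""
    events = []
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            iteration, before, after = map(int, match.groups())
            events.append((iteration, before - after))
    return events


if __name__ == "__main__":
    # Three warnings copied from the log above.
    sample = (
        "WARNING: at iteration 280 the number of particles changed from 84444 to 84443 for no known reason!\n"
        "WARNING: at iteration 310 the number of particles changed from 84443 to 84442 for no known reason!\n"
        "WARNING: at iteration 330 the number of particles changed from 84442 to 84439 for no known reason!\n"
    )
    events = summarize_particle_loss(sample)
    for iteration, lost in events:
        print(f"iteration {iteration}: lost {lost} particle(s)")
    print("total lost:", sum(lost for _, lost in events))
```

Running it on the full simulation output instead of the three sample lines would show how the loss rate changes over the run.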
Do you have any idea why this happens? Could it be that my GPU or CUDA installation is not working properly?
Here are the GPU/CUDA details:
CUDA Device Query (Driver API) statically linked version
Detected 1 CUDA Capable device(s)
Device 0: "Tesla V100-PCIE-32GB"
CUDA Driver Version: 11.1
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory: 32510 MBytes (34089730048 bytes)
(80) Multiprocessors, ( 64) CUDA Cores/MP: 5120 CUDA Cores
GPU Max Clock rate: 1380 MHz (1.38 GHz)
Memory Clock rate: 877 Mhz
Memory Bus Width: 4096-bit
L2 Cache Size: 6291456 bytes
Max Texture Dimension Sizes 1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Texture alignment: 512 bytes
Maximum memory pitch: 2147483647 bytes
Concurrent copy and kernel execution: Yes with 7 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 59 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Result = PASS
Thanks for the help.
Regards
Manuel