Safety margin in available memory and max number of particles

During some convergence/performance studies with the DamBreak3D example on a single GPU, I reduced the interparticle distance dp until the resulting executable aborted with the following message:

FATAL: thread 0 needs 120671192 (and up to 120671192) particles, but we can only store 39130316 in 14.87 GiB available of 15.9 GiB total with 508.71 MiB safety margin

Estimated memory consumption: 408B/particle
NOTE: device 0 can allocate 39130316 particles, while the whole simulation might require 120671192

The message is clear and acceptable. It does invite some questions:

  • what is the rationale for maintaining a safety margin for memory consumption?
  • would this safety margin be a benefit to running multi-GPU simulations?
  • is there a possibility to customize the margin size and, as a special case, bring it down to 0 units or 0%?

I have glanced at the options for building executables and for steering them, and I cannot find one that serves this purpose. Perhaps there is some keyword in the source file of the example or, more probably, deeper in the code. Directions and corrections appreciated.

Keeping in mind that even without the safety margin the simulation wouldn’t fit in your case, the main reason for the safety margin was to avoid everything crashing when using a GPU connected to a display where accelerated (OpenGL) graphics was being rendered. In this sense it’s independent from multi-GPU, for which data exchange occurs within the standard particle buffers.

Making the safety margin optional or at least configurable is on the TODO list (see GPUWorker::computeAndSetAllocableParticles). It’s not entirely trivial because you might have a multi-GPU setup where one GPU is attached to the display (and for which you’d want a safety margin) while the others are not (and for them you could set the margin to 0). If we ignore this possible heterogeneity, however, it should be relatively easy to implement a command-line option to make this configurable.
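To illustrate the heterogeneity issue, here is a minimal sketch of a per-device margin choice. It is not GPUSPH code: the DeviceInfo struct, the drivesDisplay flag and the perDeviceMargins function are hypothetical names made up for this example, and how one would actually detect a display-attached GPU is left open.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one GPU; not a GPUSPH type.
struct DeviceInfo {
    std::uint64_t totalBytes;  // total device memory in bytes
    bool drivesDisplay;        // assumed to be known or user-supplied
};

// Reserve totalBytes / divisor on display-attached devices, nothing on the others.
std::vector<std::uint64_t> perDeviceMargins(const std::vector<DeviceInfo>& devices,
                                            std::uint64_t divisor = 32) {
    std::vector<std::uint64_t> margins;
    margins.reserve(devices.size());
    for (const DeviceInfo& d : devices) {
        const bool wantMargin = d.drivesDisplay && divisor != 0;
        margins.push_back(wantMargin ? d.totalBytes / divisor : 0);
    }
    return margins;
}
```

With two 16 GiB devices of which only the first drives a display, this yields a 512 MiB margin for the first and 0 for the second. A single global command-line option would instead apply the same value to every device.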

You can provisionally set safetyMargin = 0 (by default it’s set to 1/32nd of the total device memory) in the function I mentioned, even though of course that doesn’t help when you want 3× more particles than can fit in the device memory :sunglasses:
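The arithmetic behind this can be checked with a small sketch. The helper names below are hypothetical and do not mirror the actual GPUSPH implementation; only the 1/32 default, the 408 B/particle estimate and the particle counts come from this thread.

```cpp
#include <cstdint>

// Default margin as described above: 1/32 of the total device memory.
constexpr std::uint64_t defaultSafetyMargin(std::uint64_t totalBytes) {
    return totalBytes / 32;
}

// Number of particles that fit once the margin is set aside.
constexpr std::uint64_t allocableParticles(std::uint64_t freeBytes,
                                           std::uint64_t marginBytes,
                                           std::uint64_t bytesPerParticle) {
    if (bytesPerParticle == 0 || freeBytes <= marginBytes) {
        return 0;
    }
    return (freeBytes - marginBytes) / bytesPerParticle;
}

// Memory required by a given number of particles.
constexpr std::uint64_t bytesNeeded(std::uint64_t particles,
                                    std::uint64_t bytesPerParticle) {
    return particles * bytesPerParticle;
}
```

At 408 B/particle, the 120671192 particles in the log need 49233846336 bytes, about 45.9 GiB, which is roughly three times the 15.9 GiB on the device. Even a hypothetical card with a full 16 GiB free and a zero margin would hold only 42107522 particles, so removing the margin gains a few percent at best.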