Segmentation fault when running new case

Hello,

I am trying to simulate air flow through the internal cooling channels of a drill bit. Later on I will have to add a liquid phase, but for now I am starting simple with a single phase.
I used Crixus to generate the initial particle sets. The geometry consists of one inlet, one outlet and walls. I then wrote the code for this new case, using ChannelIO as a starting point. But when I execute the case, I soon get a segmentation fault. Here is the execution output:

  • No devices specified, falling back to default (0)…
    GPUSPH version v5.0+73-83070831
    Release version without fastmath for compute capability 7.0
    Chrono : disabled
    HDF5 : enabled
    MPI : disabled
    Catalyst : disabled
    Compiled for problem “DrillOnePhaseStill”
    [Network] rank 0 (1/1), host
    tot devs = 1 (1 * 1)
    Info stream: GPUSPH-6728
    Initializing…
    Reading particle data from the input: ./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.fluid.h5sph
    Reading particle data from the input: ./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.boundary.kent0.h5sph
    Reading particle data from the input: ./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.boundary.kent2.h5sph
    Reading particle data from the input: ./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.boundary.kent1.h5sph
    Water level not set, autocomputed: 0.002481
    Max fall height not set, autocomputed: 0.00501
    Max particle speed not set, autocomputed from max fall: 0.313522
    WARNING: dt 9e-08 will be used only for the first iteration because adaptive dt is enabled
    Expected maximum shear rate: 2.08333e+06 1/s
    dt = 9e-08 (CFL conditions from soundspeed: 9.6e-08, from gravity 0.000699497, from viscosity 0.00012)
    Using problem-set neib list size 640 (safe computed value was 288)
    Using problem-set neib bound pos 511 (safe computed value was 287)
    Brezzi diffusion coefficient = 1.000000e-01
    Artificial viscosity epsilon is not set, using default value: 1.440000e-10
    Using problem dir ./tests/DrillOnePhaseStill_2020-11-25T16h33
    Problem calling set grid params
    Influence radius / neighbor search radius / expected cell side : 0.00024 / 0.00024 / 0.00024
  • World origin: -0.001 , -0.003 , -0.003
  • World size: 0.013 x 0.006 x 0.006
  • Cell size: 0.000240741 x 0.00024 x 0.00024
  • Grid size: 54 x 25 x 25 (33,750 cells)
  • Cell linearization: y,z,x
  • Dp: 6e-05
  • R0: 6e-05
    Generating problem particles…
    WARNING: setting the mass by density can’t work with a point-based geometry without a mesh!
    WARNING: setting the mass by density can’t work with a point-based geometry without a mesh!
    WARNING: setting the mass by density can’t work with a point-based geometry without a mesh!
    WARNING: setting the mass by density can’t work with a point-based geometry without a mesh!
    VTKWriter will write every 0.0005 (simulated) seconds
    HotStart checkpoints every 0.0005 (simulated) seconds
    will keep the last 8 checkpoints
    Allocating shared host buffers…
    Numbodies : 0
    Numforcesbodies : 0
    numOpenBoundaries : 2
    allocated 69.66 MiB on host for 913,108 particles (760,922 active)
    Copying the particles to shared arrays…

Open boundaries: 2
Fluid: 506373 parts, mass 2.5488e-13
Boundary: 169691 parts, mass 0
Testpoint: 0 parts
Tot: 760922 particles

RB First/Last Index:
Preparing the problem…

  • device at index 0 has 760,922 particles assigned and offset 0
    Integrator predictor/corrector instantiated.
    Starting workers…
    number of forces rigid bodies particles = 0
    thread 0x7fc7817db700 device idx 0: CUDA device 0/1, PCI device 0000:3b:00.0: Tesla V100-PCIE-32GB
    Device idx 0: free memory 32189 MiB, total memory 32510 MiB
    Estimated memory consumption: 1404B/particle
    Device idx 0 (CUDA: 0) allocated 0 B on host, 1.18 GiB on device
    assigned particles: 760,922; allocated: 913,108
    GPUSPH: initialized
    Performing first write…
    Letting threads upload the subdomains…
    Thread 0 uploading 760922 Position items (11.61 MiB) on device 0 from position 0
    Thread 0 uploading 760922 Velocity items (11.61 MiB) on device 0 from position 0
    Thread 0 uploading 760922 Info items (5.81 MiB) on device 0 from position 0
    Thread 0 uploading 760922 Hash items (2.9 MiB) on device 0 from position 0
    Thread 0 uploading 760922 Next generated ID items (2.9 MiB) on device 0 from position 0
    Segmentation fault (core dumped)

To generate the code for this case, I just modified a few things in ChannelIO.cu and added a few more lines. Here is the code:

DrillOnePhaseStill::DrillOnePhaseStill(GlobalData *_gdata) : Problem(_gdata)
{
SETUP_FRAMEWORK(
kernel,
formulation<SPH_F1>,
densitydiffusion,
rheology,
turbulence_model,
computational_visc,
visc_model,
visc_average,
boundary<LJ_BOUNDARY>,
periodicity<PERIODIC_NONE>,
add_flags<ENABLE_DTADAPT | ENABLE_INLET_OUTLET>
);

//Parameters
m_name = "DrillOnePhaseStill";
set_smoothing(2.0f);//1.3f
set_deltap(0.00006f);
physparams()->r0 = m_deltap;
physparams()->dcoeff = 0.05f;
resize_neiblist(512, 128);
simparams()->buildneibsfreq = 1;
simparams()->dt = 0.00000009f;
simparams()->dtadaptfactor = 0.2f;
simparams()->tend = 0.005f;
simparams()->densityDiffCoeff = 0.1;
set_gravity(0.0, 0.0, -9.81);
add_writer(VTKWRITER, 5.0e-4f);
size_t air = add_fluid(1.18);
set_equation_of_state(air, 1.4f, 250.0f);
set_kinematic_visc(air, 0.000015f);
set_artificial_visc(0.1f);//0.2f


//Geometry
m_origin = make_double3(-0.001, -0.003, -0.003);
m_size = make_double3(0.013, 0.006, 0.006);

// fluid
addHDF5File(GT_FLUID, Point(0,0,0), "./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.fluid.h5sph", NULL);

// main container
GeometryID container =
	addHDF5File(GT_FIXED_BOUNDARY, Point(0,0,0), "./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.boundary.kent0.h5sph", NULL);
disableCollisions(container);

// inflow
GeometryID inlet =
	addHDF5File(GT_OPENBOUNDARY, Point(0,0,0), "./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.boundary.kent2.h5sph", NULL);
disableCollisions(inlet);
setVelocityDriven(inlet, 1);

// outflow
GeometryID outlet =
	addHDF5File(GT_OPENBOUNDARY, Point(0,0,0), "./data_files/DrillOnePhaseStill/0.DrillOnePhaseStill.boundary.kent1.h5sph", NULL);
disableCollisions(outlet);
setVelocityDriven(outlet, 0);

}

__device__
void
DrillOnePhaseStill_imposeBoundaryCondition(
const particleinfo info,
const float3 absPos,
float waterdepth,
const float t,
float4& vel,
float4& eulerVel,
float& tke,
float& eps)
{
// Default value for eulerVel
// Note that this default value needs to be physically feasible, as it is used in case of boundary elements
// without fluid particles in their support. It is also possible to use this default value to impose tangential
// velocities for pressure outlets.
eulerVel = make_float4(0.0f, 0.0f, 0.0f, 0.0f);
vel = make_float4(0.0f);
tke = 0.0f;
eps = 0.0f;

// open boundary conditions
if (IO_BOUNDARY(info)) {

	if (!VEL_IO(info)) {
		const float pressure = 101325;
		eulerVel.w = RHO(pressure, fluid_num(info));
	} else {
		const float U = -10.0;
		eulerVel.x = U;
	}
}

}

I tried to dig into the code to see what causes the crash. I found that execution does not get through this line in GPUSPH::runSimulation():

gdata->nextCommand = IDLE;

nor does it get through the following lines:
gdata->threadSynchronizer->barrier(); // end of UPLOAD, begins SIMULATION ***
gdata->threadSynchronizer->barrier(); // unlock CYCLE BARRIER 1

However, I couldn’t figure out why this happens. Do you have any idea? Am I setting something wrong in the code?

Thank you very much for your help.

Regards

Manuel

Hello @Manuel, I’m actually surprised that the test case compiles, since open boundaries aren’t currently supported with LJ boundary conditions, but maybe we forgot one of the options compatibility tests.

Concerning the segmentation fault, it's possible that it happens in a different thread than the main one. Can you run GPUSPH under gdb and, at the SIGSEGV, issue the gdb command thread apply all bt? This should let you know where each thread is at the moment of the segfault. Remember to rebuild everything with the -g option, which you can do by creating Makefile.local (or editing it, if it already exists), adding a line with

CPPFLAGS+=-g

and rebuilding (make clean; make).

Hello @giuseppe.bilotta,

thank you very much for your reply.
I ran GPUSPH under gdb. This is the output:

Letting threads upload the subdomains…
Thread 0 uploading 760922 Position items (11.61 MiB) on device 0 from position 0
Thread 0 uploading 760922 Velocity items (11.61 MiB) on device 0 from position 0
Thread 0 uploading 760922 Info items (5.81 MiB) on device 0 from position 0
Thread 0 uploading 760922 Hash items (2.9 MiB) on device 0 from position 0
Thread 0 uploading 760922 Next generated ID items (2.9 MiB) on device 0 from position 0

Thread 2 “DrillOnePhaseSt” received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffece32700 (LWP 22612)]
0x00005555555dba79 in GPUWorker::uploadNumOpenVertices (this=0x555555b86f80) at src/GPUWorker.cc:3301
3301 uploadNumOpenVertices();
(gdb) thread apply all bt

Thread 4 (Thread 0x7fffcffff700 (LWP 22624)):
#0 0x00007ffff656fcb9 in __GI___poll (fds=0x7fffc4000ca0, nfds=15, timeout=100) at ../sysdeps/unix/sysv/linux/poll.c:29
#1 0x00007fffe7040a93 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007fffe70d693a in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007fffe7043118 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff73e56db in start_thread (arg=0x7fffcffff700) at pthread_create.c:463
#5 0x00007ffff657c71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7fffe0e17700 (LWP 22623)):
#0 0x00007ffff657e0c7 in accept4 (fd=11, addr=..., addr_len=0x7fffe0e16b38, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
#1 0x00007fffe7041a5a in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007fffe70335bd in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007fffe7043118 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff73e56db in start_thread (arg=0x7fffe0e17700) at pthread_create.c:463
#5 0x00007ffff657c71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fffece32700 (LWP 22612)):
#0 0x00005555555dba79 in GPUWorker::uploadNumOpenVertices (this=0x555555b86f80) at src/GPUWorker.cc:3301
#1 GPUWorker::simulationThread() () at src/GPUWorker.cc:3301
#2 0x00007ffff6ed2d80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff73e56db in start_thread (arg=0x7fffece32700) at pthread_create.c:463
#4 0x00007ffff657c71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7ffff7fc3080 (LWP 22608)):
#0 0x00007ffff73ebad3 in futex_wait_cancelable (private=, expected=0, futex_word=0x555555b86f58) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1 __pthread_cond_wait_common (abstime=0x0, mutex=0x555555b86f08, cond=0x555555b86f30) at pthread_cond_wait.c:502
#2 __pthread_cond_wait (cond=0x555555b86f30, mutex=0x555555b86f08) at pthread_cond_wait.c:655
#3 0x00007ffff6ecd18c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00005555555cde70 in Synchronizer::barrier (this=0x555555b86f00) at src/Synchronizer.cc:53
#5 0x0000555555632701 in GPUSPH::runSimulation() () at src/GPUSPH.cc:742
#6 0x0000555555612be9 in simulate(GlobalData*, RunMode) () at src/main.cc:375
#7 0x0000555555582da8 in main () at src/main.cc:471
#8 0x00007ffff647cbf7 in __libc_start_main (main=0x555555582500 , argc=1, argv=0x7fffffffe2a8, init=, fini=, rtld_fini=, stack_end=0x7fffffffe298)
at ../csu/libc-start.c:310
#9 0x0000555555583a9a in _start () at src/GPUWorker.cc:3370

Is this caused by the incompatible choice of boundary model and open boundary conditions? Or does the problem lie somewhere else?
I tried to use LJ boundary conditions in order to avoid (or at least reduce) wall penetration problems. I presume that MK boundary conditions aren't currently compatible with open boundaries either, right?

Thank you for your help.

Regards

Manuel

I am wondering whether the problem has something to do with the STL files, since the crash happens in the uploadNumOpenVertices() function.
When I export the meshes as STL files, Salome shows the following message:

“During export mesh with name “Mesh_1” to STL Edges will be missed.
Do you want to continue?”

Then I press Yes. Does Crixus need the Edges?

Thank you.

Regards

Manuel

Hello @Manuel,

yes, the segmentation fault is most likely related to the fact that open boundaries are not supported (in master and next) with any boundary model other than SA_BOUNDARY. We are currently working on a more general implementation that would support other boundary models, but I’m afraid it is not ready for inclusion in the main branches yet. There is however a bug in the sense that the build process should have failed due to incompatibility of the options. I’ll look into that.
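
If/when you switch to SA, the change in your problem is essentially the boundary selector in SETUP_FRAMEWORK, along these lines (just a sketch showing only the relevant selectors; keep the rest of your options as they are):

SETUP_FRAMEWORK(
	boundary<SA_BOUNDARY>,
	periodicity<PERIODIC_NONE>,
	add_flags<ENABLE_DTADAPT | ENABLE_INLET_OUTLET>
);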

I’m afraid I don’t know the answer to the SALOME/Crixus issue.

Hello @giuseppe.bilotta,

thank you for your reply.
I have tried replacing LJ_BOUNDARY with SA_BOUNDARY. However, when I then try to run the simulation, I get the following error:

Allocating shared host buffers…
Numbodies : 0
Numforcesbodies : 0
numOpenBoundaries : 2
allocated 125.4 MiB on host for 913,108 particles (760,922 active)
Copying the particles to shared arrays…

Fixing connectivity…FATAL: connectivity: particle id 588053 index 588053 loaded from HDF5 points to non-existing vertices (506361,506412,506404)!

It seems that the cause of this problem lies in the h5sph files. Should I have changed any Crixus setting in order to run a simulation with SA_BOUNDARY? I couldn't find any specific reference to this point in the Crixus documentation.

Thank you for your help.

Regards

Manuel

Hello @Manuel,

I’m afraid I don’t know enough about Crixus to help you in this. You could try having a look at the Crixus .ini file used to generate the CompleteSaExample, available as an attachment to issue 46 on GitHub, to see how to configure Crixus for open boundaries.

Hello @giuseppe.bilotta,

thank you very much for your suggestion, and sorry for the late reply.
Last week I had little time to work on GPUSPH due to other duties.

Your suggestion to have a look at the issue on GitHub was helpful: it pointed me in the right direction to spot where the problem was. I hadn't understood that, before running Crixus, all the triangles have to pass the test performed by the "test-triangle-size" tool; I thought it was only a recommendation, not a requirement. After slightly increasing the particle distance so that all triangles pass the test, the connectivity problem no longer appears and the GPUSPH simulation starts.

However, a new problem has appeared: the simulation crashes after one time step with extremely high velocities. By analysing the first output, I noticed some issues with the boundary and initial conditions.
These are my boundary conditions:

// open boundary conditions
if (IO_BOUNDARY(info)) {

	if (!VEL_IO(info)) {
		eulerVel.w = 1.18;
	} else {
		const float U = -10.0;
		eulerVel.x = U;
	}
}

As for the initial conditions, I didn’t set anything.
The output pressure field at time 0 shows a hydrostatic profile with a maximum value of roughly 500, but at the next time step the pressure field has a value of 100000 at the outlet. This made me think that I should probably use the "initializeParticles" method to set up the initial pressure/density fields appropriately, overwriting the hydrostatic initialization. However, I didn't manage to use this function correctly. I had a look at other examples which use it, but somehow it doesn't work in my case. This is how I coded the function:

void DrillOnePhaseStill::initializeParticles(BufferList &buffer, const uint numParticle)
{
float *effpres = buffer.getData<BUFFER_EFFPRES>();

for (uint i = 0 ; i < numParticle ; i++) {
	effpres[i] = 100000.f;
}

}

Does “EffPress” represent the fluid pressure?

Thank you very much for your help.

Regards

Manuel

Hello @Manuel,

I’m glad that things are getting closer to a functioning condition now!

Concerning the particle initialization, the effective pressure stored in EFFPRES is not the particle pressure, but an effective pressure used for the granular viscous model.

Since in GPUSPH we use the standard weakly-compressible SPH formulations, the pressure is not set directly, but by setting the particle density according to the inverse equation of state. The density in GPUSPH is stored in the 4th component of the velocity (BUFFER_VEL), and you can see an example of it being set in TurbulentPoiseuilleFlowSA, LockExchange and other problems.
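
As a minimal sketch (to be adapted to your case) of what that could look like in your problem, using the same override you wrote above but operating on BUFFER_VEL:

void DrillOnePhaseStill::initializeParticles(BufferList &buffers, const uint numParticles)
{
	// the density is stored in the 4th (.w) component of the velocity buffer
	float4 *vel = buffers.getData<BUFFER_VEL>();

	for (uint i = 0; i < numParticles; i++)
		vel[i].w = atrest_density(0); // value corresponding to the rest density of fluid 0
}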

Another thing to keep in mind is that to stay in a weakly compressible regime, the sound speed must be high enough compared to the velocity of the particles, in order to avoid shocks. This has two requirements: one, your initial setup must not have incoming shocks (e.g. still water, but a high inflow velocity right from the start); two, the sound speed must be properly configured. If you do not set the sound speed explicitly, GPUSPH will compute it from the maximum free-fall velocity according to the initial configuration. If you expect velocities higher than that, you should inform GPUSPH by using setMaxParticleSpeed(fluidnumber, velocity) (e.g. setMaxParticleSpeed(0, 10.0) or something similar in your case).
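
For reference, this is a one-liner in the problem constructor; a sketch using your fluid index and the 10 m/s figure above (adjust it to the maximum velocity you actually expect):

// tell GPUSPH the expected maximum particle speed for the fluid,
// so that the auto-computed sound speed is large enough
setMaxParticleSpeed(air, 10.0f);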

You can verify the sound speed used by GPUSPH from the summary.txt file found in the test case output directory, or from the simulation output (message looks like: Speed of sound for fluid 0 auto-computed as c0 = 62).

Hope that helps,

Giuseppe Bilotta

Hello @giuseppe.bilotta,

thank you very much for your suggestions.

I corrected the implementation of the "initializeParticles" function and now the initialization seems to work fine. Here is the implementation:

float4 *vel = buffer.getData<BUFFER_VEL>();
float4 *eulerVel = buffer.getData<BUFFER_EULERVEL>();
particleinfo *info = buffer.getData<BUFFER_INFO>();

for (uint i = 0; i < numParticles; i++) {
	if (FLUID(info[i])) {
		vel[i].x = 0.f;
		vel[i].y = 0.f;
		vel[i].z = 0.f;
	} else if (VERTEX(info[i]) || BOUNDARY(info[i])) {
		eulerVel[i].x = 0.f;
		eulerVel[i].y = 0.f;
		eulerVel[i].z = 0.f;
	}
	vel[i].w = atrest_density(0);
}

The speed of sound was already set to be approximately 10 times larger than the expected maximum fluid velocity. Should it be even larger? However, so far I had not taken incoming shocks into account, so I tried reducing the inlet velocity to 0.1. Even with this small inlet velocity the simulation crashes immediately. The error is the same as before:

Simulation time t=0.000000e+00s, iteration=0, dt=9.000000e-08s, 574,553 parts (0, cum. 0 MIPPS), maxneibs 288+75
FATAL: timestep 7.48238e-08 under machine epsilon at iteration 1 - requesting quit…
Simulation time t=9.000000e-08s, iteration=1, dt=7.482384e-08s, 574,553 parts (0.94, cum. 0.94 MIPPS), maxneibs 288+75
Elapsed time of simulation cycle: 0.61s
Peak particle speed was ~67.5168 m/s at 9e-08 s -> can set maximum vel 74 for this problem
Simulation end, cleaning up…
Deallocating…

Do you have any idea where the problem could come from?
I even tried a different particle preprocessing, using the Mefisto meshing algorithm in Salome (before I was using Netgen 1D-2D), but I get the same error again.

Thank you for your help.

Best regards

Manuel

Very odd. One thing you can try is to run with the option --debug forces,inspect_preforce,inspect_pregamma. This will dump the particle state before the forces computation and before integration, also adding the forces and the density derivative for each particle to the save file. This should help identify which particles are seeing anomalous forces, velocities or densities, and thus help track down the issue.

One thing to keep in mind is that we don’t store the actual particle density in .w, but the “relative density variation”, which is (rho/rho0) - 1. So when imposing pressure (i.e. density) you need to convert the physical density value to the relative density value, using numerical_density(phys_rho, fluid_num).

So for example if in the pressure inlet that 1.18 is the physical density that would give you the required pressure, you would do eulerVel.w = numerical_density(1.18, fluid_num(info)) (or use the RHO() function to compute the value directly from the pressure).
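
In your snippet above, that would look something like this (a sketch; the numbers are just the ones you already used):

if (IO_BOUNDARY(info)) {
	if (!VEL_IO(info)) {
		// pressure-driven boundary: impose the (relative) density that gives the desired pressure
		eulerVel.w = numerical_density(1.18f, fluid_num(info));
		// or, equivalently, directly from the pressure:
		// eulerVel.w = RHO(101325.0f, fluid_num(info));
	} else {
		// velocity-driven boundary: impose the velocity
		eulerVel.x = -10.0f;
	}
}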

Hello @giuseppe.bilotta,

thanks for the suggestions.
After various attempts I managed to run the case by changing the density diffusion from Brezzi to Ferrari and setting the length scale to 0.05. However, if I then try to run a simulation with a higher inlet velocity, the same problem as before appears again, regardless of the value of the Ferrari length scale. Might there be some issue with the stabilization of the simulation? I tried to use a viscosity model other than DYNAMICVISC, but ARTVISC, KEPSVISC and SPSVISC do not seem to be compatible with the other settings and compilation fails with errors. Is there a way to adopt a different viscosity model in combination with SA_BOUNDARY and INLET_OUTLET?

Thank you for the help.

Regards

Manuel

Hello @Manuel, and sorry for the late reply.

Concerning the viscous models, KEPSVISC should work with SA and open boundaries. However, I still suspect that there might be some initialization issue with the mesh that is causing the instability. Did you try running with the debug options I suggested? It may help you identify the area of the domain where the shock is occurring.