Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behaviour between placed and instantiated RigidBody3D #857

Closed
MemoriesIn8bit opened this issue May 17, 2024 · 22 comments · Fixed by #865
Closed

Inconsistent behaviour between placed and instantiated RigidBody3D #857

MemoriesIn8bit opened this issue May 17, 2024 · 22 comments · Fixed by #865
Labels
enhancement New feature or request topic:runtime Concerning runtime behavior (or its source code)

Comments

@MemoriesIn8bit
Copy link

Hello there!

I've been tinkering with my project and I noticed something interesting:
A Rigidbody that is placed manually into a scene will behave ever so slightly different than one which was instantiated during runtime (even if during _ready of the main node).

I was able to isolate two specific cases (there might be more, but I can't be sure):

  1. A sphere drops on a box. The manually placed sphere will rest still. The instantiated one will roll off the box. This seems to be related to shape margins. If the shape margins of the box are disabled, both spheres will remain still.
  2. A sphere drops on a box which is slightly angled (0.1 deg on Z axis in my case). The manually placed sphere will roll slightly slower than the instantiated one (shape margins were disabled on both boxes). I am not sure if one of them ignores friction or damping values - this is really hard to nail down

I have created a small project which I attached below. Opening that and just running the "main" scene should show you both issues I listed above. I hope this helps.

Now this could be entirely an issue with Godot itself, but disabling Jolt and switching back to Godot Physics yields consistent behaviors (you can try that in the provided project). But maybe that also is because GodotPhysics is a bit too.... rudimentary. I am not sure. But if this is the case, I am sorry.

All these effects were observed with Godot 4.2.2 using Jolt 0.12.0 and using Windows.

Rigidbody-placed-vs-instantiated.zip

@github-actions github-actions bot added the needs triage Something that needs investigation label May 17, 2024
@mihe mihe added bug Something that isn't working as intended topic:runtime Concerning runtime behavior (or its source code) severity:minor Small impact on functionality or usability and removed needs triage Something that needs investigation labels May 17, 2024
@mihe mihe changed the title Inconsistent behaviour between placed and instantiated Rigidbodies. Inconsistent behaviour between placed and instantiated RigidBody3D May 17, 2024
@mihe
Copy link
Contributor

mihe commented May 18, 2024

That is very strange.

I'm able to reproduce it in Jolt's Samples application as well, with the help of its snapshot feature.

What's even stranger there is that if I reverse the order by which the bodies get added to the scene then the behavior flips between the sets of bodies. That would seem to suggest that the issue is not with the values of the bodies, but perhaps something within Jolt's simulation instead.

But at the same time, when there are multiple bodies of each "type", then the behavior is consistent between the ones that are pre-placed versus the ones that are dynamically instantiated, which would suggest that there is some difference between them still, and that the fault is likely on my end, but I can't for the life of me spot what that difference would be.

I'll take another look tomorrow, but if you're able to take a look at the snapshot, @jrouwe, I'd appreciate another pair of eyes on this. You'll find the snapshot here: jolt_snapshot_2024-05-18_01-26-58_1044.zip, which is from this Godot project, that I reduced further in order to eliminate some possibilities: PlacedVersusInstantiated.zip

What you see in that snapshot should rightfully be identical spheres dropping down onto identical (slightly tilted) boxes, where two of them were added one way and the other two added another way, but they end up rolling away at different rates, but consistent between the way that they were added to the scene in Godot. It's very strange.

If you add std::reverse(scene->GetBodies().begin(), scene->GetBodies().end()) just before the PhysicsScene::CreateBodies call in LoadSnapshotTest you'll see that the behavior flips between the two sets of dynamic bodies.

That snapshot is taken right after the dynamically instantiated bodies were added to the scene, which happens before the first update of the JPH::PhysicsSystem, so the simulation shouldn't have been stepped yet, if that matters at all.

EDIT: If you want to make your own snapshot, you'll need #858 first, as snapshots had apparently regressed a bit.

@MemoriesIn8bit
Copy link
Author

Thank you for taking a look at this. Took me the better part of ta day to get a reliable repro of this and I am still not sure if this is really a Jolt or Godot problem. I wish I could have provided more info, but without recompiling the engine (I REALLY have to set an environment where I can do this) and setting some custom readouts, I can't even figure out what exactly is responsible for these variations.

@mihe
Copy link
Contributor

mihe commented May 18, 2024

For what it's worth, I don't expect people to understand where this extension ends and Jolt itself begins, much less debug the actual code itself. Just reporting discrepancies like this here with a concise repro (which I appreciated) is perfectly fine, and it'll make its way upstream if it needs to.

@jrouwe
Copy link
Contributor

jrouwe commented May 18, 2024

You found a very interesting problem. The reason why there is a difference between these spheres is that Jolt uses the BodyID's of two colliding bodies to consistently order them for the solver:

image

So you can see that due to the order the bodies are created in, two of the pairs are solved as (Box, Sphere) and the other two are solved (Sphere, Box) which results in the normals being in opposite directions. Normally this should only result in minute differences but in this particular case the difference is quite large which suggests that there is some sort of bug happening here. I can see that the problem goes away when I disable the contact cache and I have a hunch as to what may be going on. In any case I think this is a problem that I need to investigate further and not @mihe.

@mihe
Copy link
Contributor

mihe commented May 18, 2024

Great, thank you for taking a look!

Note that I left out the first test case that @MemoriesIn8bit provided when generating that snapshot, just to narrow things down a bit for myself, since I'm assuming it's the same root cause for both. I'll be sure to double-check if/when you're able to find a fix for that one.

@mihe mihe added the external Involving an external party label May 18, 2024
@jrouwe
Copy link
Contributor

jrouwe commented May 19, 2024

I figured out what the issue is: The contact cache (which stores previous frames collisions to reduce the amount of collision detection work for the next frame), uses a distance and rotation check to see if the bodies moved too much relative to each other to reuse last frame's result. It stores the distance between the bodies in the local space of the body with the lowest ID, this means that for the (Box, Sphere) pairs it is stored in the local space of the box and for the (Sphere, Box) pairs it is stored in the local space of the sphere. Since the sphere is tiny (2.5 cm radius) it easily picks up angular velocity and when the distance is stored local to the sphere the contact cache is essentially never used in this scene. When it is stored relative to the box, the check passes and the previous frame results are used for a very short while until the sphere picks up more speed and then it also stops using the contact cache. This is enough to create a difference between the two simulations. If I hack the code so that it always does the calculations relative to the same object, the simulations become the same (regardless of which object I pick).

Next step is thinking about a fix which is not easy since the contact cache is the hottest path in the simulation so anything I add there will cause a slowdown.

@jrouwe
Copy link
Contributor

jrouwe commented May 19, 2024

So I have a fix that fixes the provided snapshot:

jrouwe/JoltPhysics@master...bugfix/contact_cache

basically instead of checking the delta position in local space, I check in world space which means it doesn't matter in which order the bodies are anymore. This is cheaper but obviously fails to detect if two bodies are rotating together while they're touching (which is a rare case and doesn't happen in my test scenarios). In the performance test the contact cache hit efficiency drops a little bit but the cheaper check seems to make up for this so the performance ends up being roughly the same. I'm not super happy with it, but haven't found something better yet.

So then I went on to the original repro in godot to check if my fix fixes the issue there too and unfortunately it doesn't and I don't know why yet...

@MemoriesIn8bit
Copy link
Author

@jrouwe Thank you so much for looking into this. I must admit, while I try to keep up, some of the more technical details escape me. I understand the repro case I made is by all means an edge case, which should not have much of an impact, but it does due to the spheres being extremely tiny. Again, I am not sure if I am getting this right, but if this essentially boils down to me just operating below the limits of what Jolt was designed for, I feel bad for bothering you with this. Not to mention, in the project I am working on, all dynamic objects are instantiated - and with that, I can try as I might and fail to ever see this issue (although I understand it might still happen). But if it weren't for me trying to debug something with manually placed objects, I would have never found this. So I wasn't really counting on a quick fix anyway. Especially if it ends up being my fault. :)

@mihe
Copy link
Contributor

mihe commented May 19, 2024

So then I went on to the original repro in godot to check if my fix fixes the issue there too and unfortunately it doesn't and I don't know why yet...

Hm, yeah, that's strange. It does fix my reduced repro obviously, but it does indeed not fix the original repro. As far as I can tell the deciding difference between the two is the physics material, more specifically the friction of 0 between the two bodies. Any non-zero friction (even something as small as 0.01) and the two spheres behave the same, but only with your branch applied.

Your branch also doesn't seem to affect that other repro case, the one where the ball starts rolling away on its own on a perfectly flat surface.

Here's a snapshot with both issues, if you need one: jolt_snapshot_2024-05-19_22-43-55_1027.zip

(The spheres sitting at Z=0 are the ones that are supposed to stay still.)

@jrouwe
Copy link
Contributor

jrouwe commented May 20, 2024

I have created another fix, one that involves removing the ordering of the bodies completely:

diff.patch

This fixes the two balls on the 0.1 degree incline and also fixes the ball rolling off the frictionless cube, but unfortunately it means that it starts rotating slowly without moving. Other downsides:

  • OnContactAdded/Persisted/Removed callbacks no longer have the guarantee that body1.GetID() < body2.GetID() which would be a breaking API change
  • It causes issues when switching the motion type of a body (because this may change the ordering)
  • It has issues with inconsistent ordering in the CCD pipeline (fixing that requires additional code in ContactConstraintManager::OnCCDContactAdded).

If this is not blocking @MemoriesIn8bit then I think we should leave this bug for what it is at this moment. The balls are smaller than the recommended minimum size (making them larger seems to reduce the issue a lot) and even if all balls are created after the floor there are still numerical issues based on the distance between the center of mass of the floor and that of the ball which causes the balls to behave differently.

Start of test (balls 2.5 cm radius in increasing distance to the floor center of mass, all created after the floor, 0.1 degree tilt):

start.mp4

After some time:

after_some_time.mp4

Test code:

void SimpleTest::Initialize()
{
	constexpr float cRadius = 0.025f;
	BodyCreationSettings dynamic(new SphereShape(cRadius), RVec3(0, cRadius, 0), Quat::sIdentity(), EMotionType::Dynamic, Layers::MOVING);
	dynamic.mFriction = 1.0f;
	dynamic.mLinearDamping = 0.1f;
	dynamic.mAngularDamping = 0.1f;
	dynamic.mAllowSleeping = false;

	BodyCreationSettings floor(new BoxShape(Vec3(100, 1.0f, 100), 0.0f), RVec3(Vec3(0.0f, -1.0f, 0.0f)), Quat::sRotation(Vec3::sAxisX(), DegreesToRadians(0.1f)), EMotionType::Static, Layers::NON_MOVING);
	floor.mFriction = 1.0f;
	mBodyInterface->CreateAndAddBody(floor, EActivation::DontActivate);

	for (int i = 1; i < 10.0f; ++i)
	{
		dynamic.mPosition.SetX(5.0f * i *  cRadius);
		mBodyInterface->CreateAndAddBody(dynamic, EActivation::Activate);
	}
}

@MemoriesIn8bit
Copy link
Author

MemoriesIn8bit commented May 20, 2024

Hah! I noticed a similar effect in my game as well, but since my game involves a lot of relatively complex concave colliders I assumed it was related to that.

I guess I should iterate on my project once more and increase the size of my dynamic objects. The downside is, in my game, the balls are supposed to be "marbles" (the game involves building marble machines). Scaling everything (including the (VR-)player so this isn't noticeable) by 5 or 10 is possible, but it's going to be a challenge so make the objects behave AS IF they are tiny (scaling up gravity, reducing drag) and I do wonder if that might re-introduce some of these or entirely new issues again.

@jrouwe
Copy link
Contributor

jrouwe commented May 20, 2024

I do wonder if that might re-introduce some of these or entirely new issues again.

That is indeed the big question. If you don't have problems now I would leave things as they are. To a certain extent you can tweak the constants that Jolt uses as tolerance, e.g. in godot-jolt this would be the most important project setting:

image

which in the repro was set to 0.02 = roughly the radius of your marble, so Jolt would be ok with sinking the marble by 2 cm into the floor.

@MemoriesIn8bit
Copy link
Author

MemoriesIn8bit commented May 20, 2024

@jrouwe Yeah this value I am setting extremely low - it's at 0.0001 right now. Otherwise the edges of my marble tracks can easily get the marble to get stuck if the speed and inclination is sufficiently low enough. And reading more I think this may also (at least partially) related to Active Edge angle.
image

I have also experimented with just setting it to 0, but at higher speeds, the simulation tends to become a bit unstable then (i.e. sometimes marbles not being able to resolve collisions anymore at all).

I made an experiment scaling everything up, but failed to notice any significant improvements unless that scale factor was reaching 10. At which point I wasn't sure anymore if I adjusted drag/gravity to "fake" a smaller size properly as everything still moved slower which might just have been the reason for the improvements.

I realize I am just asking for trouble - being below minimum size, using a lot of complex concave colliders (with meshes extruded from bezier curves, that's really the only viable option). And yeah, I do see some downsides - like marbles picking up a lot of speed when on almost flat surfaces, unrealistic depenetration speeds and as a result them never coming to rest unless sleeping is enabled (or drag/damp values so high that they behave more like basketballs which I didn't want). For what it's worth, when this project still existed in Unity (using PhysX), I saw more or less the same results.

I always try to work at accurate sizes but for this I know I am flying really close to the sun. But I might still go back and tinker with larger scales. Worst case I will forgo accurate scale and pretend the game world works on fantasy dimensions.

@jrouwe
Copy link
Contributor

jrouwe commented May 21, 2024

Thinking about this some more: There is actually a relatively simple solution.

The function of the contact cache is to trade simulation accuracy for performance and the default tolerance for the contact cache is 1 mm (if objects move 1 mm relative to each other we recalculate the collision otherwise we re-use the calculated contact point from the previous frame). When your object is a meter, 1 mm is tiny. If the object is 2.5 cm then 1 mm is not so small anymore and the simulation artifacts become visible. If I set settings.mBodyPairCacheMaxDeltaPositionSq = JPH::Square(1.0e-5f); (so 0.01 mm instead of 1 mm) here:

JPH::PhysicsSettings settings;
settings.mBaumgarte = JoltProjectSettings::get_position_correction();
settings.mSpeculativeContactDistance = JoltProjectSettings::get_contact_distance();
settings.mPenetrationSlop = JoltProjectSettings::get_contact_penetration();
settings.mLinearCastThreshold = JoltProjectSettings::get_ccd_movement_threshold();
settings.mLinearCastMaxPenetration = JoltProjectSettings::get_ccd_max_penetration();
settings.mNumVelocitySteps = (JPH::uint)JoltProjectSettings::get_velocity_iterations();
settings.mNumPositionSteps = (JPH::uint)JoltProjectSettings::get_position_iterations();
settings.mMinVelocityForRestitution = JoltProjectSettings::get_bounce_velocity_threshold();
settings.mTimeBeforeSleep = JoltProjectSettings::get_sleep_time_threshold();
settings.mPointVelocitySleepThreshold = JoltProjectSettings::get_sleep_velocity_threshold();
settings.mAllowSleeping = JoltProjectSettings::is_sleep_enabled();

Then the godot scene behaves as expected (and we still maintain the contact cache in the case where the bodies are really static).

This suggests that we could simply expose this value to the godot-jolt project settings to allow people to tweak the contact cache behavior. If we do this, we should probably also expose mBodyPairCacheCosMaxDeltaRotationDiv2.

@mihe what do you think? (I can make the change if you want)

@MemoriesIn8bit
Copy link
Author

I for one would be happy to test such a change out on my project and report back on how it works. I'm no stranger to compiling GodotJolt from source. :)

@mihe
Copy link
Contributor

mihe commented May 21, 2024

@mihe what do you think? (I can make the change if you want)

It's no problem at all for me to add it, besides trying to actually understand what I'm adding. 😅

Feel free to give #865 a try, @MemoriesIn8bit.

You might also want to take a look at it, @jrouwe, just to make sure I'm not misunderstanding anything.

@MemoriesIn8bit
Copy link
Author

Alright, it's a bit late, so my testing is limited, but I gave it a quick go. Using the - I think - proposed settings of @jrouwe :
image
And the results look very promising so far.

Placed and instantiated objects behave extremely similar to one another. Mostly they will not roll at all, but depending on where exactly I place them on the box collider they may start to roll very slowly - which I think however is related to where exactly they are placed on the collider.
https://github.com/godot-jolt/godot-jolt/assets/166986685/a8c0ab2d-14b2-4ade-a0b2-1e8db751850c

Trying again on my test project yields similar results: On the angled boxes, the balls roll in perfectly similar speeds. While on the boxes that are perfectly level, the placed ball will rest while the instantiated one rolls on its own very slowly.
https://github.com/godot-jolt/godot-jolt/assets/166986685/aee9c014-4fab-4c57-9980-16ab02db271e
Decreasing Body Pair Cache Distance/Angle Threshold (0.0001 m and 0.2 deg) reduces this even further until the marble will not roll at all.

If I understand correctly from earlier posts, some minuscule differences are to be expected. Honestly, for me, this is more than I expected. The issue where the marbles pick up rather impressive speed on completely level surfaces is entirely gone and I look forward to tinker with both the two new options tomorrow to see how it affects my game. :)

@mihe
Copy link
Contributor

mihe commented May 22, 2024

Decreasing Body Pair Cache Distance/Angle Threshold (0.0001 m and 0.2 deg) reduces this even further until the marble will not roll at all.

Even when decreasing the angle like that I'm still seeing the marble-on-level-surface crawl away slowly, albeit slower than the original repro. The only thing that stops it completely for me is setting both of these new settings to zero, which I assume effectively disables the contact cache completely, and which presumably impacts performance in a bad way.

@MemoriesIn8bit
Copy link
Author

MemoriesIn8bit commented May 22, 2024

I think you're right, sorry. Either it was a fluke or I didn't observe the marble long enough to notice.
EDIT: Turns out my repro case had Shape Margins turned OFF which solved the issue before this new experimental fix as well, so yeah, my bad.

I also also don't think Body Pair Cache Distance Threshold at 0.0001 was right. 1.0e-5f is 0.00001 - so I was still off by a factor of 10.

I will say, though: In my project, things have improved drastically just with Body Pair Cache Distance/Angle Threshold 0.0001/0.2. It's hard to describe it without a ton of context, but I get a lot less of marbles developing rather drastic speeds when dropping on level surfaces (which is what tipped me off to the issue in the first place) or bouncing back from an edge connecting two surfaces with ever so slightly different normals (which I didn't even think was related but just a result of my tiny marbles).

So even if this fix isn't perfect, it is a MASSIVE improvement I would never have expected. I can't say if it's 100% perfect, yet - mainly because I still have to tinker and balance things out with these new options given to me (and not really sure if I use them correctly), but I will say that it wasn't really blocking me before, and as it is now, I wouldn't have reported it as I would have assumed it to be float inaccuracies (and again: I am not sure how much tolerance there still mathematically is).

@jrouwe
Copy link
Contributor

jrouwe commented May 22, 2024

The only thing that stops it completely for me is setting both of these new settings to zero, which I assume effectively disables the contact cache completely, and which presumably impacts performance in a bad way.

Yes, that disables it completely but in that case I would use PhysicsSettings::mUseBodyPairContactCache = false instead to avoid the lookup in the contact cache. If you have many simulated but almost static objects then it's definitively going to have a negative impact on performance, but if the marbles are rolling most of the time then the benefit from the contact cache isn't going to be great anyway.

@MemoriesIn8bit
Copy link
Author

This is incredibly informative. And yes, for the most time in this game, most marbles will always be moving and is sitting still, then only temporary - in which case I'll likely just let them sleep. Because they usually have to deal with a lot of contact points on tracks with very little friction (to roll smoothly) so they like to jiggle around when "at rest" - which is usually on an edge between two surfaces (although that got MUCH MUCH better with these new options and I am still tinkering with settings, physics materials, damping, etc. as I go). So sleeping helps a lot with that.

@mihe
Copy link
Contributor

mihe commented May 23, 2024

I'll just go ahead and expose the setting to disable the contact cache entirely, along with those other two. It sounds like disabling it entirely might be the better option for you, and might perhaps be useful/necessary for someone else in the future as well.

@mihe mihe closed this as completed in #865 May 24, 2024
@mihe mihe added enhancement New feature or request and removed external Involving an external party bug Something that isn't working as intended severity:minor Small impact on functionality or usability labels May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request topic:runtime Concerning runtime behavior (or its source code)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants