[Bug] Deadlock with rayon usage #3063

Open
HarukaMa opened this issue Feb 7, 2024 · 2 comments
Labels
bug Incorrect or unexpected behavior

Comments

@HarukaMa
Contributor

HarukaMa commented Feb 7, 2024

🐛 Bug Report

There is a rayon-related deadlock in snarkOS, but I'm not sure which of these situations it actually is:

  1. Using rayon parallel iterators while holding a Mutex or an RwLock write lock (this case). See multiple discussions like this and this.
  2. Using rayon with blocking calls (not sure if spawn_blocking applies here). Maybe see this or this.

I think it's probably the first one: in a core dump taken during the deadlock, I saw a write lock held while the node was stuck waiting on a read lock. Here is the full backtrace of all threads. (It's a large text file, as rayon tends to generate deep stacks. The file is actually a .7z archive but had to be named .zip to upload here.) Notice that thread 69 holds the write lock on vm.process while trying to advance a block, while many other threads are trying to validate incoming unconfirmed transactions and need a read lock.
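To make situation 1 concrete, here is a minimal sketch of the hazard and of one commonly suggested way around it: do the parallel work first, with no lock held, and take the write lock only for the final update. This is not snarkOS code. `State`, `advance`, and the squaring workload are hypothetical, and `std::thread::scope` stands in for rayon's parallel iterators so that the sketch has no external dependencies.

```rust
use std::sync::RwLock;
use std::thread;

/// Hypothetical stand-in for state guarded by a lock such as `vm.process`.
struct State {
    total: u64,
}

/// Does the expensive work in parallel *before* taking the write lock, so the
/// writer never waits on workers that might themselves be waiting on the lock.
fn advance(state: &RwLock<State>, inputs: &[u64]) {
    // Phase 1: parallel computation with no lock held.
    // (Assumption: std scoped threads stand in for a rayon parallel iterator.)
    let partials: Vec<u64> = thread::scope(|s| {
        let handles: Vec<_> = inputs
            .chunks(2)
            .map(|chunk| s.spawn(move || chunk.iter().map(|x| x * x).sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Phase 2: short critical section with no parallel work inside it.
    let mut guard = state.write().unwrap();
    guard.total += partials.iter().sum::<u64>();
}

fn main() {
    let state = RwLock::new(State { total: 0 });

    // The hazard: while a write guard is held, no reader can get in. If the
    // writer then waited on pool workers that are blocked on a read lock,
    // neither side could make progress.
    {
        let _write_guard = state.write().unwrap();
        assert!(state.try_read().is_err());
    }

    advance(&state, &[1, 2, 3, 4, 5]);
    println!("{}", state.read().unwrap().total); // prints 55
}
```

Whether this restructuring is feasible for the block-advancing path depends on how much of that work actually needs exclusive access, which the backtrace alone does not show.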

Steps to Reproduce

Not sure. Run the node with a large number of connections?

Expected Behavior

The node should not deadlock.

Your Environment

@HarukaMa HarukaMa added the bug Incorrect or unexpected behavior label Feb 7, 2024
@ljedrz
Collaborator

ljedrz commented Feb 8, 2024

This one feels like it's going to be tricky, but I'll try to investigate it soon.

@raychu86
Contributor

We did initial passes but were unable to reproduce this. We're putting this at a lower priority, but will keep an eye out and revisit it.
