Models trained in Linux run poorly on Windows and vice-versa. #8903

Open
acidtonic opened this issue May 6, 2024 · 4 comments

Comments

@acidtonic

I am trying to figure out why models trained with the same codebase compiled on Linux run poorly when moved to Windows, and vice versa.

I trained on Ubuntu 22.04 and moved the model to Windows 11 to run the net. Performance is similar at high thresholds, but when I lower the threshold the Windows side starts showing hundreds of boxes all over the screen, while the Linux side does not.

Both builds were compiled from the exact same GitHub commit hash and run on the same hardware (GPU, etc.). Both sides also have the same CUDA SDK version, down to the minor number. I have tried various cards, such as a 2080 Ti, 3090 Ti, and 4080.

Is there any guidance for running models between operating systems? Do I need to adjust something to make this work smoothly?

@Statgator2

I've experienced this as well. I tend to have significantly better net performance in Linux than in Windows.

@acidtonic
Author

I have a feeling it's related to some minor ABI difference affecting the model, but I often see people sharing models with each other, so I'm somewhat confused about what causes this and what I can do to identify or fix it.
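One way to rule out the simplest explanation first, a weights or cfg file that changed in transit, is to hash the files on both machines and compare the digests. This is only a minimal sketch: the function name `sha256_of_file` is hypothetical, and a matching hash would not by itself exclude a difference in how each build reads or runs the model.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large weights files do not need
        # to fit in memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running this on the same weights file on both systems should print identical digests if the file is byte-identical. For text files such as a cfg, a differing digest may simply mean that line endings were converted during transfer.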

@Statgator2

What I have noticed is that Windows will produce hundreds of boxes on the screen simply by lowering the confidence threshold for detections. If you do the same on Linux, Darknet will not produce hundreds of boxes. There might be a couple of extra boxes at lower confidence, but typically not hundreds, not even at 1% confidence. Windows, I feel, does the hundreds-of-boxes thing even at 40% confidence. I find this behavior illogical.

Does anyone know what is causing this net performance difference between Windows and Linux?
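To put numbers on the behavior described above, the detection confidences from each platform could be collected for the same images and counted at several thresholds. The sketch below assumes the confidences have already been gathered into plain Python lists of floats between 0 and 1; `count_above_thresholds` is a hypothetical helper, and the sample values are made up for illustration, not real Darknet output.

```python
def count_above_thresholds(confidences, thresholds):
    """Map each threshold to the number of detections at or above it."""
    return {t: sum(1 for c in confidences if c >= t) for t in thresholds}


# Made-up confidences for illustration only, not real Darknet output.
linux_conf = [0.92, 0.88, 0.41, 0.07]
windows_conf = [0.91, 0.87, 0.45, 0.44, 0.43, 0.42, 0.12, 0.09, 0.05]

thresholds = [0.8, 0.4, 0.01]
linux_counts = count_above_thresholds(linux_conf, thresholds)
windows_counts = count_above_thresholds(windows_conf, thresholds)

# Print the two platforms side by side for each threshold.
for t in thresholds:
    print(f"threshold {t:.2f}: linux={linux_counts[t]} windows={windows_counts[t]}")
```

If the counts agree at high thresholds and diverge only at low ones, as reported in this thread, comparing the raw confidences image by image might help narrow down where the two builds start to differ.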

@stephanecharette
Collaborator

Is this also a problem with the newer fork of Darknet/YOLO? I'm travelling right now and don't have access to my Windows environment to test, but I'd be curious to know if this is a problem with https://github.com/hank-ai/darknet?tab=readme-ov-file#table-of-contents


3 participants