Hi team,
I'm having trouble launching the pretraining job with TensorFlow 2.15 or above. TensorFlow 2.15 segfaults immediately. With the latest TensorFlow 2.16.1, GPU memory usage in one of the data-loading processes grows without bound, up to nearly 100%, leading to CUDA OOM and cascading failures. A quick workaround is to install an older TensorFlow version locally, e.g.:
pip install tensorflow==2.13.1 tensorflow-text==2.13.0
Also works:
pip install tensorflow==2.14.1 tensorflow-text==2.14.0
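Until the root cause is found, a launch script could fail fast when an untested TensorFlow version is installed. The following is only a minimal sketch: the helper names `parse_major_minor` and `is_known_good_tf` are hypothetical and not part of the repository, and the set of "known good" releases simply mirrors the two versions reported to work above.

```python
import re


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '2.14.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_known_good_tf(version: str) -> bool:
    """Return True only for the 2.13.x and 2.14.x series reported to work here.

    Assumption: other releases are treated as untested, not as known broken.
    """
    return parse_major_minor(version) in {(2, 13), (2, 14)}


# Possible usage at the top of a launch script (requires TensorFlow):
#   import tensorflow as tf
#   if not is_known_good_tf(tf.__version__):
#       raise RuntimeError(f"Untested TensorFlow version: {tf.__version__}")
```

The check uses an explicit allow-list instead of an upper bound, since only 2.13.1 and 2.14.1 were actually tried.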