The g_loss value in train_second.py is NaN.
While debugging, I found that the output of model.decoder() is already NaN (lines 391 and 402).
train_first.py ran without any problem, so I don't know why this happens in train_second.py.
If you know how to fix this, please help me.
Thank you.
log_dir: "C:\Users\user_\Desktop\styleTTS2_test_data"
first_stage_path: "first_stage.pth"
save_freq: 2
log_interval: 10
device: "cuda"
epochs_1st: 200 # number of epochs for first stage training (pre-training)
epochs_2nd: 100 # number of epochs for second stage training (joint training)
batch_size: 4
max_len: 200 # maximum number of frames
pretrained_model: "C:\Users\user_\Desktop\styleTTS2_test_data\epoch_1st_00170.pth"
second_stage_load_pretrained: true # set to true if the pre-trained model is for 2nd stage
load_only_params: true # set to true if you do not want to load epoch numbers and optimizer parameters
I have run into this before in a few situations:
The model parameters are not actually being loaded from the checkpoint. There is a naming mismatch involving the "module" prefix between stages 1 and 2, depending on whether you use distributed or non-distributed training. Try setting strict loading to true and see which keys are reported.
Multispeaker is set incorrectly.
Certain batch sizes combined with mixed precision (try changing the batch size).
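The first point can be checked without touching the training script. The sketch below is only an illustration, not code from the repository: `diff_keys` and `remap_state_dict` are hypothetical helpers, and the key names in the usage example are made up. It relies on one known fact, that `torch.nn.DataParallel` and `DistributedDataParallel` prefix parameter names with `module.`. Note also that the posted config sets `second_stage_load_pretrained: true` while `pretrained_model` points at a file named `epoch_1st_00170.pth`; if that is a first-stage checkpoint, a key mismatch of this kind is plausible.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

# Prefix added to parameter names by DataParallel / DistributedDataParallel wrappers.
PREFIX = "module."


def strip_prefix(keys: Iterable[str], prefix: str = PREFIX) -> Set[str]:
    """Return the key names with a leading prefix removed where present."""
    return {k[len(prefix):] if k.startswith(prefix) else k for k in keys}


def diff_keys(model_keys: Iterable[str],
              ckpt_keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Compare key names after normalising the prefix on both sides.

    Returns (missing, unexpected): keys the model has but the checkpoint
    lacks, and keys the checkpoint has but the model lacks.
    """
    model_set = strip_prefix(model_keys)
    ckpt_set = strip_prefix(ckpt_keys)
    return sorted(model_set - ckpt_set), sorted(ckpt_set - model_set)


def remap_state_dict(state_dict: Dict[str, Any],
                     want_prefix: bool) -> Dict[str, Any]:
    """Return a copy whose keys all carry, or all lack, the 'module.' prefix."""
    remapped = {}
    for key, value in state_dict.items():
        bare = key[len(PREFIX):] if key.startswith(PREFIX) else key
        remapped[PREFIX + bare if want_prefix else bare] = value
    return remapped


# Hypothetical key names, for illustration only.
ckpt = {"module.decoder.conv.weight": 1, "module.decoder.conv.bias": 2}
model_keys = ["decoder.conv.weight", "decoder.conv.bias", "predictor.lstm.weight"]

missing, unexpected = diff_keys(model_keys, ckpt)
print(missing)     # ['predictor.lstm.weight']
print(unexpected)  # []
```

With a real model you would pass `module.state_dict().keys()` and the keys of the matching entry in the loaded checkpoint. Any name left in `missing` is a parameter that keeps its random initialisation after a non-strict load, which is one way a decoder can end up producing NaN.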
Decoder config:
decoder:
  type: 'istftnet' # either hifigan or istftnet
  resblock_kernel_sizes: [3,7,11]
  upsample_rates: [10, 6]
  upsample_initial_channel: 512
  resblock_dilation_sizes: [[1,3,5], [1,3,5], [1,3,5]]
  upsample_kernel_sizes: [20, 12]
  gen_istft_n_fft: 20
  gen_istft_hop_size: 5
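As a quick sanity check on these decoder settings, the product of `upsample_rates` and `gen_istft_hop_size` gives the number of waveform samples generated per input frame, which should equal the hop length used to extract the mel spectrograms. The sketch below works through that arithmetic. `total_upsampling` is a hypothetical helper, and the hop length of 300 is an assumption, since the preprocessing settings are not shown in this post.

```python
from math import prod
from typing import Sequence


def total_upsampling(upsample_rates: Sequence[int], istft_hop_size: int) -> int:
    """Waveform samples produced per input frame by an iSTFT-based decoder."""
    return prod(upsample_rates) * istft_hop_size


# Values taken from the decoder config above.
factor = total_upsampling([10, 6], 5)

# ASSUMPTION: mel hop length from preprocessing; not stated in this post.
mel_hop_length = 300

print(factor)                    # 300
print(factor == mel_hop_length)  # True
```

If the two numbers differ, the decoder output and the ground-truth waveform have different lengths, which is worth ruling out before looking further.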