Did you call validation_step during training? #48

Open
Lucca-cherries opened this issue Dec 29, 2023 · 2 comments

Comments

@Lucca-cherries
Lucca-cherries commented Dec 29, 2023

Hi, thanks for your great work! I plan to train your model on a custom dataset. However, when I trained it on a subset of the dataset you specified, I did not see a validation procedure. Is the function validation_step ever called? It also seems that there is no separate validation dataloader, because only one dataloader is passed to trainer.fit:

trainer.fit(model, dataloader)

Then how do you save the model parameters that gave the lowest loss during validation? I ask because I want to save the best model on a validation dataset but have not managed to yet :( I am not very familiar with PyTorch Lightning or with how ControlNet should be trained, so correct me if I said something wrong :)

Btw, why did you repeat some datasets in the ConcatDataset (https://github.com/ali-vilab/AnyDoor/blob/ddcfbafb8fa4f27a2da705a3bcf5bfd2de4fbf98/run_train_anydoor.py#L64C3-L64C3) but not image_data?
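
For readers with the same question, here is a minimal sketch of how validation and best-checkpoint selection are usually wired up in PyTorch Lightning. It is not taken from the AnyDoor training script. It assumes the model's validation_step logs a metric named "val_loss" with self.log; that metric name, the helper fit_with_validation, and the loader arguments are all hypothetical.

```python
# Keyword arguments for pl.callbacks.ModelCheckpoint.
# "val_loss" is an assumed metric name: validation_step must log it via self.log.
BEST_CKPT_KWARGS = dict(
    monitor="val_loss",  # metric to track
    mode="min",          # lower is better
    save_top_k=1,        # keep only the best checkpoint
)


def fit_with_validation(model, train_loader, val_loader, max_epochs=1):
    """Train with a validation loader and return the best checkpoint path."""
    import pytorch_lightning as pl  # imported lazily; requires pytorch_lightning

    best_ckpt = pl.callbacks.ModelCheckpoint(**BEST_CKPT_KWARGS)
    trainer = pl.Trainer(max_epochs=max_epochs, callbacks=[best_ckpt])
    # The second dataloader is the validation set; with it, Lightning
    # runs validation_step during training.
    trainer.fit(model, train_loader, val_loader)
    return best_ckpt.best_model_path
```

Whether the lowest validation loss picks out a good model for this task is a separate question.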

@XavierCHEN34
Collaborator

  1. Validation: No, we did not call validation. Instead, we run inference on several training samples to monitor convergence.
  2. Picking the best model by the lowest loss is not reliable, and neither is computing quantitative metrics, so we simply save checkpoints at different training steps.
  3. You could repeat image_data as well. Repetition is just a simple way to adjust the ratio between datasets; feel free to use your own implementation.
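
Point 3 can be illustrated with a small sketch. Listing a dataset k times in a ConcatDataset multiplies its share of a uniformly shuffled epoch by k. The sizes and repeat counts below are made up for illustration and are not the values used in run_train_anydoor.py; video_data is a hypothetical name, and mixing_ratios and build_concat are hypothetical helpers.

```python
def mixing_ratios(sizes, repeats):
    """Fraction of one epoch drawn from each dataset after repetition."""
    effective = {name: sizes[name] * repeats.get(name, 1) for name in sizes}
    total = sum(effective.values())
    return {name: count / total for name, count in effective.items()}


def build_concat(datasets, repeats):
    """Build a ConcatDataset that lists each dataset repeats[name] times."""
    from torch.utils.data import ConcatDataset  # imported lazily; requires PyTorch

    parts = []
    for name, dataset in datasets.items():
        parts.extend([dataset] * repeats.get(name, 1))
    return ConcatDataset(parts)


# Hypothetical sizes: a small video dataset and a larger image dataset.
sizes = {"video_data": 1_000, "image_data": 8_000}
repeats = {"video_data": 2}  # image_data is not repeated (defaults to 1)

ratios = mixing_ratios(sizes, repeats)
print(ratios)  # video_data rises from 1/9 of the samples to 2/10
```

A weighted sampler would give the same control without duplicating entries; repetition is simply the shortest thing to write.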

@ai1361720220000
In run_train_anydoor.py, it seems the training process does not save any model. I'm not sure whether I should add pl.callbacks.ModelCheckpoint to the callbacks, since I'm also not familiar with PyTorch Lightning.
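
One possible way to do that is a ModelCheckpoint callback that saves every N training steps and keeps every file, which matches the "save models at different steps" approach described above. This is a sketch under assumptions, not the maintainers' configuration: the directory name, the 1000-step interval, and both helper functions are hypothetical.

```python
def periodic_checkpoint_kwargs(every_n_steps=1000, dirpath="checkpoints"):
    """Keyword arguments for a ModelCheckpoint that saves on a step interval."""
    return dict(
        dirpath=dirpath,                    # hypothetical output directory
        filename="{epoch}-{step}",          # e.g. epoch=0-step=999.ckpt
        every_n_train_steps=every_n_steps,  # save every N training steps
        save_top_k=-1,                      # keep all checkpoints, not just the best
    )


def make_checkpoint_callback(every_n_steps=1000):
    """Create the callback; pass it in the Trainer's callbacks list."""
    import pytorch_lightning as pl  # imported lazily; requires pytorch_lightning

    return pl.callbacks.ModelCheckpoint(**periodic_checkpoint_kwargs(every_n_steps))
```

The callback would then be added to the list passed as callbacks= when constructing pl.Trainer, next to any callbacks the script already uses.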