
How to improve the lip-sync quality in Chinese datasets? #868

Open
maggiez0138 opened this issue Apr 16, 2024 · 0 comments

Thank you for this great work — it performs really well on English.
However, I found its performance relatively poor on some Mandarin datasets, especially with TTS audio.
I'm therefore exploring ways to improve quality on Chinese datasets. I tried leveraging the 3DMM coefficients generated by GeneFacePlusPlus and tested them on various Chinese datasets. The results are better on some audio clips and worse on others, while on English the performance is significantly worse than before.
I suspect this may be because I didn't appropriately train the Face-vid2vid component (including MappingNet).

My questions are:

  1. Do you think it is necessary to re-train the Face-vid2vid part when the 3DMM coefficients are changed?
  2. Do you have a plan to open-source the training code for Face-vid2vid (including MappingNet)?
  3. Is there a plan to improve support for Mandarin?

Again, thanks for your excellent work. It is really impressive!
