
How to improve the lip-sync quality in Chinese datasets? #868

Open
maggiez0138 opened this issue Apr 16, 2024 · 0 comments

Thank you for this great work — it performs really well on English.
However, I found its performance relatively poor on some Mandarin datasets, especially with TTS audio.
I'm therefore exploring ways to improve quality on Chinese datasets. I tried leveraging the 3DMM coefficients generated by GeneFacePlusPlus and tested them on various Chinese datasets. The results are better on some audio clips and worse on others, while on English the performance is significantly worse than before.
I suspect this may be because I didn't appropriately train the Face-vid2vid component (including MappingNet).

My questions are:

  1. Do you think it is necessary to re-train the Face-vid2vid part when the 3DMM coefficients are changed?
  2. Do you have a plan to open-source the training code for Face-vid2vid (including MappingNet)?
  3. Is there a plan to improve support for Mandarin?

Again, thanks for your excellent work. It is really impressive!
