Thanks for your excellent work!
I have a question regarding dataset selection. I noticed that datasets such as Motion-X, Motion-X++, and MotionMillion were excluded. As mentioned in the paper, these datasets are quite large in duration but tend to have lower motion quality. Does this imply that scaling up a dataset with lower-quality motions would not lead to promising results?
In other words, is the ViMoGen-228K dataset (around 350 hours) already sufficient for effective learning, so that further improvements can come only from higher-quality and more diverse data?
Thanks a lot for your clarification!