-
Notifications
You must be signed in to change notification settings - Fork 31.5k
Deprecate dtype per sub config #42990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
run-slow: llava, llava_next, llava_next_video, llava_onevision |
|
This comment contains models: ["models/llava", "models/llava_next", "models/llava_next_video", "models/llava_onevision"] |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
CI Results✅ No failing test specific to this PR 🎉 ! |
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
| logger.warning_once( | ||
| f"The dtype for {sub_config_key}={sub_dtype} is different from the main dtype={main_dtype}. " | ||
| "Setting different dtypes per backbone model might cause device errors downstream, setting the " | ||
| f"dtype={main_dtype} for all modules. Note, using different dtypes per module is deprecated and will " | ||
| "be removed in future versions." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that in the future it will raise an exception, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, after a few released we can switch to an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that if a user sets dtype="auto", all sub modules will be forcibly cast to the specified main_dtype?
Cyrilvallez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do that, then we need to plainly remove dtype being passed as a dict as well!
| if main_dtype != sub_dtype: | ||
| logger.warning_once( | ||
| f"The dtype for {sub_config_key}={sub_dtype} is different from the main dtype={main_dtype}. " | ||
| "Setting different dtypes per backbone model might cause device errors downstream, setting the " | ||
| f"dtype={main_dtype} for all modules. Note, using different dtypes per module is deprecated and will " | ||
| "be removed in future versions." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will launch the warning for all models where the subconfigs saved a different dtype. Is it something frequent that would launch it for a lot of models, or is it more niche?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cyrilvallez
I think this might be more frequent than expected, especially for multimodal models(BLIP, ColPali,...). I’ve already encountered this issue with several models, such as:
What does this PR do?
Fixes #42968, #42641
As discussed, this PR deprecated different dtypes per backbone and falls back to the main dtype if a user is requesting different dtypes.
There is no clean way to support different dtypes when the backbone can have a task head. We either cast hidden states to task dtype in all model files, or we need a workaround to infer task head and force dtype on it when loading. So the better solution is to deprecate dict dtypes, apparently it is not used much