feat(multimodal): add max_image_token_count guard with OOM risk guidance #1308

Merged
hiworldwzj merged 3 commits into main from wzj_fix on May 14, 2026
Conversation

@hiworldwzj
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new configuration parameter, --max_image_token_count, which limits the number of tokens allowed for a single image to prevent potential Out-Of-Memory (OOM) issues. The changes span documentation, CLI argument definitions, and the server's resource allocation logic. Feedback from the review focuses on replacing assert statements with explicit if checks and ValueError exceptions for runtime validation, as assert can be disabled in optimized Python environments.
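To make the summary concrete, here is a minimal sketch of how a limit like `--max_image_token_count` could be registered and enforced with an explicit check rather than an `assert`. The default of 16384, the `build_parser` wiring, and the standalone helper `check_image_token_count` are assumptions made for illustration; they are not taken from LightLLM's code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real CLI defines many more options.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--max_image_token_count",
        type=int,
        default=16384,  # assumed default, chosen only for this sketch
        help="Maximum number of tokens a single image may occupy.",
    )
    return parser


def check_image_token_count(token_num: int, limit: int) -> None:
    # Explicit check: unlike assert, it stays active under `python -O`.
    if token_num > limit:
        raise ValueError(
            f"single image token count {token_num} exceeds "
            f"max_image_token_count {limit}"
        )


args = build_parser().parse_args(["--max_image_token_count", "4096"])
check_image_token_count(1024, args.max_image_token_count)  # within the limit

try:
    check_image_token_count(8192, args.max_image_token_count)
except ValueError as exc:
    rejected_message = str(exc)
```

Raising `ValueError` also lets the caller turn an oversized image into a normal request error instead of relying on an `AssertionError`.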

Comment on lines +184 to +190
def _assert_image_token_count(self, token_num: int):
    assert token_num <= self.args.max_image_token_count, (
        f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
        f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
        f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
    )
    return
high

Avoid using assert for runtime validation of user input or critical safety guards. Python's assert statements are removed when the interpreter is run with optimizations (-O), which would disable this check and potentially lead to OOM. Use an explicit if check and raise a ValueError instead. I've also renamed the method to better reflect its behavior.

Suggested change
- def _assert_image_token_count(self, token_num: int):
-     assert token_num <= self.args.max_image_token_count, (
-         f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
-         f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
-         f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
-     )
-     return
+ def _check_image_token_count(self, token_num: int):
+     if token_num > self.args.max_image_token_count:
+         raise ValueError(
+             f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
+             f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
+             f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
+         )
+     return
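The point about `-O` can be reproduced without restarting the interpreter. The sketch below compiles the same guard twice with the built-in `compile()`, once with `optimize=0` and once with `optimize=1` (the level that `python -O` uses). The `guard` function and the numbers are made up for the demonstration and are not part of the PR.

```python
SOURCE = """
def guard(token_num, limit):
    assert token_num <= limit, "limit exceeded"
    return "accepted"
"""


def load_guard(optimize: int):
    # optimize=0 keeps assert statements; optimize=1 mirrors `python -O`
    # and compiles them out entirely.
    namespace = {}
    code = compile(SOURCE, "<guard_demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["guard"]


normal_guard = load_guard(0)
optimized_guard = load_guard(1)

try:
    normal_guard(9000, 4096)
    normal_result = "accepted"
except AssertionError:
    normal_result = "rejected"

# With asserts compiled out, the oversized value passes silently.
optimized_result = optimized_guard(9000, 4096)
```

Here `normal_result` ends up as `"rejected"` while `optimized_result` is `"accepted"`, which is the failure mode the review comment warns about.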

data = img.read()
# must after init_imageitem_extral_params
token_num = self.tokenizer.get_image_token_length(img)
self._assert_image_token_count(token_num)
high

Update the call to the renamed method.

Suggested change
- self._assert_image_token_count(token_num)
+ self._check_image_token_count(token_num)

self.tokenizer.init_imageitem_extral_params(img, multimodal_params, samping_params)
image_tokens += self.tokenizer.get_image_token_length(img)
token_num = self.tokenizer.get_image_token_length(img)
self._assert_image_token_count(token_num)
high

Update the call to the renamed method.

Suggested change
- self._assert_image_token_count(token_num)
+ self._check_image_token_count(token_num)

Comment on lines +85 to +89
assert token_num <= self.args.max_image_token_count, (
    f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}."
    f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
    f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
)
high

Avoid using assert for runtime validation as it can be optimized away. Also, fixed a missing space in the error message between the sentences.

Suggested change
- assert token_num <= self.args.max_image_token_count, (
-     f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}."
-     f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
-     f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
- )
+ if token_num > self.args.max_image_token_count:
+     raise ValueError(
+         f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
+         f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
+         f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
+     )
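The missing-space fix matters because Python joins adjacent string literals with nothing in between. A small sketch with shortened messages and made-up values (`limit` and `token_num` are placeholders, not values from the PR):

```python
limit = 4096
token_num = 8192

# Original form: the first piece ends with "." and no trailing space,
# so the two sentences run together after concatenation.
without_space = (
    f"single image token count {token_num} exceeds max_image_token_count {limit}."
    f"You can increase this limit by setting --max_image_token_count."
)

# Fixed form: a trailing space inside the first literal separates them.
with_space = (
    f"single image token count {token_num} exceeds max_image_token_count {limit}. "
    f"You can increase this limit by setting --max_image_token_count."
)
```

The first message contains `4096.You`, the second `4096. You`.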

@hiworldwzj hiworldwzj merged commit f41b8c4 into main May 14, 2026
1 check passed
@hiworldwzj hiworldwzj deleted the wzj_fix branch May 14, 2026 07:15