feat: python requirements #1128

Open
akihikokuroda wants to merge 4 commits into generative-computing:main from akihikokuroda:issue1119

Conversation

@akihikokuroda
Member

@akihikokuroda akihikokuroda commented May 21, 2026

Pull Request

Issue

Fixes #1119, #1121

Description

Add Python code generation requirements

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Attribution

  • AI coding assistants used: claudecode

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

  • Component
  • Requirement
  • Sampling Strategy
  • Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

@akihikokuroda akihikokuroda requested a review from a team as a code owner May 21, 2026 19:52
@github-actions
Contributor

github-actions Bot commented May 21, 2026

This comment is managed by a bot. Editing it is fine — checking off boxes, adding notes — but please leave the HTML comment marker on the first line alone, otherwise checklist updates will break.

Requirement PR Checklist

Use this checklist when adding or modifying requirements in mellea/stdlib/requirements/.

Base Class

  • Extends appropriate base class:
    • Requirement - standard requirement
    • ALoraRequirement - uses specialized Intrinsic/Adapter for generation-based validation

Validation Logic

  • validation_fn defined (if using Python-based validation)
    • re-usable functionality within the validation_fn should be separated out into mellea/stdlib/tools/
  • validate returns a ValidationResult with
    • a thunk and context if using a backend to generate
    • a specific reason and score when possible

Integration

  • Requirement exported in mellea/stdlib/requirements/__init__.py or, if you are adding a library of requirements, from your sub-module

@github-actions github-actions Bot added the enhancement New feature or request label May 21, 2026
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
Contributor

@planetf1 planetf1 left a comment

Two of the four exported requirements don't behave as documented. OutputSizeLimit always passes regardless of output volume, and ImportRestrictions([]) allows all imports instead of blocking them all. Both bugs are masked by tautological test assertions. Requesting changes on these before merge.
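The empty-allowlist bug described above is typically a truthiness check. A minimal sketch, assuming that `None` means "no restriction" and `[]` means "block everything" (an assumption about the intended semantics, not taken from the PR); `find_disallowed_imports` is a hypothetical helper, not the function in python_tools.py:

```python
from __future__ import annotations

import ast
from typing import Optional


def find_disallowed_imports(code: str, allowed: Optional[list[str]]) -> list[str]:
    """Return top-level modules imported by `code` that are not in `allowed`.

    Assumed semantics: allowed=None disables the check, allowed=[] blocks all.
    """
    # `if not allowed:` would treat [] like None and let every import through.
    if allowed is None:
        return []
    allowed_set = set(allowed)
    found: list[str] = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None; they are reported as "".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in allowed_set:
                found.append(top_level)
    return found
```

A test that pins the outcome, rather than one that holds for either result, would catch the regression: `assert find_disallowed_imports("import os", []) == ["os"]`.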

Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread test/stdlib/requirements/test_python_tools.py Outdated
Comment thread test/stdlib/requirements/test_python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
@akihikokuroda akihikokuroda requested a review from planetf1 May 26, 2026 14:22
Contributor

@planetf1 planetf1 left a comment

Thanks for the fixes — I've gone through the latest version against the original push. A few observations from the refactor:

Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py Outdated
Comment thread mellea/stdlib/requirements/python_tools.py
Comment thread test/stdlib/requirements/test_python_tools.py
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
@akihikokuroda akihikokuroda requested a review from planetf1 May 27, 2026 14:53
Contributor

@planetf1 planetf1 left a comment

All the fixes from both rounds look good — the two original correctness bugs are properly fixed and tested. A few minor nits inline, none of which should hold this up.

)

code = extraction_result.reason
assert code is not None
Contributor

Nit: assert gets stripped when Python runs with -O, so if _has_python_code_listing ever returned a None reason on success, the check would be skipped and the None would surface later as an unrelated error rather than a useful message. Same pattern at lines 144 and 223. In practice the contract is stable and -O is rarely used, so this is low risk, but an explicit if code is None: return ValidationResult(result=False, reason=...) would be more robust.
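A minimal sketch of the explicit check suggested here. The `ValidationResult` below is a hypothetical stand-in dataclass (not the mellea class), and `check_extracted_code` and its messages are illustrative names only:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationResult:  # hypothetical stand-in, not the mellea class
    result: bool
    reason: Optional[str] = None


def check_extracted_code(extraction_result: ValidationResult) -> ValidationResult:
    """Fail with a clear reason instead of relying on `assert code is not None`."""
    code = extraction_result.reason
    # Unlike an assert, this branch still runs under `python -O`.
    if code is None:
        return ValidationResult(
            result=False,
            reason="Code extraction reported success but returned no code.",
        )
    return ValidationResult(
        result=True, reason=f"Extracted {len(code)} characters of code."
    )
```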

Member Author

Fixed them, along with another one in python_req.py.

def __init__(
    self,
    limit_chars: int = 10_000,
    timeout: int = 5,
Contributor

Nit: limit_chars is validated to be positive but timeout isn't. A zero or negative value would cause subprocess.TimeoutExpired immediately, which the except below catches and turns into a failure result — so nothing actually breaks. Just an inconsistency worth tidying up.
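One way to make the two parameters consistent is to validate both in the constructor. This is a sketch only: the class name `OutputSizeLimitSketch` is hypothetical, and raising `ValueError` is an assumption about how the existing limit_chars check reports bad input.

```python
class OutputSizeLimitSketch:
    """Hypothetical stand-in showing symmetric validation of both arguments."""

    def __init__(self, limit_chars: int = 10_000, timeout: int = 5) -> None:
        if limit_chars <= 0:
            raise ValueError(f"limit_chars must be positive, got {limit_chars}")
        # Reject bad timeouts up front instead of letting every run time out.
        if timeout <= 0:
            raise ValueError(f"timeout must be positive, got {timeout}")
        self.limit_chars = limit_chars
        self.timeout = timeout
```

Failing at construction time reports the misconfiguration once, at the call site, rather than as a timeout failure on every validation.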

Member Author

A value check for timeout is added.

result=False,
reason=f"Output size ({output_size} chars) exceeds limit ({self.limit_chars}).",
)
except Exception as e:
Contributor

Nit: catching Exception broadly means any unexpected error (misconfigured environment, attribute error, etc.) silently becomes "Error checking output size". The code fails closed, so there's no safety concern, but it makes debugging harder. A logger.exception(...) call here before the return would help surface those. As a bonus, the logger on line 26 is currently unused, so this would be its first call.
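A sketch of the fail-closed pattern with logging added. The function `check_output_size`, its `measure` callback, and the tuple return value are hypothetical simplifications, not the code under review:

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def check_output_size(measure: Callable[[], int], limit_chars: int) -> tuple[bool, str]:
    """Return (passed, reason); any unexpected error fails closed but is logged."""
    try:
        output_size = measure()
        if output_size > limit_chars:
            return (
                False,
                f"Output size ({output_size} chars) exceeds limit ({limit_chars}).",
            )
        return (True, f"Output size ({output_size} chars) is within the limit.")
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Unexpected error while checking output size")
        return (False, "Error checking output size.")
```

The returned reason stays generic, while the traceback in the log shows what actually failed.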

Member Author

A logger statement is added.

reqs = python_tool_requirements(timeout_seconds=15)
execution_req = reqs[2]
assert isinstance(execution_req, PythonExecutionReq)
assert execution_req._timeout == 15
Contributor

Nit: this and the two tests below check ._timeout, ._use_sandbox, and ._allowed_imports directly on PythonExecutionReq. These work fine today but they're implementation details — if that class changes how it stores those values internally the tests break without any API change. Testing observable behaviour (e.g. a short timeout actually causing a timeout result) would be more resilient, though that's a higher bar for a propagation check.
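A behaviour-based check might look like the sketch below. `run_with_timeout` is a hypothetical helper built on `subprocess.run`, standing in for whatever PythonExecutionReq does internally; the test asserts only on the observable result, not on private attributes.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout: float) -> tuple[bool, str]:
    """Run `code` in a child interpreter and report whether it finished in time."""
    try:
        subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return (False, f"Execution timed out after {timeout} seconds.")
    return (True, "Execution finished within the timeout.")


def test_short_timeout_is_observable() -> None:
    # The child sleeps far longer than the timeout, so the timeout must fire.
    passed, reason = run_with_timeout("import time; time.sleep(5)", timeout=0.5)
    assert not passed
    assert "timed out" in reason
```

The trade-off the comment mentions still applies: this test spawns a process and waits for the timeout, so it is slower than an attribute check.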

Member Author

These 3 tests are replaced by tests that don't use the internal variables.

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Requirements Library: Implement generic Python tool requirements

2 participants