feat: python requirements by akihikokuroda · Pull Request #1128 · generative-computing/mellea

akihikokuroda · 2026-05-21T19:52:45Z

Pull Request

Issue

Fixes #1119, #1121

Description

Add Python code generation requirements

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Attribution

AI coding assistants used: claudecode

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

Component
Requirement
Sampling Strategy
Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

github-actions · 2026-05-21T19:52:54Z

This comment is managed by a bot. Editing it is fine — checking off boxes, adding notes — but please leave the HTML comment marker on the first line alone, otherwise checklist updates will break.

Requirement PR Checklist

Use this checklist when adding or modifying requirements in mellea/stdlib/requirements/.

Base Class

Extends appropriate base class:
- Requirement - standard requirement
- ALoraRequirement - uses specialized Intrinsic/Adapter for generation-based validation

Validation Logic

validation_fn defined (if using Python-based validation)
- re-usable functionality within the validation_fn should be separated out into mellea/stdlib/tools/
validate returns a ValidationResult with
- a thunk and context if using a backend to generate
- a specific reason and score when possible

Integration

Requirement exported in mellea/stdlib/requirements/__init__.py or, if you are adding a library of requirements, from your sub-module

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

planetf1

Two of the four exported requirements don't behave as documented. OutputSizeLimit always passes regardless of output volume, and ImportRestrictions([]) allows all imports instead of blocking them all. Both bugs are masked by tautological test assertions. Requesting changes on these before merge.
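To illustrate the ImportRestrictions([]) point: one common way this kind of bug arises is a truthiness check that treats an empty allow-list the same as "no restriction". The sketch below is not the PR's code; disallowed_imports and its allowed parameter are hypothetical names, and it only shows how None and [] can be kept distinct, together with a test assertion that can actually fail.

```python
import ast
from typing import List, Optional


def disallowed_imports(code: str, allowed: Optional[List[str]]) -> List[str]:
    """Return top-level module names imported by `code` that are not allowed.

    Hypothetical helper: `allowed=None` means "no restriction", while
    `allowed=[]` means "block every import".
    """
    # Compare against None explicitly; `if not allowed` would also match [].
    if allowed is None:
        return []
    allowed_set = set(allowed)
    found = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports without a module name are skipped in this sketch.
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in allowed_set:
                found.append(top_level)
    return found


# A non-tautological check: the expected value is specific, so a regression
# that lets an empty allow-list pass everything would fail here.
assert disallowed_imports("import os", []) == ["os"]
```

An assertion such as `assert result is True or result is False` would pass for either behaviour, which is how a bug like this can stay hidden behind a green test run.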

planetf1

Thanks for the fixes — I've gone through the latest version against the original push. A few observations from the refactor:

planetf1

All the fixes from both rounds look good — the two original correctness bugs are properly fixed and tested. A few minor nits inline, none of which should hold this up.

planetf1 · 2026-05-28T12:33:18Z

+            )
+
+        code = extraction_result.reason
+        assert code is not None


Nit: assert gets stripped when Python runs with -O, so if _has_python_code_listing ever returned a None reason on success this would silently blow up rather than give a useful message. Same pattern at lines 144 and 223. In practice the contract is stable and -O is rarely used, so this is low risk — but an explicit if code is None: return ValidationResult(result=False, reason=...) would be more robust.

Fixed them, along with another one in python_req.py.
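A minimal sketch of the explicit check suggested above, for readers following the thread. The ValidationResult dataclass here is a simplified stand-in for mellea's class (only the result and reason fields mentioned in the review are modelled), and validate_extracted is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationResult:
    """Simplified stand-in for mellea's ValidationResult (assumption)."""

    result: bool
    reason: Optional[str] = None


def validate_extracted(extraction_result: ValidationResult) -> ValidationResult:
    """Hypothetical validation step that consumes an extraction result."""
    code = extraction_result.reason
    # Unlike `assert code is not None`, this check still runs under `python -O`.
    if code is None:
        return ValidationResult(
            result=False,
            reason="Code extraction reported success but returned no code.",
        )
    return ValidationResult(result=True, reason=f"Extracted {len(code)} chars.")
```

The behaviour is the same whether or not optimisations are enabled, and the failure carries a reason the caller can report.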

planetf1 · 2026-05-28T12:33:37Z

+    def __init__(
+        self,
+        limit_chars: int = 10_000,
+        timeout: int = 5,


Nit: limit_chars is validated to be positive but timeout isn't. A zero or negative value would cause subprocess.TimeoutExpired immediately, which the except below catches and turns into a failure result — so nothing actually breaks. Just an inconsistency worth tidying up.

Added the value check.
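For reference, a sketch of what validating both constructor arguments consistently could look like. SizeLimitConfig is a hypothetical class used only for illustration; the parameter names and defaults are taken from the diff excerpt above, and raising ValueError is an assumption about the chosen failure mode.

```python
class SizeLimitConfig:
    """Hypothetical holder for the two parameters shown in the diff."""

    def __init__(self, limit_chars: int = 10_000, timeout: int = 5) -> None:
        # Reject non-positive values up front rather than relying on a later
        # subprocess.TimeoutExpired being caught and turned into a failure.
        if limit_chars <= 0:
            raise ValueError(f"limit_chars must be positive, got {limit_chars}")
        if timeout <= 0:
            raise ValueError(f"timeout must be positive, got {timeout}")
        self.limit_chars = limit_chars
        self.timeout = timeout
```

Failing at construction time points at the misconfiguration directly, instead of surfacing it later as a generic timeout result.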

planetf1 · 2026-05-28T12:34:06Z

+                    result=False,
+                    reason=f"Output size ({output_size} chars) exceeds limit ({self.limit_chars}).",
+                )
+        except Exception as e:


Nit: catching Exception broadly means any unexpected error (misconfigured environment, attribute error, etc.) silently becomes "Error checking output size". The code fails closed so there's no safety concern, but it makes debugging harder. A logger.exception(...) call here before the return would help surface those. As a bonus, the logger on line 26 is currently unused, so this would be its first call.

Added the logger statement.
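A small sketch of the fail-closed pattern with logging, as discussed in this thread. check_output_size and its produce_output callable are hypothetical; the (bool, str) return value stands in for a ValidationResult. Only logging.getLogger and Logger.exception from the standard library are relied on.

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger(__name__)


def check_output_size(
    produce_output: Callable[[], str], limit_chars: int
) -> Tuple[bool, str]:
    """Hypothetical size check that fails closed but logs unexpected errors."""
    try:
        output_size = len(produce_output())
        if output_size > limit_chars:
            return (
                False,
                f"Output size ({output_size} chars) exceeds limit ({limit_chars}).",
            )
        return True, f"Output size ({output_size} chars) is within the limit."
    except Exception:
        # logger.exception logs at ERROR level and includes the traceback,
        # so the underlying cause is not lost behind the generic reason.
        logger.exception("Unexpected error while checking output size")
        return False, "Error checking output size"
```

The caller still sees a failed result, while the traceback ends up in the log for whoever needs to debug it.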

planetf1 · 2026-05-28T12:34:39Z

+        reqs = python_tool_requirements(timeout_seconds=15)
+        execution_req = reqs[2]
+        assert isinstance(execution_req, PythonExecutionReq)
+        assert execution_req._timeout == 15


Nit: this and the two tests below check ._timeout, ._use_sandbox, and ._allowed_imports directly on PythonExecutionReq. These work fine today but they're implementation details — if that class changes how it stores those values internally the tests break without any API change. Testing observable behaviour (e.g. a short timeout actually causing a timeout result) would be more resilient, though that's a higher bar for a propagation check.

These 3 tests are replaced by tests that don't use the internal variables.
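As an illustration of testing observable behaviour rather than private attributes, the sketch below checks that a short timeout actually produces a timeout result. run_with_timeout is a hypothetical helper built on subprocess.run and is not the PythonExecutionReq implementation; the real tests in the PR may look different.

```python
import subprocess
import sys
from typing import Tuple


def run_with_timeout(code: str, timeout_seconds: float) -> Tuple[bool, str]:
    """Hypothetical runner: execute `code` in a child interpreter with a timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, f"Execution timed out after {timeout_seconds} seconds."
    return completed.returncode == 0, completed.stdout


def test_short_timeout_is_enforced() -> None:
    # The assertion is on the result the caller sees, not on a stored field.
    ok, reason = run_with_timeout("import time; time.sleep(5)", timeout_seconds=0.5)
    assert not ok
    assert "timed out" in reason


def test_fast_code_passes() -> None:
    ok, output = run_with_timeout("print('hi')", timeout_seconds=10)
    assert ok
    assert output.strip() == "hi"
```

A test written this way keeps passing if the class renames or restructures its internal fields, at the cost of being slower than a plain attribute check.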

akihikokuroda requested a review from a team as a code owner May 21, 2026 19:52

akihikokuroda requested review from markstur and nrfulton May 21, 2026 19:52

github-actions Bot added the enhancement New feature or request label May 21, 2026

python requrements

204ef40

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda force-pushed the issue1119 branch from 822f197 to 204ef40 Compare May 21, 2026 22:49

planetf1 requested changes May 26, 2026

review comments

a804c88

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda requested a review from planetf1 May 26, 2026 14:22

planetf1 reviewed May 26, 2026

Comment thread mellea/stdlib/requirements/python_tools.py Outdated

Comment thread mellea/stdlib/requirements/python_tools.py Outdated

Comment thread mellea/stdlib/requirements/python_tools.py

Comment thread test/stdlib/requirements/test_python_tools.py

review commnets

d7724bc

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda requested a review from planetf1 May 27, 2026 14:53

planetf1 approved these changes May 28, 2026

planetf1 reviewed May 28, 2026

review comments

8c29c6b

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
