Add ltp-analyze skill and update convert skill#8
Conversation
fc0e4c2 to
a89d429
Compare
Add the ltp-analyze skill for evaluating test quality, robustness, and coverage, and the ltp-import skill for re-authoring an external test (kselftest, glibc, bare C reproducer, syzkaller/CVE PoC, shell repro) into a native LTP test. Both new skills, together with ltp-convert, repeat the same classification and finalization logic. Factor that duplication into two shared fragments so the rules live in one place and cannot drift: - agents/classify.md: test type, API, test-vs-helper, runtest lookup - agents/finalize.md: lint, build, runtime, commit, fresh-context review, apply fixes - agents/test-intent.md: set of rules to define the test operations Slim ltp-convert to reference both fragments and update AGENTS.md with the analyze and import task triggers and the adjusted precedence order. Signed-off-by: Andrea Cervesato <andrea.cervesato@suse.com>
a89d429 to
e36527f
Compare
| Check whether the file's basename (without `.c`) appears in any `runtest/` | ||
| file: | ||
|
|
||
| grep -RFw <basename> runtest/ |
There was a problem hiding this comment.
I would like to avoid references to runtest files. I want to start generating runtest files for a few categories during build soon. I plan to start with generating runtest/cve in this development cycle.
There was a problem hiding this comment.
maybe I dont understand the comment. At the moment runtest/ folder is used to analyze the type of file we are going to refactor (check a few lines below)
| 6. **What is the pass/fail oracle?** Expected return value, expected errno, | ||
| expected side-effect. For a security/regression reproducer, the | ||
| crash/corruption/leak the source demonstrates — not merely running to | ||
| completion. |
There was a problem hiding this comment.
Maybe mention here that some old tests may not make sense and that it's okay to return this information?
There was a problem hiding this comment.
Yes, the idea is that during this process the skill will tell the user if test is not really important. This is happening during ltp-analysis execution, where LLM gives an evaluation. Then it's up to the user to continue or to cancel the test if it doesn't make sense.
| - **Type**: classify per `agents/classify.md` (C test new/old API, shell test, | ||
| Open POSIX test, helper binary, or design-only). | ||
| - **Runtest entry**: check whether the test basename appears in any | ||
| `runtest/` file (`grep -rw <basename> runtest/`). Flag if missing. |
There was a problem hiding this comment.
Again, I want to eventually generate the runtest files or even get rid of them completely. I guess it's fine as long as you adjust the configuration once we start to generate the runtest files on the fly
There was a problem hiding this comment.
This is just a flag, so the analysis will tell us if the analyzed test has runtest file or not. I can actually remove it because it doesn't make sense, but at the moment we are still using runtest files. What we should do ?
| 5. **Resource lifecycle**: static state, what `.setup` allocates, what | ||
| `.cleanup` releases on every exit path. | ||
| 6. **Synchronization**: if it forks, use checkpoints / `SAFE_WAITPID` / | ||
| process-state wait — NEVER sleep-based timing (Ground Rule 2). |
There was a problem hiding this comment.
This part is duplicated in the ltp-convert/SKILL.md do we want to refactor parts of the ltp-import/SKILL.md and ltp-convert/SKILL.md into a separate file?
There was a problem hiding this comment.
I totally agree...will try to find a way
| - Replace any `sleep()`-based synchronization with checkpoints or | ||
| process-state polling (Ground Rule 2). | ||
| - Use runtime feature detection, not compile-time assumptions (Ground Rule 3). | ||
| Return `TCONF` when the feature/syscall is unavailable. |
There was a problem hiding this comment.
Maybe add a description how we detect syscall presence.
- syscalls that could be disabled with a kernel CONFIG variable MUST have corresponding .needs_kconfig check
- syscalls that were added into kernel unconditionally and are older than oldest LTP supported kernel are assumed to exist
- syscalls that were added into kernel unconditionally and are younger than oldest LTP supported kernel need a check in the test setup, if kernel version the test runs on is newer than minimal version supporting the syscall, it's assumed the syscall is present, otherwise we do a dummy call to the syscall and check for EINVAL.
Or does that belong into c-test.md?
There was a problem hiding this comment.
this is one of the moments where things overlap I guess. I think the right place is c-tests indeed
Add new ltp-analyze skill for evaluating test quality, robustness, and coverage. Update AGENTS.md with the analyze task trigger and adjust task precedence order.
Refresh ltp-convert skill with improved conversion guidelines.