feat: add pretty run report#416
Conversation
saulshanabrook
left a comment
There was a problem hiding this comment.
Thank you for this! Added a few comments. Could you also add this to the changelog file with a link to this PR?
Merging this PR will degrade performance by 67.23%
Performance Changes
Comparing Footnotes
|
There was a problem hiding this comment.
Thanks for the fixes, I left a few small comments. There are also some mypy and formatting issues I think.
There is a bigger question about performance, if the codspeed is correct it looks like this slows things down by a ton!
Taking almost 40% of the time in a bigger benchmark just to translate bindings.
It makes me wonder about a different approach, where we set each rewrite and rule with a manual name like 1, 2, 3, ... and then we don't have to do the name searching and mangling and can just parse the name as an int then look it up? And if it's a birewrite just take off the <= or >=?
It would make the egglog file a bit more verbose, but makes parsing the reports more straightforward and more performant which seems like a good tradeoff?
I was also going back and forth on whether the RunReport should store a RewriteOrRule or the decl? If we just store the RewriteOrRule it's easier to pretty print, can just use the builtin one, and it's easier for users to grab that off and compare it or use it... But most of the other exposed objects just store the decls, so I will leave it up to you!
EDIT: It looks like the docs failures also highlight some other exceptions from this. I imagine also if we name the rules here that might also help since it seems like it's hitting on looking up the string?
| self.egg_rule_to_command_decl[_normalize_rule_key(str(egg_cmd))] = cmd | ||
| if name: | ||
| self.egg_rule_to_command_decl[name] = cmd |
There was a problem hiding this comment.
If a name is provided, we don't need to save the normalized version right?
| def _format_rule_key(decls: Declarations, key: CommandDecl) -> str: | ||
| return pretty_decl(decls, key) |
There was a problem hiding this comment.
nit: remove this extra function
| self.egg_rule_to_command_decl[normalized + "=>"] = cmd | ||
| self.egg_rule_to_command_decl[normalized + "<="] = cmd |
There was a problem hiding this comment.
Could you leave a comment here that the birewrite is de-sugared into these two names in egglog which is why we have both of them?
| return cls( | ||
| changed=report.changed, | ||
| rule_reports={ | ||
| translate_key(k): [RuleReport._from_bindings(rr) for rr in v] for k, v in report.rule_reports.items() |
There was a problem hiding this comment.
A birewrite will actually show up twice right? So if we want to keep both of them as one Python rule, then I think then we would need to add the timings for both of them?
| egraph.register(Num(1) + Num(2)) | ||
| report = egraph.run(10) | ||
|
|
||
| output = str(report) |
There was a problem hiding this comment.
Could you add to the tests asserting the string of the rule is in rewrite output, not just that it doesn't container some strings?
| search_and_apply_time_per_ruleset: dict[str, timedelta] | ||
| merge_time_per_ruleset: dict[str, timedelta] | ||
| rebuild_time_per_ruleset: dict[str, timedelta] |
There was a problem hiding this comment.
We don't have to do this, but we could also change these strings into actual ruleset objects? I am not sure if that would be too hard or awkward, and isn't essential for this PR.
|
@saulshanabrook Thanks for the thorough review! I do agree that the performance looks concerning. The numeric name approach you mentioned would work for bindings with a "name" field - so not for, |
Ah yeah I kept forgetting about this! I just talked to some other folks on the egglog team and they said that sounds like a great feature to add, just something we hadn't gotten around to yet. It should also I think be relatively straightforward so a good first PR to egglog core if you don't mind doing that... Then once that is merged hopefully should just be able to update the pin here and can use that feature. I believe the version of egglog we depend on here is pretty recent, so hopefully won't be other changes we have to adapt to. |
Resolves #398.
Here is an example code:
Output before:
Output after: