Hi,
Thank you for releasing the code, dataset, and pretrained checkpoint.
I am trying to reproduce the pretrained turbulent_radiative_layer_2D result in Table 7 with jitter_patches=True, but I am getting noticeably higher values than the paper reports. For these tests I am using the full pretrained checkpoint provided in the release.
I am evaluating on rollout_valid:
- without start_rollout_valid_output_at_t=17: 1.9817
- with start_rollout_valid_output_at_t=17: 1.4758
Both values are still much higher than the 0.75 reported in Table 7. Could you clarify the exact evaluation setup used to produce that number?
In particular:
1. Was the Table 7 value computed on rollout_valid with start_rollout_valid_output_at_t=17?
2. Was it taken directly from the logged aggregate metric, or from post-processing / slicing the saved time logs?
3. Are there any additional settings or dataset filtering details needed to reproduce the reported value with the released pretrained checkpoint?
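For concreteness, this is roughly the post-processing I tried on the saved time logs. It assumes the logs are a 1-D array with one metric value per rollout step, which may not match how they are actually stored:

```python
import numpy as np

def sliced_metric(time_logs, start_t=17):
    """Average the per-timestep metric from start_t onward.

    Assumes time_logs is a 1-D sequence with one metric value per
    rollout step (an assumption about the saved log layout).
    """
    return float(np.asarray(time_logs)[start_t:].mean())

# Toy example with made-up values, just to illustrate the slicing:
logs = [3.0] * 17 + [1.0, 2.0, 3.0]
print(sliced_metric(logs, start_t=17))  # mean over steps 17..19 -> 2.0
```

If the Table 7 number was instead obtained by averaging over trajectories first and then over time (or with a different start index), that could easily explain part of the gap.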
Thanks for any clarification.