Motivation
Refactoring the evaluation workflow makes it correct at scale, easier to change, and faster to iterate on.
Modification
Bump version to v0.2.0
BC-breaking (Optional)
Does the modification introduce changes that break the back-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
Major breaks in evaluation usage:
vln_multi => vln_distributed
usa_agent_server
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Distributed evaluation is now supported for Slurm, torchrun, and Alicloud:
scripts/eval/bash/eval_vln_distributed.sh {habitat|internutopia|internutopia_vec_env} [--config xxx]
Checklist
Pre-commit or other linting tools are used to fix the potential lint issues.
The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
If the modification has potential influence on downstream projects, this PR should be tested with downstream projects.
The documentation has been modified accordingly, such as docstrings or example tutorials.
Changelog of v0.2.0 (2025/12/4)
Highlights
DistributedEvaluator and HabitatEnv integrated into the InternNav framework ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
[habitat], [isaac], [model] ([FIX] solve import issue, isolate dependency and requirements #135)
New Features
eval.py, with new Habitat evaluation configs in scripts/eval/configs ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
Improvements
HabitatEnv with episode pool management ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
InternUtopiaEnv for distributed execution and episode pool management ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
episode_loader in VLN-PE with new distributed mode compatibility ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
data_collector to support progress checkpointing and incremental result aggregation in distributed evaluation ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
Bug Fixes
revise_one_data() was incorrectly applied to all datasets ([Fix] Habitat Refactor & Support Distributed VLN-PE Evaluation #168)
Contributors
A total of 3 developers contributed to this release.
@kew6688, @Gariscat, @yuqiang-yang