Question about the agent roles and prompts used for MATH500 and AIME experiments

Hi, thank you for open-sourcing this great work!

I am trying to reproduce the MATH500 and AIME experiments in KVComm. I have read the paper and the repository. The appendix provides prompt designs for GSM8K, MMLU, and HumanEval, but I could not find a separate prompt design for the MATH500 and AIME experiments reported in the appendix/table.

I would like to ask a few questions:

1. For the MATH500 and AIME experiments, did you directly reuse the GSM8K-style `MathSolver` prompts?
2. When using 2, 3, or 4 collaborating agents on MATH500/AIME, what were the exact agent roles?
   - For example, was it one of the following, in each case followed by a final `FinalRefer` agent for answer aggregation?
     - 2 agents: Math Solver + Mathematical Analyst
     - 3 agents: Math Solver + Mathematical Analyst + Programming Expert
     - 4 agents: Math Solver + Mathematical Analyst + Programming Expert + Inspector
3. Was the `Programming Expert` agent enabled for MATH500/AIME, and was its Python code actually executed during inference?
4. Did you modify the final-answer format for AIME, e.g., requiring the final answer to be an integer from 0 to 999?
5. Could you share the exact prompt/config/script used to run the MATH500 and AIME experiments, if available?
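
To make question 2 concrete, the role layouts I have in mind can be written as the sketch below. The names `HYPOTHETICAL_ROLES` and `build_agents` are my own illustration, not identifiers from the KVComm repository, and the role assignments are guesses that I would like to have confirmed or corrected.

```python
# Hypothetical role layouts for question 2. These are guesses to be confirmed,
# not configurations taken from the KVComm repository.
HYPOTHETICAL_ROLES = {
    2: ["Math Solver", "Mathematical Analyst"],
    3: ["Math Solver", "Mathematical Analyst", "Programming Expert"],
    4: ["Math Solver", "Mathematical Analyst", "Programming Expert", "Inspector"],
}


def build_agents(num_agents: int, with_final_refer: bool = True) -> list[str]:
    """Return the assumed role names for a run with `num_agents` collaborators."""
    if num_agents not in HYPOTHETICAL_ROLES:
        raise ValueError(f"unsupported number of agents: {num_agents}")
    roles = list(HYPOTHETICAL_ROLES[num_agents])
    if with_final_refer:
        # Assumed: an aggregation agent is appended after the collaborators.
        roles.append("FinalRefer")
    return roles
```

If the actual setup differs (for example, no `Programming Expert` on AIME), knowing which entries to change would be enough for me.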

The reason I am asking is that I am trying to evaluate KV reuse on mathematical reasoning benchmarks such as MATH500 and AIME, and I want to make sure my agent configuration and answer extraction are consistent with the original experiments.
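
For reference, here is a minimal sketch of the kind of answer extraction I mean for AIME. It assumes the model is asked to put its final answer in `\boxed{...}` and that a valid AIME answer is an integer from 0 to 999. Both the format and the helper names (`extract_last_boxed`, `parse_aime_answer`) are my assumptions, not taken from the KVComm code.

```python
import re
from typing import Optional


def extract_last_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in `text`, or None.

    Braces are matched by depth so nested LaTeX such as \\frac{1}{2} survives.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    out = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out).strip()
        out.append(ch)
    return None  # unbalanced braces: treat as no answer


def parse_aime_answer(text: str) -> Optional[int]:
    """Return the boxed answer as an int in [0, 999], or None if it is invalid."""
    boxed = extract_last_boxed(text)
    # One to three ASCII digits covers exactly the integers 0..999.
    if boxed is None or not re.fullmatch(r"[0-9]{1,3}", boxed):
        return None
    return int(boxed)
```

With this sketch, `parse_aime_answer("... so the answer is \\boxed{204}.")` gives `204`, while a response boxing `1000` or a non-integer is counted as unanswered. If the original evaluation used a different extraction rule, my accuracy numbers would not be comparable.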

Thanks a lot!
