
Draft: Add automated test runner and visualization script for stress testing #1080

Open
ahzero7d1 wants to merge 11 commits into develop from stress-testing

Conversation

@ahzero7d1
Collaborator

  • Add extra arguments for running full test configurations
  • Separate test configuration and execution script
  • Add visualization script


@JulienVig JulienVig left a comment


Thanks Ahyoung, the scripts are very handy! I've left a few comments and questions.

  • Can you rename the scripts folder to something like python to clarify the difference with the other TypeScript scripts also in cli/src?
  • Can you also add a short README.md in the scripts (python) folder with instructions and an example of how to use the two scripts?

Comment on lines +6 to +7
import pandas as pd
import matplotlib.pyplot as plt

Could you add a requirements.txt with the libraries contributors should install?
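A minimal requirements.txt covering the two libraries imported above could look like the following (whether the scripts need anything beyond these two packages is an assumption):

```text
pandas
matplotlib
```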

def plot_mean_std(df, metric, output_path, title, ylabel):
    summary = df.groupby("step")[metric].agg(["mean", "std"]).reset_index()

    summary["std"] = summary["std"].fillna(0)

Why were there NaN stds?
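A likely cause (an assumption, not confirmed in the thread): pandas computes the sample standard deviation with `ddof=1`, so any `step` group containing a single row yields NaN. A minimal sketch with illustrative data:

```python
import pandas as pd

# step 1 appears only once, so its sample std (ddof=1) is NaN
df = pd.DataFrame({"step": [0, 0, 1], "accuracy": [0.5, 0.7, 0.9]})
summary = df.groupby("step")["accuracy"].agg(["mean", "std"]).reset_index()
print(summary["std"].isna().tolist())  # [False, True]

# the workaround applied in the script
summary["std"] = summary["std"].fillna(0)
```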


def main():
    # path of configuration file for experiments
    config_path = Path(sys.argv[1])

Could you set the default value to the path to basic_tests.json?
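One way to implement the suggestion (a sketch — the actual location of basic_tests.json and the script's argument handling are assumptions): fall back to the default config when no argument is given.

```python
import sys
from pathlib import Path

# Default file name taken from the review comment; its directory is an assumption
DEFAULT_CONFIG = Path("basic_tests.json")

def get_config_path(argv: list[str]) -> Path:
    """Use the first CLI argument as the config path, else the default."""
    return Path(argv[1]) if len(argv) > 1 else DEFAULT_CONFIG
```

In `main()` this would replace the bare `Path(sys.argv[1])` with `get_config_path(sys.argv)`.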

"experiments": [
  {
    "testID": "mnist_fed_mean_cnn3_p3_d600_e50_r2",
    "task": "mnist_federated",

I'm getting this error:

cli/dist/args.js:46
    throw Error(`${unsafeArgs.task} not implemented.`);
          ^
Error: mnist_federated not implemented

Did you create new default tasks (mnist_federated, titanic_decentralized, etc.)?

Comment on lines +41 to +50
const streamPath = path.join(dir, `client${userIndex}_local_log.ndjson`);

const finalLog: SummaryLogs[] = [];
// create a write stream that saves learning logs during the train
let ndjsonStream: ReturnType<typeof createWriteStream> | null = null;

if (args.save) {
  ndjsonStream = createWriteStream(streamPath, { flags: "w" });
}


Did you choose the "ndjson" name over "jsonl" for a particular reason? I had only ever seen jsonl until now, so unless you particularly prefer ndjson I would change it to jsonl.

Comment on lines +42 to +51
/**
 * Return validation metrics
 *
 * TODO: currently only works for TFJS, gpt models
 */
evaluate(
  _validationDataset?: Dataset<Batched<DataFormat.ModelEncoded[D]>>
): Promise<ValidationMetrics> {
  throw new Error("Evaluation not supported for this model");
}

Suggested change
- /**
-  * Return validation metrics
-  *
-  * TODO: currently only works for TFJS, gpt models
-  */
- evaluate(
-   _validationDataset?: Dataset<Batched<DataFormat.ModelEncoded[D]>>
- ): Promise<ValidationMetrics> {
-   throw new Error("Evaluation not supported for this model");
- }
+ /**
+  * Return validation metrics
+  */
+ abstract evaluate(
+   _validationDataset?: Dataset<Batched<DataFormat.ModelEncoded[D]>>
+ ): Promise<ValidationMetrics>;

I think we can make this method abstract rather than throwing
