59 changes: 59 additions & 0 deletions .github/workflows/benchmark.yaml
@@ -0,0 +1,59 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

name: Benchmarks
on:
  push:
    branches: [main]
  pull_request:
    paths:
      - ".github/workflows/benchmark.yaml"
      - "ci/scripts/bench.sh"
      - "ci/scripts/bench_adapt.py"
      - "perf/**"
  workflow_dispatch:
permissions:
  contents: read
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
      - name: Set up Node.js
        uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
        with:
          node-version: '20'
          cache: npm
      - name: Set up Python
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.11'
      - name: Run Benchmarks
        if: github.event_name != 'push'
        run: bash ci/scripts/bench.sh $(pwd)
      - name: Upload results
        if: github.event_name == 'push' && github.repository == 'apache/arrow-js' && github.ref_name == 'main'
        env:
          CONBENCH_URL: https://conbench.arrow-dev.org
          CONBENCH_EMAIL: ${{ secrets.CONBENCH_EMAIL }}
          CONBENCH_PASSWORD: ${{ secrets.CONBENCH_PASS }}
          CONBENCH_REF: ${{ github.ref_name }}
          CONBENCH_MACHINE_INFO_NAME: amd64-ubuntu-24
        run: |
          python3 -m pip install benchadapt@git+https://github.com/conbench/conbench.git@main#subdirectory=benchadapt/python
Copilot AI Apr 14, 2026

The workflow installs benchadapt directly from the Conbench repo’s main branch. This is not reproducible and can cause CI breakages if upstream changes, and it also increases supply-chain risk compared to pinning. Consider pinning to a specific tag or commit SHA (and optionally using a constraints/requirements file) so benchmark submissions remain stable over time.

Suggested change
-          python3 -m pip install benchadapt@git+https://github.com/conbench/conbench.git@main#subdirectory=benchadapt/python
+          python3 -m pip install benchadapt@git+https://github.com/conbench/conbench.git@0123456789abcdef0123456789abcdef01234567#subdirectory=benchadapt/python
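
One way to keep such a pin from regressing is a small lint over the install line. The sketch below is hypothetical: the helper name `is_pinned_to_commit` and the regular expression are illustrative assumptions, not part of this PR or of Conbench. It only checks that a pip VCS requirement references a full 40-character lowercase commit SHA rather than a branch or tag name.

```python
import re

# Matches ".git@<40 hex chars>" followed by an optional "#fragment" at the end
# of a pip VCS requirement. Assumed shape: name@git+https://host/repo.git@ref#...
_COMMIT_REF = re.compile(r"\.git@([0-9a-f]{40})(?:#.*)?$")


def is_pinned_to_commit(requirement: str) -> bool:
    """Return True if the requirement's git ref is a full commit SHA."""
    return _COMMIT_REF.search(requirement.strip()) is not None
```

Applied to the two lines in the suggestion, the `@main` form would be rejected and the SHA form accepted.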

          python3 ci/scripts/bench_adapt.py
2 changes: 2 additions & 0 deletions .gitignore
@@ -94,3 +94,5 @@ dev/release/rat.xml

# Release
dev/release/.env
bench_stats.json
__pycache__/
42 changes: 42 additions & 0 deletions ci/scripts/bench.sh
@@ -0,0 +1,42 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# Runs JavaScript benchmarks. If `--json` is passed as the second argument,
# benchmark results are written to bench_stats.json in the calling directory.

set -ex

if [ -z "$1" ]; then
  echo "Error: Missing source directory argument"
  exit 1
fi

source_dir="$1"

pushd "${source_dir}"

npm ci

if [[ "$2" = "--json" ]]; then
  npm run perf -- --json 2>"${OLDPWD}/bench_stats.json"
else
  npm run perf
fi
Comment on lines +36 to +40
Copilot AI Apr 14, 2026

When --json is used, results are redirected to ${OLDPWD}/bench_stats.json (the directory where the script was invoked), not to ${source_dir}. Since other tooling (e.g. bench_adapt.py) expects the results file under the repo root, this can break if bench.sh is called from outside the repo root. Consider writing the results to a path based on source_dir (or accepting an explicit output file path) to make the output location deterministic.
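
A caller-side workaround, sketched here under stated assumptions, is to make sure the process is already in the repository root before the script runs, so that `${OLDPWD}` and the repo root coincide. The context manager name `working_directory` is hypothetical and not part of this PR; it uses only the standard library.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Union


@contextmanager
def working_directory(path: Union[str, Path]) -> Iterator[Path]:
    """Temporarily switch the process working directory to `path`."""
    previous = Path.cwd()
    target = Path(path).resolve()
    os.chdir(target)
    try:
        yield target
    finally:
        # Always restore the caller's directory, even if the body raises.
        os.chdir(previous)
```

Wrapping the adapter invocation in `with working_directory(ARROW_ROOT):` would make the output location independent of where `bench_adapt.py` is launched from. Changing `bench.sh` itself, as suggested above, remains the more direct fix.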

popd
120 changes: 120 additions & 0 deletions ci/scripts/bench_adapt.py
@@ -0,0 +1,120 @@
#!/usr/bin/env python3
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import json
import os
import uuid
import logging
from pathlib import Path
from typing import List

from benchadapt import BenchmarkResult
from benchadapt.adapters import BenchmarkAdapter
from benchadapt.log import log

log.setLevel(logging.DEBUG)

ARROW_ROOT = Path(__file__).parent.parent.parent.resolve()
SCRIPTS_PATH = ARROW_ROOT / "ci" / "scripts"

# `github_commit_info` is meant to communicate GitHub-flavored commit
# information to Conbench. See
# https://github.com/conbench/conbench/blob/cf7931f/benchadapt/python/benchadapt/result.py#L66
# for a specification.
github_commit_info = {"repository": "https://github.com/apache/arrow-js"}

if os.environ.get("CONBENCH_REF") == "main":
    # Assume GitHub Actions CI. The environment variable lookups below are
    # expected to fail when not running in GitHub Actions.
    github_commit_info = {
        "repository": f'{os.environ["GITHUB_SERVER_URL"]}/{os.environ["GITHUB_REPOSITORY"]}',
        "commit": os.environ["GITHUB_SHA"],
        "pr_number": None,  # implying default branch
    }
Comment on lines +42 to +49
Copilot AI Apr 14, 2026

The script decides it’s running in GitHub Actions based on CONBENCH_REF == "main" and then unconditionally requires GITHUB_SERVER_URL, GITHUB_REPOSITORY, and GITHUB_SHA. This can crash in local runs if someone sets CONBENCH_REF=main for labeling/submission purposes outside of GitHub Actions. Consider detecting CI via GITHUB_ACTIONS (or checking for required GITHUB_* vars) instead of using CONBENCH_REF as the switch.
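
A minimal sketch of that detection, assuming the existing `CONBENCH_REF == "main"` check is kept as the default-branch signal and `GITHUB_ACTIONS` (which GitHub Actions sets to `true`) is added as the CI signal. The function name `github_commit_info_from_env` and the choice to return `None` outside CI are illustrative assumptions, not code from this PR.

```python
import os
from typing import Mapping, Optional

REQUIRED_GITHUB_VARS = ("GITHUB_SERVER_URL", "GITHUB_REPOSITORY", "GITHUB_SHA")


def github_commit_info_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[dict]:
    """Return commit info for a default-branch CI run, or None otherwise."""
    env = os.environ if env is None else env
    # Only treat the run as CI when GitHub Actions itself says so.
    if env.get("GITHUB_ACTIONS") != "true":
        return None
    # Keep the original default-branch condition.
    if env.get("CONBENCH_REF") != "main":
        return None
    # Fall back gracefully instead of raising KeyError on a missing variable.
    if any(not env.get(name) for name in REQUIRED_GITHUB_VARS):
        return None
    return {
        "repository": f'{env["GITHUB_SERVER_URL"]}/{env["GITHUB_REPOSITORY"]}',
        "commit": env["GITHUB_SHA"],
        "pr_number": None,  # implying default branch
    }
```

A caller could then select the `"commit"` run reason when the result is not `None` and the `"localdev"` path otherwise, so that setting `CONBENCH_REF=main` locally no longer crashes.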

    run_reason = "commit"
else:
    # Local dev environment. Do not include commit information since this is
    # not a controlled CI environment.
    # Allow user to optionally inject a custom piece of information into the
    # run reason via environment.
    run_reason = "localdev"
    custom_reason_suffix = os.getenv("CONBENCH_CUSTOM_RUN_REASON")
    if custom_reason_suffix is not None:
        run_reason += f" {custom_reason_suffix.strip()}"


class JSAdapter(BenchmarkAdapter):
    # bench.sh writes bench_stats.json into the calling directory (repo root)
    result_file = str(ARROW_ROOT / "bench_stats.json")
    command = ["bash", str(SCRIPTS_PATH / "bench.sh"), str(ARROW_ROOT), "--json"]

    def __init__(self, *args, **kwargs) -> None:
Comment on lines +63 to +67
Copilot AI Apr 14, 2026

bench.sh writes bench_stats.json to the calling directory (${OLDPWD}), but JSAdapter.result_file always reads from ${ARROW_ROOT}/bench_stats.json. If bench_adapt.py is executed from a directory other than the repo root, the adapter will look in the wrong place and fail to parse results. Consider making bench.sh write to a path derived from source_dir (or pass an explicit output path) so production and local runs are consistent with result_file.

Suggested change
-    # bench.sh writes bench_stats.json into the calling directory (repo root)
-    result_file = str(ARROW_ROOT / "bench_stats.json")
-    command = ["bash", str(SCRIPTS_PATH / "bench.sh"), str(ARROW_ROOT), "--json"]
-    def __init__(self, *args, **kwargs) -> None:
+    # bench.sh writes bench_stats.json into the calling directory of this
+    # process, so resolve the result file from the current working directory
+    # at runtime instead of assuming the repo root is the caller's cwd.
+    command = ["bash", str(SCRIPTS_PATH / "bench.sh"), str(ARROW_ROOT), "--json"]
+    def __init__(self, *args, **kwargs) -> None:
+        self.result_file = str(Path.cwd() / "bench_stats.json")

        super().__init__(command=self.command, *args, **kwargs)

    def _transform_results(self) -> List[BenchmarkResult]:
        with open(self.result_file, "r") as f:
            raw_results = json.load(f)

        run_id = uuid.uuid4().hex

        # Group results by suite so each suite shares a batch_id
        suite_batch_ids: dict = {}

        parsed_results = []
        for result in raw_results:
            suite = result.get("suite", "unknown")
            if suite not in suite_batch_ids:
                suite_batch_ids[suite] = uuid.uuid4().hex
            batch_id = suite_batch_ids[suite]

            # benny reports:
            #   ops - operations per second
            #   details.median - median time per operation, in seconds
            #   samples - number of samples collected
            parsed = BenchmarkResult(
                run_id=run_id,
                batch_id=batch_id,
                stats={
                    "data": [result["ops"]],
                    "unit": "i/s",
                    "times": [result["details"]["median"]],
                    "time_unit": "s",
                    "iterations": result["samples"],
                },
                context={
                    "benchmark_language": "JavaScript",
                },
                tags={
                    "suite": suite,
                    "name": result["name"],
                },
                run_reason=run_reason,
                github=github_commit_info,
            )
            parsed.run_name = (
                f"{parsed.run_reason}: {github_commit_info.get('commit')}"
Comment on lines +110 to +111
Copilot AI Apr 14, 2026

run_name is always set to include github_commit_info.get('commit'). In the local-dev path github_commit_info does not include a commit, so the run name becomes localdev: None, which is likely not intended and makes Conbench runs harder to read/filter. Consider only appending the commit when it’s present (or use a local identifier like the generated run_id).

Suggested change
-            parsed.run_name = (
-                f"{parsed.run_reason}: {github_commit_info.get('commit')}"
+            commit = github_commit_info.get("commit")
+            parsed.run_name = (
+                f"{parsed.run_reason}: {commit}" if commit else parsed.run_reason
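
The other option mentioned above, falling back to a local identifier, could look roughly like the following. This is a sketch only: `build_run_name` and the eight-character `run_id` prefix are hypothetical choices, not something the PR or Conbench prescribes.

```python
from typing import Optional


def build_run_name(run_reason: str, commit: Optional[str], run_id: str) -> str:
    """Label a run with its commit when known, else with a short run_id prefix."""
    identifier = commit if commit else f"run {run_id[:8]}"
    return f"{run_reason}: {identifier}"
```

With this shape, local runs stay distinguishable from one another in Conbench instead of all sharing the same name.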

            )
            parsed_results.append(parsed)

        return parsed_results


if __name__ == "__main__":
    js_adapter = JSAdapter(result_fields_override={"info": {}})
    js_adapter()
44 changes: 44 additions & 0 deletions package-lock.json

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion package.json
@@ -10,7 +10,7 @@
     "build": "cross-env NODE_NO_WARNINGS=1 gulp build",
     "clean": "cross-env NODE_NO_WARNINGS=1 gulp clean",
     "debug": "cross-env NODE_NO_WARNINGS=1 gulp debug",
-    "perf": "node --no-warnings --loader ts-node/esm/transpile-only perf/index.ts",
+    "perf": "node --no-warnings --import tsx/esm perf/index.ts",
     "test:integration": "bin/integration.ts --mode validate",
     "release": "./npm-release.sh",
     "test:coverage": "gulp test -t src --coverage",
@@ -100,6 +100,7 @@
     "rollup": "4.59.0",
     "rxjs": "7.8.2",
     "ts-jest": "29.1.4",
+    "tsx": "^4.19.3",
Copilot AI Apr 14, 2026

tsx is the only devDependency using a caret range (^4.19.3) while the rest of devDependencies are pinned to exact versions. This can cause the installed tsx version to drift over time (even with a lockfile, it makes version intent less clear and increases risk of breakage when regenerating the lockfile). Consider pinning tsx to an exact version to match the repository’s devDependency versioning pattern.

Suggested change
-    "tsx": "^4.19.3",
+    "tsx": "4.19.3",
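
If the exact-version convention is meant to be enforced, a check along these lines could flag ranges automatically. This is a sketch under assumptions: the helper name `unpinned_dev_dependencies` and the simplified semver pattern are illustrative, and nothing like it exists in this PR.

```python
import json
import re

# Simplified exact-version pattern: MAJOR.MINOR.PATCH with optional
# pre-release and build metadata. Ranges such as "^1.2.3" or "~1.2.3" fail it.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def unpinned_dev_dependencies(package_json_text: str) -> dict:
    """Return the devDependencies whose version spec is not an exact version."""
    manifest = json.loads(package_json_text)
    deps = manifest.get("devDependencies", {})
    return {name: spec for name, spec in deps.items() if not EXACT_VERSION.match(spec)}
```

Run against the manifest in this diff, the only entry it would be expected to report is `tsx`.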

     "typedoc": "0.28.17",
     "typescript": "5.4.5",
     "typescript-eslint": "8.57.0",