Benchmark revamp + run benchmark as part of CI #3176

stefannibrasil · 2026-01-12T23:04:14Z

Motivation / Background

We want to run some experiments to improve the library's performance. Before making any changes, we need baseline stats so we can verify that new code does not degrade performance. We also want the benchmark scripts to eventually run as part of our CI.

To keep everything in one place, I moved the previous benchmark tasks into the benchmark folder. The goal is to also use that folder to document experiment results.

Closes #3159, #3160

Inspired by ruby/json#606.

Results from running the scripts (January 12th, 2026). Machine specs: Apple M1 Pro, 16GB memory, macOS Sequoia 15.7.3.

Require

faker % RUBYOPT="-W0" ruby benchmark/require.rb
took 119.12799999117851ms to load

Load locales - YML vs JSON

faker % RUBYOPT="-W0" ruby benchmark/load_yml_vs_json.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [arm64-darwin24]
Warming up --------------------------------------
                 YML    37.000 i/100ms
                JSON   953.000 i/100ms
Calculating -------------------------------------
                 YML    374.033 (± 0.5%) i/s    (2.67 ms/i) -      1.887k in   5.045222s
                JSON      9.691k (± 1.1%) i/s  (103.19 μs/i) -     48.603k in   5.015937s

Comparison:
                 YML:      374.0 i/s
                JSON:     9691.0 i/s - 25.91x  faster

Generators

faker % RUBYOPT="-W0" ruby benchmark/generators.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [arm64-darwin24]
Warming up --------------------------------------
Number of generators: 659
                         1.000 i/100ms
Calculating -------------------------------------
Number of generators: 659
                         31.451 (± 9.5%) i/s   (31.80 ms/i) -    156.000 in   5.010814s


Having these in a folder helps because we can document experiment results in it as well. And we can edit the require script to raise an error if it takes longer than a threshold to load faker.

Co-Authored-By: Thiago Araujo <thd.araujo@gmail.com>
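The threshold idea mentioned above could look roughly like the sketch below. It is not the actual contents of benchmark/require.rb: the helper names `measure_require_ms` and `check_load_time!` and the 500ms budget are assumptions made for illustration, since the PR does not state a threshold.

```ruby
require 'benchmark'

# Hypothetical budget; the PR does not specify a threshold value.
LOAD_THRESHOLD_MS = 500.0

# Time a single `require` call and return the elapsed milliseconds.
# Note that a feature that is already loaded returns almost immediately.
def measure_require_ms(feature)
  Benchmark.realtime { require feature } * 1000.0
end

# Return the measurement when it is within budget, raise otherwise.
def check_load_time!(elapsed_ms, threshold_ms = LOAD_THRESHOLD_MS)
  return elapsed_ms if elapsed_ms <= threshold_ms

  raise "took #{elapsed_ms.round(2)}ms to load, over the #{threshold_ms}ms budget"
end
```

In a require benchmark this would be called as `check_load_time!(measure_require_ms('faker'))`, so a CI job fails whenever loading exceeds the budget. A fixed budget is sensitive to runner hardware, so the value would likely need tuning per environment.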

thdaraujo · 2026-01-13T00:29:11Z

.github/workflows/bench.yml

+    branches: [ main ]
+  pull_request:
+    branches: [ main ]
+


maybe

Suggested change

+permissions:
+  contents: read

Fair enough. I also added an extra permission for pull requests, and for the tests workflow as well in 07769e9

thdaraujo · 2026-01-13T01:01:17Z

benchmark/generators.rb

+  x.report("Number of generators: #{all_generators.count}") do
+    all_generators.each { |generator| eval(generator) }
+  end


we should build the list of generators outside so that we're only benchmarking generator execution

Suggested change

-  x.report("Number of generators: #{all_generators.count}") do
-    all_generators.each { |generator| eval(generator) }
-  end
+  generators = all_generators
+  x.report("Number of generators: #{all_generators.count}") do
+    generators.each { |klass, generator| klass.send(generator) }
+  end

Good catch, thanks! Changed in 8714ae3
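To illustrate the point of this thread, here is a minimal, self-contained sketch of building the generator list once and then timing only the calls. It does not use Faker: the `Demo` module and this version of `all_generators` are hypothetical stand-ins, the stdlib `Benchmark` module is used in place of benchmark-ips, and `public_send` is used where the suggestion uses `send`.

```ruby
require 'benchmark'

# Hypothetical stand-in for a namespace of generator classes.
module Demo
  class Name
    def self.first_name
      'Ada'
    end

    def self.last_name
      'Lovelace'
    end
  end

  class Number
    def self.digit
      7
    end
  end
end

# Collect [class, method_name] pairs for the class-level methods
# defined directly on each class in the namespace.
def all_generators(namespace)
  namespace.constants.sort.flat_map do |const_name|
    klass = namespace.const_get(const_name)
    next [] unless klass.is_a?(Class)

    klass.singleton_methods(false).sort.map { |method_name| [klass, method_name] }
  end
end

# Build the list once, outside the timed block.
generators = all_generators(Demo)

# Only the generator calls are measured; no eval is involved.
elapsed = Benchmark.realtime do
  generators.each { |klass, generator| klass.public_send(generator) }
end

puts "Number of generators: #{generators.count} (#{(elapsed * 1000).round(3)}ms)"
```

Dispatching on a class and a method name keeps the calls limited to methods that actually exist on those classes, which is why the `eval` exclusion in .rubocop.yml is no longer needed for the benchmark script.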

thdaraujo · 2026-01-13T01:03:00Z

benchmark/load_yml_vs_json.rb

+  x.report('JSON') { JSON.load_file("#{File.dirname(__FILE__)}/../test/fixtures/locales/es-MX.json") }
+
+  x.compare!(order: :baseline)
+end


Do we still want to keep this? Benchmarking JSON vs YAML load times is not really relevant to Faker; this was just an experiment.

Good point. I was only copying things over from the benchmark rake task. Removed it in 7745001
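For the record of what the removed experiment measured, a rough sketch of a YML vs JSON load comparison follows. It is an approximation under stated assumptions, not the removed script: it uses the stdlib `Benchmark` module rather than benchmark-ips, and the `SAMPLE` data and `load_rates` helper are hypothetical stand-ins for the es-MX locale fixtures.

```ruby
require 'benchmark'
require 'json'
require 'yaml'
require 'tmpdir'

# Hypothetical sample data standing in for a locale file.
SAMPLE = { 'en' => { 'faker' => { 'name' => { 'first_name' => %w[Ada Grace Linus] } } } }.freeze

# Write the same data as YAML and JSON, load each file `iterations`
# times, and return the observed iterations per second for both.
def load_rates(iterations)
  Dir.mktmpdir do |dir|
    yml_path = File.join(dir, 'sample.yml')
    json_path = File.join(dir, 'sample.json')
    File.write(yml_path, SAMPLE.to_yaml)
    File.write(json_path, JSON.generate(SAMPLE))

    yml_time = Benchmark.realtime { iterations.times { YAML.load_file(yml_path) } }
    json_time = Benchmark.realtime { iterations.times { JSON.load_file(json_path) } }

    { yml: iterations / yml_time, json: iterations / json_time }
  end
end

rates = load_rates(200)
puts format('YML:  %.1f i/s', rates[:yml])
puts format('JSON: %.1f i/s', rates[:json])
```

Numbers from a tiny sample like this are not comparable to the results in the PR description, which were taken on a full locale file with warmup.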

thdaraujo · 2026-01-13T01:03:21Z

.rubocop.yml

+  Description: The use of eval represents a serious security risk.
+  Exclude:
+    - 'lib/faker/default/json.rb'
+    - 'benchmark/generators.rb'


maybe we don't need eval on generators, see previous comment

Done in 8714ae3

thdaraujo

Nice work on this! do you mind adding to the description the machine you're using so we can compare apples to apples?

e.g. Apple M1 16GB memory on MacOS X.Y.Z

… execution

Looping over the constants instead of using `eval` is a more secure approach. Plus, build the list of generators outside so that we're only benchmarking generator execution.

Co-Authored-By: Thiago Araujo <thd.araujo@gmail.com>

stefannibrasil · 2026-01-13T19:02:50Z

Nice work on this! do you mind adding to the description the machine you're using so we can compare apples to apples?

e.g. Apple M1 16GB memory on MacOS X.Y.Z

Yes! Added to the PR description.

thdaraujo

very nice, thank you!

stefannibrasil force-pushed the sb-3159-benchmark-revamp branch 9 times, most recently from 8a7e4f2 to a937aac Compare January 12, 2026 23:32

stefannibrasil commented Jan 12, 2026

stefannibrasil force-pushed the sb-3159-benchmark-revamp branch 2 times, most recently from 0d24c5e to ca35c75 Compare January 12, 2026 23:42

stefannibrasil and others added 2 commits January 12, 2026 16:49

Organize benchmark gems in a group

f09c7b1

Inspired by ruby/json#606

Reorganize benchmark scripts

a109a55

Having these in a folder helps because we can document experiment results in it as well. And we can edit the require script to raise an error if it takes longer than a threshold to load faker.

Co-Authored-By: Thiago Araujo <thd.araujo@gmail.com>

stefannibrasil force-pushed the sb-3159-benchmark-revamp branch from 5fee76f to a109a55 Compare January 12, 2026 23:55

stefannibrasil requested a review from thdaraujo January 12, 2026 23:57

thdaraujo reviewed Jan 13, 2026

thdaraujo reviewed Jan 13, 2026

stefannibrasil and others added 3 commits January 13, 2026 11:35

Strict permissions for gh workflows

07769e9

This was a one-time experiment

7745001

stefannibrasil requested a review from thdaraujo January 13, 2026 19:02

Don't need to keep this around anymore after comparing YML vs JSON

766c36c

thdaraujo approved these changes Jan 14, 2026

stefannibrasil merged commit c48530e into main Jan 14, 2026
9 checks passed

stefannibrasil deleted the sb-3159-benchmark-revamp branch January 14, 2026 02:34

stefannibrasil mentioned this pull request Jan 14, 2026

Benchmark load time #3160

Closed
