JSqlParser Parse Benchmark

A rigorous JMH benchmark suite that runs the exact same SQL queries as SQLGlot's benchmarks/parse.py, enabling a direct, apples-to-apples comparison of parse times across JSqlParser, SQLGlot, and polyglot-sql.

In addition to the synthetic queries defined by SQLGlot, this benchmark includes a large collection of real-world SQL extracted from JSqlParser's own test suite — covering DDL, DML, dialect-specific syntax, and edge cases that stress-test parser breadth as much as raw speed.

Queries

Synthetic queries (from SQLGlot)

Name	Description
`tpch`	Realistic TPC-H Q2 (CTEs, sub-queries, joins)
`short_query`	Tiny SELECT with CASE
`deep_arithmetic`	500-deep `1+0+1+2+…` and `2012…`
`large_in`	20 000-element IN lists (strings + numbers)
`values`	INSERT … VALUES with 2 000 × 20 cells
`many_joins`	199 JOINs
`many_unions`	500 UNION ALLs
`nested_subqueries`	20-deep nested `(SELECT * FROM …)`
`many_columns`	SELECT with 1 000 columns
`large_case`	CASE with 1 000 WHEN branches
`complex_where`	200 ANDed compound predicates
`many_ctes`	200 CTEs
`many_windows`	200 window functions
`nested_functions`	20-deep nested COALESCE
`large_strings`	500 × 100-char string literals
`many_numbers`	10 000 numeric literals in SELECT

Real-world queries (from JSqlParser's test suite)

The queries/ directory contains SQL files drawn from JSqlParser's extensive test suite — CREATE TABLE, ALTER, INSERT, SELECT, MERGE, DELETE, UPDATE, Hive, PostgreSQL, SQLite, and more. Each file bundles dozens to hundreds of statements that exercise real dialect features: MySQL key syntax, Oracle tablespaces, PostgreSQL arrays, Hive-specific DDL, complex nested brackets, window functions, and many others.

These files were pre-filtered against SQLGlot (see sqlglot_prefilter.py): statements that SQLGlot cannot parse are separated into *_bad.sql files and excluded from the benchmark. The remaining *_good.sql files contain only statements that both parsers accept, ensuring a fair comparison.

Custom queries

Drop .sql files into the queries/ directory. Both sides pick them up automatically — the filename (without .sql) becomes the benchmark name.

queries/
├── hive_retention.sql      → benchmark name: "hive_retention"
├── my_complex_report.sql   → benchmark name: "my_complex_report"
└── ...

Rules for .sql files:

One or more SQL statements per file (semicolons separate statements)
UTF-8 encoding
The filename stem must not collide with a built-in query name

# run just your custom query
./gradlew jmhRun -Pquery=hive_retention

# or in the Python comparison
python compare_parsers.py --queries hive_retention --skip-java

Running

Full JMH benchmark (rigorous)

./gradlew jmhRun

Results go to build/results/jmh/results.json.

Single query

./gradlew jmhRun -Pquery=tpch

Quick sanity check (no JMH, best-of-5)

./gradlew quickBench

Comparing parsers

The included compare_parsers.py benchmarks all available parsers and prints a multi-column comparison table. Each Python variant runs in its own subprocess to isolate mypyc .so loading from pure-Python measurement.

Python parser variants (auto-detected)

Variant	Install	What it is
`sqlglot(pure)`	`pip install sqlglot`	Interpreted Python — no C extensions
`sqlglot[c]`	`pip install "sqlglot[c]"`	mypyc-compiled C extensions (includes Rust tokenizer)
`polyglot-sql`	`pip install polyglot-sql`	Full Rust rewrite of SQLGlot (PyO3 bindings)

If you install sqlglot[c], the script automatically benchmarks both the C-extension and pure-Python variants. It temporarily hides the .so files to get a clean interpreted-Python measurement (the same technique SQLGlot's own make hidec / make showc uses).

Install whichever variants you want to compare. Unavailable parsers are silently skipped.

Usage

python -m venv .venv
source .venv/bin/activate
pip install "sqlglot" "sqlglot[c]" "polyglot-sql" --upgrade pip

# full rigorous run — both sides use matched settings
# Python: 2 forks x (3 warmup x 3s + 5 measure x 3s) = ~48s/query
# JMH:    2 forks x (3 warmup x 3s + 5 measure x 3s) = ~48s/query
python compare_parsers.py

# quick smoke test — ~2 min total
python compare_parsers.py --queries tpch,short_query,many_joins \
    --py-forks 1 --py-warmup-iters 1 --py-warmup-time 2 \
    --py-measure-iters 2 --py-measure-time 2 \
    --jmh-warmup 1 --jmh-iterations 2 --jmh-forks 1

# only Python parsers (no Java/Gradle needed)
python compare_parsers.py --skip-java

# only JSqlParser (skip Python)
python compare_parsers.py --skip-python

The Python parameters mirror JMH's flags:

Python flag	JMH equivalent	Default	Meaning
`--py-forks`	`-f`	2	Separate processes
`--py-warmup-iters`	`-wi`	3	Warmup iterations per fork
`--py-warmup-time`	`-w`	3s	Duration of each warmup iter
`--py-measure-iters`	`-i`	5	Measurement iterations per fork
`--py-measure-time`	`-r`	3s	Duration of each measure iter

Example output

  =======================================================================================================
    Parser Comparison  (all times in ms/op, lower is better)
  =======================================================================================================

                               Query  |  sqlglot(pure)  |    sqlglot[c]  |  polyglot-sql  |    JSqlParser
  ------------------------------------+-----------------+----------------+----------------+--------------
                                tpch  |        2.52 ms  |      710.5 us  |    308.5 us *  |      494.1 us
                         short_query  |       192.6 us  |       58.9 us  |     57.6 us *  |       62.9 us
                     deep_arithmetic  |        7.68 ms  |       2.40 ms  |      950.1 us  |    715.3 us *
                            large_in  |      403.88 ms  |     112.55 ms  |    64.47 ms *  |     153.77 ms
                              values  |      469.46 ms  |     145.54 ms  |    68.74 ms *  |     155.75 ms
                          many_joins  |       11.58 ms  |       3.27 ms  |     1.19 ms *  |       2.17 ms
                         many_unions  |       45.08 ms  |      10.82 ms  |     3.88 ms *  |       5.87 ms
                   nested_subqueries  |       981.1 us  |      256.9 us  |    105.3 us *  |      162.1 us
                        many_columns  |       12.02 ms  |       3.08 ms  |     2.20 ms *  |       3.95 ms
                          large_case  |       36.55 ms  |      10.86 ms  |     4.94 ms *  |       5.27 ms
                       complex_where  |       30.74 ms  |       7.79 ms  |     3.44 ms *  |       4.24 ms
                           many_ctes  |       18.22 ms  |       4.05 ms  |     1.21 ms *  |       1.79 ms
                        many_windows  |       21.66 ms  |       5.64 ms  |     2.26 ms *  |       3.50 ms
                    nested_functions  |       614.4 us  |      210.6 us  |    156.7 us *  |      317.9 us
                       large_strings  |        4.75 ms  |     1.65 ms *  |       1.73 ms  |       2.87 ms
                        many_numbers  |       98.69 ms  |      33.35 ms  |    16.74 ms *  |      21.77 ms
                      AlterTest_good  |       13.06 ms  |       5.28 ms  |           ---  |     1.12 ms *
             CreateSequenceTest_good  |        1.66 ms  |      473.5 us  |           ---  |    109.0 us *
                CreateTableTest_good  |       27.01 ms  |      11.91 ms  |           ---  |     2.47 ms *
                     DeleteTest_good  |        2.41 ms  |      757.5 us  |           ---  |    329.0 us *
                       HiveTest_good  |       537.2 us  |      210.8 us  |           ---  |    138.5 us *
                     InsertTest_good  |        7.58 ms  |       2.34 ms  |           ---  |     1.30 ms *
                      MergeTest_good  |        2.32 ms  |      865.0 us  |           ---  |    326.7 us *
  NestedBracketsPerformanceTest_good  |        9.96 ms  |       2.85 ms  |           ---  |     2.27 ms *
                   PostgresTest_good  |        1.48 ms  |      536.5 us  |           ---  |    249.9 us *
                     SQLiteTest_good  |       133.0 us  |       48.9 us  |     44.8 us *  |       46.1 us
                     SelectTest_good  |      206.88 ms  |      84.85 ms  |           ---  |     4.40 ms *
                     UpdateTest_good  |        5.57 ms  |       1.88 ms  |           ---  |    906.0 us *
             WindowFunctionTest_good  |       244.8 us  |      105.5 us  |     57.6 us *  |      100.2 us
       deep_function_parameters_good  |        7.44 ms  |       2.95 ms  |     1.02 ms *  |       2.11 ms
                 hive_retention_good  |        5.65 ms  |       2.57 ms  |    664.1 us *  |       1.48 ms
                      issue2401_good  |       27.52 ms  |      10.27 ms  |     3.02 ms *  |       9.66 ms
  ------------------------------------+-----------------+----------------+----------------+--------------

  * = fastest (within 2%)          --- = not available / parse error

Observations

polyglot-sql is the fastest on synthetic queries — but it fails to parse the majority of real-world SQL. A parser that only supports a narrow slice of SQL syntax can afford to be very fast on what it does handle.
JSqlParser dominates on real-world SQL. On every real-world test-suite file — SELECT, CREATE TABLE, ALTER, INSERT, UPDATE, MERGE, DELETE, Hive, PostgreSQL — JSqlParser is the fastest parser, often by a wide margin (e.g. SelectTest_good at 4.40 ms vs sqlglot[c]'s 84.85 ms — a 19× difference). These are the queries that matter in production.
Breadth and speed are not a trade-off. JSqlParser parses a substantially larger set of SQL dialects and syntax variants than either SQLGlot or polyglot-sql. The --- entries in the table — queries that a parser could not handle at all — appear exclusively in the SQLGlot/polyglot-sql columns, never in JSqlParser's. Broad dialect coverage and competitive performance go hand in hand.
On the synthetic queries, polyglot-sql (Rust) leads in token-heavy workloads like large_in and values where native-code tokenisation shines. JSqlParser holds its own as the second-fastest on most of these, and takes the top spot on deep_arithmetic — all while running on the JVM with no native extensions.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
__pycache__		__pycache__
gradle/wrapper		gradle/wrapper
queries		queries
src/jmh/java/net/sf/jsqlparser/benchmark		src/jmh/java/net/sf/jsqlparser/benchmark
.gitignore		.gitignore
README.md		README.md
benchmark_score.svg		benchmark_score.svg
build.gradle		build.gradle
compare_parsers.py		compare_parsers.py
gradlew		gradlew
gradlew.bat		gradlew.bat
settings.gradle		settings.gradle
sqlglot_prefilter.py		sqlglot_prefilter.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

JSqlParser Parse Benchmark

Queries

Synthetic queries (from SQLGlot)

Real-world queries (from JSqlParser's test suite)

Custom queries

Running

Full JMH benchmark (rigorous)

Single query

Quick sanity check (no JMH, best-of-5)

Comparing parsers

Python parser variants (auto-detected)

Usage

Example output

Observations

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

JSqlParser Parse Benchmark

Queries

Synthetic queries (from SQLGlot)

Real-world queries (from JSqlParser's test suite)

Custom queries

Running

Full JMH benchmark (rigorous)

Single query

Quick sanity check (no JMH, best-of-5)

Comparing parsers

Python parser variants (auto-detected)

Usage

Example output

Observations

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages