Commit 4ba800d

refactor: restructure token handling with NonReservedWord() sentinels

Replace the manual updateKeywords task and hardcoded keyword whitelist with a NonReservedWord() BNF production bracketed by MIN/MAX sentinel tokens. Non-reserved keywords are now determined by O(1) range check in isIdentifierAhead() instead of enumerating them in RelObjectNameWithoutValue().

- Refactor ParserKeywordsUtils: derive keywords dynamically from CCJSqlParserConstants + grammar file, remove hardcoded ALL_RESERVED_KEYWORDS array and RESTRICTED_* flags
- Consolidate ConditionalKeywordsTest into KeywordsTest
- Simplify updateKeywords Gradle task (generates RST doc only)
- Update contribution.rst and usage.rst documentation
- allow nested comments

Fixes #1175

Signed-off-by: Andreas Reichel <andreas@manticore-projects.com>
Signed-off-by: manticore-projects <andreas@manticore-projects.com>

1 parent 08d0bcc commit 4ba800d
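
The range check described in the commit message can be illustrated with a minimal Java sketch. It is not taken from the commit: the numeric token kinds, the class name and the helper names are hypothetical, and the assumption that the sentinels themselves are excluded from the range is ours. In the real parser the kinds would come from the JavaCC-generated constants interface.

```java
// Illustrative sketch only; token kind values and helper names are hypothetical.
final class NonReservedRangeSketch {
    // JavaCC numbers tokens in declaration order, so everything declared
    // between the two sentinels falls into one contiguous range of kinds.
    static final int MIN_NON_RESERVED_WORD = 10; // sentinel (assumed value)
    static final int K_OVERLAPS = 11;            // non-reserved keyword (assumed value)
    static final int MAX_NON_RESERVED_WORD = 12; // sentinel (assumed value)
    static final int K_SELECT = 13;              // reserved, declared after MAX (assumed value)
    static final int S_IDENTIFIER = 20;          // plain identifier token (assumed value)

    // Two integer comparisons replace a lookup in an enumerated keyword list.
    // Assumption: the sentinels are not keywords themselves, so the bounds are exclusive.
    static boolean isNonReservedWord(int kind) {
        return kind > MIN_NON_RESERVED_WORD && kind < MAX_NON_RESERVED_WORD;
    }

    // A token can act as an unquoted identifier if it is a plain identifier
    // or a keyword inside the non-reserved range.
    static boolean canBeIdentifier(int kind) {
        return kind == S_IDENTIFIER || isNonReservedWord(kind);
    }
}
```

With this layout, adding a keyword between the sentinels needs no change to the check itself, which is the maintenance benefit the commit message points to.
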

17 files changed: +980 -983 lines changed

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ jobs:
  run: sudo apt-get install -y xsltproc sphinx-common

  - name: Install Python dependencies
- run: pip install furo myst_parser sphinx_substitution_extensions sphinx_issues sphinx_inline_tabs pygments
+ run: pip install manticore_sphinx_theme myst_parser sphinx_substitution_extensions sphinx_issues sphinx_inline_tabs pygments

  - name: Build Sphinx documentation with Gradle
  run: FLOATING_TOC=false ./gradlew -DFLOATING_TOC=false gitChangelogTask renderRR xslt xmldoc sphinx

build.gradle

Lines changed: 2 additions & 2 deletions
@@ -91,7 +91,7 @@ tasks.register('generateBuildInfo') {
  def buildTime = Instant.now().toString()

  def content = """\
- |package ai.starlake.jsqltranspiler;
+ |package net.sf.jsqlparser;
  |
  |public final class BuildInfo {
  | public static final String NAME = "${project.name}";
@@ -539,7 +539,7 @@ Version {{name}}

  tasks.register('updateKeywords', JavaExec) {
  group = "Execution"
- description = "Run the main class with JavaExecTask"
+ description = "Generate the Reserved Keywords documentation"
  classpath = sourceSets.main.runtimeClasspath
  args = [
  file('src/main/jjtree/net/sf/jsqlparser/parser/JSqlParserCC.jjt').absolutePath

src/main/java/net/sf/jsqlparser/expression/OracleHint.java

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ public class OracleHint extends ASTNodeAccessImpl implements Expression {

  private static final Pattern SINGLE_LINE = Pattern.compile("--\\+ *([^ ].*[^ ])");
  private static final Pattern MULTI_LINE =
- Pattern.compile("\\/\\*\\+ *([^ ].*[^ ]) *\\*+\\/", Pattern.MULTILINE | Pattern.DOTALL);
+ Pattern.compile("/\\*\\+ *([^ ].*[^ ]) *\\*+/", Pattern.MULTILINE | Pattern.DOTALL);

  private String value;
  private boolean singleLine = false;
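
The `MULTI_LINE` change above only drops the backslashes in front of `/`, which `java.util.regex` does not require. A small sketch, assuming a hypothetical `extract` helper and a made-up sample hint (neither is part of `OracleHint`), suggests both patterns capture the same text:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; the class, helper and sample hint are hypothetical.
final class OracleHintPatternSketch {
    // Pattern as it was before the commit: '/' escaped although Java does not need it.
    static final Pattern OLD_MULTI_LINE =
            Pattern.compile("\\/\\*\\+ *([^ ].*[^ ]) *\\*+\\/", Pattern.MULTILINE | Pattern.DOTALL);

    // Pattern after the commit: the redundant escapes are removed.
    static final Pattern NEW_MULTI_LINE =
            Pattern.compile("/\\*\\+ *([^ ].*[^ ]) *\\*+/", Pattern.MULTILINE | Pattern.DOTALL);

    // Returns the captured hint text, or null when the comment is not a hint.
    static String extract(Pattern pattern, String comment) {
        Matcher matcher = pattern.matcher(comment);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

Calling `extract` with either pattern on a comment such as `/*+ PARALLEL(t 4) */` yields the same captured group, so the edit reads as a cleanup rather than a behaviour change.
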

src/main/java/net/sf/jsqlparser/parser/ParserKeywordsUtils.java

Lines changed: 159 additions & 399 deletions
Large diffs are not rendered by default.

src/main/jjtree/net/sf/jsqlparser/parser/JSqlParserCC.jjt

Lines changed: 395 additions & 124 deletions
Large diffs are not rendered by default.

src/site/sphinx/contribution.rst

Lines changed: 11 additions & 33 deletions
@@ -97,49 +97,27 @@ The JSQLParser is generated by ``JavaCC`` based on the provided Grammar. The Gra
  Manage Reserved Keywords
  ------------------------------

- Since JSQLParser is built by JavaCC from a Token based Grammar, ``Reserved Keywords`` need a special treatment. All Tokens of the Grammar would become ``Reserved Keywords`` -- unless explicitly allowed and white-listened.
+ Since JSQLParser is built by JavaCC from a Token based Grammar, ``Reserved Keywords`` need a special treatment. All Tokens of the Grammar would become ``Reserved Keywords`` -- unless explicitly allowed as identifiers.
+
+ The Grammar uses a ``NonReservedWord()`` BNF production with inline token declarations, bracketed by ``MIN_NON_RESERVED_WORD`` and ``MAX_NON_RESERVED_WORD`` sentinel tokens. JavaCC assigns consecutive token kind values to the inline declarations, which enables an efficient O(1) range check in ``isIdentifierAhead()`` to determine whether a token can be used as an unquoted identifier.

  .. code-block:: sql
- :caption: White-list Keyword example
+ :caption: Non-reserved keyword example

- -- <K_OVERLAPS:"OVERLAPS"> is a Token, recently defined in the Grammar
- -- Although it is not restricted by the SQL Standard and could be used for Column, Table and Alias names
- -- Explicitly white-listing OVERLAPS by adding it to the RelObjectNameWithoutValue() Production will allow for parsing the following statement
+ -- <K_OVERLAPS:"OVERLAPS"> is defined as a non-reserved keyword inside the NonReservedWord() production
+ -- It can be used for Column, Table and Alias names without quoting

  SELECT Overlaps( overlaps ) AS overlaps
  FROM overlaps.overlaps overlaps
  WHERE overlaps = 'overlaps'
  AND (CURRENT_TIME, INTERVAL '1' HOUR) OVERLAPS (CURRENT_TIME, INTERVAL -'1' HOUR)
  ;

- So we will need to define and white-list any Keywords which may be allowed for Object Names (such as `Schema`, `Table`, `Column`, `Function`, `Alias`). This White-List must be updated whenever the Tokens of the Grammar change (e. |_| g. when adding a new Token or Production).
-
- There is a task ``updateKeywords`` for Gradle and Maven, which will:
-
- 1) Parse the Grammar in order to find all Token definitions
- 2) Read the list of explicitly ``Reserved Keywords`` from ``net/sf/jsqlparser/parser/ParserKeywordsUtils.java``
- 3) Derive the list of ``White-Listed Keywords`` as difference between ``All Tokens`` and ``Reserved Keywords``
- 4) Modifies the Grammar Productions ``RelObjectNameWithoutValue...`` adding all Tokens according to ``White-Listed Keywords``
- 5) Run two special Unit Tests to verify parsing of all ``White-Listed Keywords`` (as `Schema`, `Table`, `Column`, `Function` or `Alias`)
- 6) Update the web page about the Reserved Keywords
-
-
- .. tab:: Gradle
-
- .. code-block:: shell
- :caption: Gradle `updateKeywords` Task
-
- gradle updateKeywords
-
- .. tab:: Maven
-
- .. code-block:: shell
- :caption: Maven `updateKeywords` Task
-
- mvn exec:java
-
+ When adding a new keyword token to the Grammar:

- Without this Gradle Task, any new Token or Production will become a ``Reserved Keyword`` automatically and can't be used for Object Names without quoting.
+ 1) If the keyword should be usable as an unquoted identifier (the common case), add its inline token declaration to the ``NonReservedWord()`` production. It will automatically be placed between the sentinel tokens and recognised by the range check.
+ 2) If the keyword must be reserved (e. |_| g. core SQL syntax like ``SELECT``, ``FROM``, ``WHERE``), add it to the ``Reserved SQL Keywords`` TOKEN block **after** the ``MAX_NON_RESERVED_WORD`` sentinel.
+ 3) Verify that existing tests pass and that the keyword can be used as a ``Schema``, ``Table``, ``Column``, ``Function`` or ``Alias`` name where expected.


  Commit a Pull Request
@@ -196,4 +174,4 @@ Please consider using `Conventional Commits` and structure your commit message a
  * - **revert**
  - reverts one or many previous commits

- Please visit `Better Programming <https://betterprogramming.pub/write-better-git-commit-messages-to-increase-your-productivity-89fa773e8375>`_ for more information and guidance.
+ Please visit `Better Programming <https://betterprogramming.pub/write-better-git-commit-messages-to-increase-your-productivity-89fa773e8375>`_ for more information and guidance.
