Support octal escape sequences. by kaos · Pull Request #100 · arithy/packcc

kaos · 2026-06-15T20:10:33Z

I've also verified this change by looking at the generated parser rule code.

Diff for parser.c when using the octal character class before/after this change:

    PCC_DEBUG(ctx->auxil, PCC_DBG_EVALUATE, "UPPER_LETTER", ctx->level, chunk->pos, ctx->buffer.p + chunk->pos, ctx->buffer.n - chunk->pos);
    ctx->level++;
    {
        int u;
        const size_t n = pcc_get_char_as_utf32(ctx, &u);
        if (n == 0) goto L0000;
        if (!(
-            u == 0x000031 ||
-            u == 0x000030 ||
-            (u >= 0x000031 && u <= 0x000031) ||
-            u == 0x000033 ||
-            u == 0x000032
+            (u >= 0x000041 && u <= 0x00005a)
        )) goto L0000;
        ctx->cur += n;
    }

arithy · 2026-06-15T21:34:38Z

Hi @kaos, thank you very much for your contribution.
I want to accept this PR. However, it fails the test null.d.
So, I would be thankful if you could fix it.
(Initially I misread your implementation and closed the PR once, sorry.)

kaos · 2026-06-16T12:19:14Z

@arithy

Hi, certainly!

I thought I ran all the tests before submitting, but I obviously failed to do so 🤦🏽

Looking at the null.d test, there are escape sequences of the form \0123. In the docs you say you support ANSI C escape codes, and https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf (the best freely available copy of the standard I could find) states in section 6.4.4.4, paragraph 7:

Each octal or hexadecimal escape sequence is the longest sequence of characters that can
constitute the escape sequence.

So the previous escape should be parsed into two characters: \012 and 3. Before this change, it was parsed into four: \0, 1, 2 and 3.
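
To make the longest-match rule concrete, here is a minimal sketch of an octal escape decoder. It is not PackCC's implementation: the function name `decode_octal_escape` and its interface are hypothetical, and it only assumes the ISO C rule that an octal escape is a backslash followed by one, two, or three octal digits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not taken from PackCC): decodes one octal escape
 * that starts at s[0] == '\\'. It consumes as many octal digits as it can,
 * up to the ISO C maximum of three. Returns the number of input characters
 * consumed (0 if s does not start with an octal escape) and stores the
 * decoded value in *out. */
static size_t decode_octal_escape(const char *s, unsigned *out) {
    assert(s != NULL && out != NULL);
    if (s[0] != '\\') return 0;

    unsigned value = 0;
    size_t i = 1; /* index of the next candidate digit */
    while (i <= 3 && s[i] >= '0' && s[i] <= '7') {
        value = value * 8 + (unsigned)(s[i] - '0');
        i++;
    }
    if (i == 1) return 0; /* a backslash with no octal digit after it */

    *out = value;
    return i;
}
```

Called on the text `\0123`, this sketch consumes four characters (`\012`) and yields the value 10, leaving the `3` as an ordinary character. Called on `\0`, it consumes two characters and yields 0.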

It would seem that this is a breaking change, but one that follows the ISO C specification. Not sure how you would prefer to resolve this?

(I'll take this opportunity to note that PackCC seems to enforce exactly two hex digits for \x escape sequences, which also made me raise an eyebrow, as \x5 is a valid escape according to ISO C.)
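
The difference between a two-digit limit and the ISO C rule for \x can be sketched as follows. This is an illustration only: `decode_hex_escape` and its `max_digits` parameter are hypothetical names, and the two-digit mode is just a cap that approximates the behaviour described above (it does not reject shorter sequences such as \x5).

```c
#include <assert.h>
#include <stddef.h>

/* Returns the numeric value of a hex digit, or -1 if c is not one. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper (not taken from PackCC): decodes one \x escape at s.
 * max_digits == 0 means "no limit", i.e. the ISO C longest-match rule;
 * any other value caps the number of hex digits consumed.
 * Returns the number of input characters consumed (0 if s is not a hex
 * escape) and stores the value in *out. Overflow on very long digit runs
 * is not handled in this sketch. */
static size_t decode_hex_escape(const char *s, size_t max_digits,
                                unsigned long *out) {
    assert(s != NULL && out != NULL);
    if (s[0] != '\\' || s[1] != 'x') return 0;

    unsigned long value = 0;
    size_t n = 0; /* number of hex digits consumed so far */
    while ((max_digits == 0 || n < max_digits) && hex_value(s[2 + n]) >= 0) {
        value = value * 16 + (unsigned long)hex_value(s[2 + n]);
        n++;
    }
    if (n == 0) return 0; /* "\x" with no hex digit after it */

    *out = value;
    return 2 + n;
}
```

With a cap of two digits, the text `\x00123` decodes to the value 0 followed by the literal characters `123`. With no cap, the same text is a single escape with the value 0x123, which is how a C compiler would read it.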

This diff to the tests makes them pass again (by inserting x0 after the backslash in the affected places, so that \0123 becomes \x00123):

--- a/tests/null.d/input.peg
+++ b/tests/null.d/input.peg
@@ -6,8 +6,8 @@ CHAR_CLASS_0 <- "char_class_0_a:" [abc\0-!123]+ { printf("CHAR_CLASS_0_A\n"); }
 CHAR_CLASS_1 <- "char_class_1_a:" [abc\0]+ { printf("CHAR_CLASS_1_A\n"); } / "char_class_1_b:" [abc\x00]+ { printf("CHAR_CLASS_1_B\n"); }
 CHAR_CLASS_2 <- "char_class_2_a:" [\0-!]+ { printf("CHAR_CLASS_2_A\n"); } / "char_class_2_b:" [\x00-!]+ { printf("CHAR_CLASS_2_B\n"); }
 CHAR_CLASS_3 <- "char_class_3_a:" [\0]+ { printf("CHAR_CLASS_3_A\n"); } / "char_class_3_b:" [\x00]+ { printf("CHAR_CLASS_3_B\n"); }
-STRING_0 <- "string_0_a:" "abc\0123" { printf("STRING_0_A\n"); } / "string_0_b:" "abc\x00123" { printf("STRING_0_B\n"); }
+STRING_0 <- "string_0_a:" "abc\x00123" { printf("STRING_0_A\n"); } / "string_0_b:" "abc\x00123" { printf("STRING_0_B\n"); }
 STRING_1 <- "string_1_a:" "abc\0" { printf("STRING_1_A\n"); } / "string_1_b:" "abc\x00" { printf("STRING_1_B\n"); }
-STRING_2 <- "string_2_a:" "\0123" { printf("STRING_2_A\n"); } / "string_2_b:" "\x00123" { printf("STRING_2_B\n"); }
+STRING_2 <- "string_2_a:" "\x00123" { printf("STRING_2_A\n"); } / "string_2_b:" "\x00123" { printf("STRING_2_B\n"); }
 STRING_3 <- "string_3_a:" "\0" { printf("STRING_3_A\n"); } / "string_3_b:" "\x00" { printf("STRING_3_B\n"); }
-CAPTURED <- "captured_a:" < CHAR_CLASS_0 "xyz\0123" > "|" $1 { printf("CAPTURED_A\n"); } / "captured_b:" < CHAR_CLASS_0 "xyz\x00123" > "|" $2 { printf("CAPTURED_B\n"); }
+CAPTURED <- "captured_a:" < CHAR_CLASS_0 "xyz\x00123" > "|" $1 { printf("CAPTURED_A\n"); } / "captured_b:" < CHAR_CLASS_0 "xyz\x00123" > "|" $2 { printf("CAPTURED_B\n"); }

Support octal escape sequences.

631b016

arithy closed this Jun 15, 2026

arithy reopened this Jun 15, 2026

kaos wants to merge 1 commit into arithy:main from kaos:octal_escape
