THRIFT-5941: Add Ruby ext cppcheck coverage#3387

Open
kpumuk wants to merge 1 commit into apache:master from kpumuk:rb-cppcheck

Contributor

@kpumuk kpumuk commented Apr 9, 2026

Introduce cppcheck static code analysis (SCA) for the Ruby native extension under lib/rb/ext, and fix the existing findings so it can run in CI without noise.

Offenses addressed:

lib/rb/ext/compact_protocol.c:128:24: error: Shifting signed 32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
  return (n << 1) ^ (n >> 31);
                       ^
lib/rb/ext/compact_protocol.c:132:24: error: Shifting signed 64-bit value by 63 bits is undefined behaviour [shiftTooManyBitsSigned]
  return (n << 1) ^ (n >> 63);
                       ^
lib/rb/ext/compact_protocol.c:427:33: error: Signed integer overflow for expression '-(n&1)'. [integerOverflow]
  return (((uint32_t)n) >> 1) ^ -(n & 1);
                                ^
lib/rb/ext/struct.c:255:11: style: The scope of the variable 'key' can be reduced. [variableScope]
    VALUE key;
          ^
lib/rb/ext/struct.c:256:11: style: The scope of the variable 'val' can be reduced. [variableScope]
    VALUE val;
          ^
lib/rb/ext/struct.c:492:9: style: The scope of the variable 'i' can be reduced. [variableScope]
    int i;
        ^
lib/rb/ext/struct.c:531:9: style: The scope of the variable 'i' can be reduced. [variableScope]
    int i;
        ^
lib/rb/ext/struct.c:558:9: style: The scope of the variable 'i' can be reduced. [variableScope]
    int i;
        ^
lib/rb/ext/thrift_native.c:124:0: style: The function 'Init_thrift_native' is never used. [unusedFunction]
RUBY_FUNC_EXPORTED void Init_thrift_native(void) {
^
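For readers following along, the first three findings all come from doing ZigZag arithmetic on signed integers. Below is a minimal sketch of an encode step that stays in unsigned arithmetic throughout. The function names zigzag_encode32 and zigzag_encode64 are hypothetical, and this is not necessarily the code that landed in compact_protocol.c.

```c
#include <stdint.h>

/* Hypothetical helper: ZigZag-encode a signed 32-bit value.
 * All shifts and the negation happen on unsigned types, so there is
 * no signed shift and no signed overflow for cppcheck to flag. */
static uint32_t zigzag_encode32(int32_t n) {
  uint32_t u = (uint32_t)n;                      /* well-defined modulo conversion */
  uint32_t sign_mask = (uint32_t)0 - (u >> 31);  /* 0 or 0xFFFFFFFF, stands in for (n >> 31) */
  return (u << 1) ^ sign_mask;
}

/* Same idea for 64-bit values. */
static uint64_t zigzag_encode64(int64_t n) {
  uint64_t u = (uint64_t)n;
  uint64_t sign_mask = (uint64_t)0 - (u >> 63);  /* 0 or 0xFFFFFFFFFFFFFFFF */
  return (u << 1) ^ sign_mask;
}
```

With this shape, INT32_MIN encodes to 0xFFFFFFFF and INT64_MIN to 0xFFFFFFFFFFFFFFFF, the same raw values the decode path has to handle.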
  • Did you create an Apache Jira ticket? THRIFT-5941
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

@mergeable mergeable bot added the ruby and github_actions labels Apr 9, 2026
}

-static int64_t read_varint64(VALUE self) {
+static uint64_t read_varint64(VALUE self) {
Member

Why do we change that to uint64_t? That introduces a lot of implicit things going on. Do we really want that?

Contributor Author

@kpumuk kpumuk Apr 13, 2026

read_varint64() should return uint64_t because it reads raw varint bits, not a signed Thrift value yet. For values like the ZigZag encoding of INT64_MIN, the wire value is 0xffffffffffffffff, which is representable as uint64_t but not as a meaningful positive int64_t.

UBSan reports the problem in the 64-bit decode path when read_varint64() returns int64_t and the raw unsigned value is forced through a signed type too early:

runtime error: implicit conversion from type 'uint64_t' ... value 18446744073709551615 to type 'int64_t' ... changed the value to -1

The fix keeps the two steps separate:

  • read_varint64() -> uint64_t for raw varint bits
  • zig_zag_to_ll(uint64_t) -> int64_t for the actual signed decode
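
A minimal sketch of what the signed decode step can look like when it takes unsigned input. The names zigzag_decode64 and zigzag_decode32 are hypothetical stand-ins for zig_zag_to_ll and zig_zag_to_int, and the bodies are an assumption, not the code from this PR.

```c
#include <stdint.h>

/* Hypothetical stand-in for zig_zag_to_ll(uint64_t) -> int64_t.
 * The raw value is halved while still unsigned, so every cast to a
 * signed type below is value-preserving. */
static int64_t zigzag_decode64(uint64_t n) {
  uint64_t half = n >> 1;        /* at most INT64_MAX */
  if (n & 1u) {
    return -(int64_t)half - 1;   /* odd raw values encode negatives */
  }
  return (int64_t)half;          /* even raw values encode non-negatives */
}

/* Hypothetical stand-in for zig_zag_to_int(uint32_t) -> int32_t. */
static int32_t zigzag_decode32(uint32_t n) {
  uint32_t half = n >> 1;        /* at most INT32_MAX */
  if (n & 1u) {
    return -(int32_t)half - 1;
  }
  return (int32_t)half;
}
```

For the wire value 0xffffffffffffffff, half is INT64_MAX and the odd branch yields -INT64_MAX - 1, which is INT64_MIN without any out-of-range conversion.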

Contributor Author

@kpumuk kpumuk Apr 13, 2026

With the last amend:

  • All conversions are explicit, including the seqid overflow handling (potentially overkill and unnecessary, but it mirrors the logic in Ruby)
  • Separated int32 from int64 path via read_varint32 / read_varint64
  • ZigZag decode uses unsigned ints

Still lacking:

  • Size enforcement: C++ limits varints to 5 bytes for int32 and 10 bytes for int64, and caps binary/name lengths at INT32_MAX by reading into a signed int32_t and checking >= 0. Semantically, we should fail on lengths > INT32_MAX
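
To make the missing piece concrete, here is a rough sketch of bounded varint reading plus a length check, working on a plain byte buffer rather than on the Ruby transport. Everything in it (read_varint_bounded, checked_length, the macro names, the buffer-based interface) is hypothetical and only illustrates the 5/10-byte limits and the INT32_MAX cap mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_VARINT32_BYTES 5u   /* limit quoted above for int32 */
#define MAX_VARINT64_BYTES 10u  /* limit quoted above for int64 */

/* Hypothetical helper: decode one varint from buf, consuming at most
 * max_bytes. Returns the number of bytes consumed, or 0 if the input is
 * truncated or longer than the limit. In this sketch, payload bits of a
 * tenth byte that do not fit in 64 bits are silently dropped. */
static size_t read_varint_bounded(const uint8_t *buf, size_t len,
                                  size_t max_bytes, uint64_t *out) {
  uint64_t result = 0;
  if (max_bytes > MAX_VARINT64_BYTES) {
    max_bytes = MAX_VARINT64_BYTES;
  }
  size_t limit = len < max_bytes ? len : max_bytes;
  for (size_t i = 0; i < limit; i++) {
    result |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
    if ((buf[i] & 0x80) == 0) {  /* continuation bit clear: last byte */
      *out = result;
      return i + 1;
    }
  }
  return 0;
}

/* Hypothetical helper: accept a raw length only if it is <= INT32_MAX.
 * Returns 1 and stores the length on success, 0 otherwise. */
static int checked_length(uint64_t raw, int32_t *length) {
  if (raw > (uint64_t)INT32_MAX) {
    return 0;
  }
  *length = (int32_t)raw;
  return 1;
}
```

A 32-bit caller would pass MAX_VARINT32_BYTES and still needs to range-check the result, since five bytes can carry up to 35 payload bits.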

Member

The cast to int32_t seems stale:

  • read_varint64() returns uint64_t
  • zig_zag_to_int() expects uint32_t

Contributor Author

Nice catch. I was following the cppcheck complaints and missed this :-(
I also noticed that we do not validate the size of varints in Ruby (C++ fails on more than 10 bytes), but I will leave that for the future.

@kpumuk kpumuk force-pushed the rb-cppcheck branch 2 times, most recently from a2ab025 to 4f496f7 Compare April 13, 2026 23:52
Member

@Jens-G Jens-G left a comment

+1 LGTM with one extra request

end

it "should decode i32 minima from direct canonical zigzag bytes" do
trans = Thrift::MemoryBufferTransport.new
Member

Can we have that for i64 too?

Contributor Author

Added
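
For anyone wanting to double-check the bytes such specs feed in: the ZigZag encodings of the i32 and i64 minima are 0xFFFFFFFF and 0xFFFFFFFFFFFFFFFF, and the sketch below turns those raw values into varint bytes. write_varint is a hypothetical helper written for this note, not a function from the extension or the spec.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write value as a varint into out, which must have
 * room for 10 bytes. Returns the number of bytes written. */
static size_t write_varint(uint64_t value, uint8_t *out) {
  size_t n = 0;
  while (value >= 0x80) {
    out[n++] = (uint8_t)((value & 0x7f) | 0x80);  /* low 7 bits + continuation bit */
    value >>= 7;
  }
  out[n++] = (uint8_t)value;                      /* final byte, continuation bit clear */
  return n;
}
```

Fed 0xFFFFFFFF this produces the five bytes ff ff ff ff 0f, and fed 0xFFFFFFFFFFFFFFFF it produces nine ff bytes followed by 01.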

Client: rb

Co-Authored-By: OpenAI Codex (GPT-5.4) <codex@openai.com>