From 2f4e0c450a77cf197cd0778df226381d05de9a62 Mon Sep 17 00:00:00 2001
From: Claude
Date: Tue, 16 Jun 2026 14:37:55 +0000
Subject: [PATCH] Add binary serialization of the parsed AST

This is the bridge that lets the WebAssembly build hand a parsed AST back to
Ruby without the Ruby C API: the parser serializes the tree to a compact
binary buffer, and pure-Ruby code rebuilds the same RBS::AST objects on the
other side. This is what will let RBS run on JRuby.

Both ends are generated from config.yml, alongside the existing C -> Ruby
translation, so they stay in sync:

- src/serialize.c (rbs_serialize_node): walks the C AST into the binary format.
- lib/rbs/wasm/serialization_schema.rb: the table the decoder follows.
- lib/rbs/wasm/deserializer.rb: pure-Ruby decoder, the counterpart of
  ast_translation.c. Locations go through the public RBS::Location API so it
  works whether Location is C-backed (CRuby) or pure Ruby (JRuby).

The format is documented in docs/wasm_serialization.md.

To validate it on CRuby, the extension exposes `_parse_*_to_bytes`, and
test/rbs/wasm/serialization_test.rb round-trips the whole bundled RBS corpus
(core/stdlib/sig) plus type/method-type batteries, asserting the rebuilt tree
is deeply identical to the direct C -> Ruby translation, down to locations and
string encodings.

Notably, rbs_hash_t does not maintain its `length` field (unlike
rbs_node_list_t), so the serializer counts hash entries by walking the list.

The template generator now also emits Ruby files (with a Ruby-style header).

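For illustration only, here is a minimal sketch of how the wire primitives
described in docs/wasm_serialization.md (u8, little-endian u32/i32, `str`, and
a location range with its presence flag) could be read. `WireCursor` and its
method names are hypothetical and are not part of this patch; the real decoder
is RBS::WASM::Deserializer.

```ruby
# Hypothetical helper (not part of this patch): a small cursor over the
# primitives of the binary format.
class WireCursor
  def initialize(bytes)
    @bytes = bytes.b # work on a binary copy of the buffer
    @pos = 0
  end

  # u8: one unsigned byte.
  def u8
    byte = @bytes.getbyte(@pos) or raise "Unexpected end of buffer"
    @pos += 1
    byte
  end

  # u32 / i32: four bytes, little-endian.
  def u32
    take(4).unpack1("L<")
  end

  def i32
    take(4).unpack1("l<")
  end

  # str: a u32 byte length followed by that many raw bytes.
  def str
    take(u32)
  end

  # Location range: presence flag, then i32 start and i32 end when present.
  def range
    return nil if u8 == 0

    start_char = i32
    end_char = i32
    [start_char, end_char]
  end

  private

  # Returns the next n bytes and advances, raising on a short buffer.
  def take(n)
    chunk = @bytes.byteslice(@pos, n)
    raise "Unexpected end of buffer" if chunk.nil? || chunk.bytesize < n
    @pos += n
    chunk
  end
end

# A `str` holding "foo", a present range 3...6, then a null range.
wire = [3].pack("L<") + "foo" + [1, 3, 6].pack("Cl<l<") + [0].pack("C")
cursor = WireCursor.new(wire)
name = cursor.str
present = cursor.range
absent = cursor.range
```

Unlike this sketch, the decoder in the patch reads integers in place with
`unpack1(..., offset:)` rather than slicing first.
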
https://claude.ai/code/session_01LTveMt3NLbYHEboXuzAKpA
---
 Rakefile                                     |   4 +
 docs/wasm_serialization.md                   |  80 ++
 ext/rbs_extension/main.c                     | 130 +++
 include/rbs/serialize.h                      |  33 +
 lib/rbs/wasm/deserializer.rb                 | 188 ++++
 lib/rbs/wasm/serialization_schema.rb         | 110 ++
 sig/wasm/deserializer.rbs                    |  58 ++
 sig/wasm/serialization_schema.rbs            |  13 +
 src/serialize.c                              | 946 ++++++++++++++++++
 templates/include/rbs/serialize.h.erb        |  26 +
 .../lib/rbs/wasm/serialization_schema.rb.erb |  82 ++
 templates/src/serialize.c.erb                | 221 ++++
 templates/template.rb                        |  27 +-
 test/rbs/wasm/serialization_test.rb          | 181 ++++
 14 files changed, 2091 insertions(+), 8 deletions(-)
 create mode 100644 docs/wasm_serialization.md
 create mode 100644 include/rbs/serialize.h
 create mode 100644 lib/rbs/wasm/deserializer.rb
 create mode 100644 lib/rbs/wasm/serialization_schema.rb
 create mode 100644 sig/wasm/deserializer.rbs
 create mode 100644 sig/wasm/serialization_schema.rbs
 create mode 100644 src/serialize.c
 create mode 100644 templates/include/rbs/serialize.h.erb
 create mode 100644 templates/lib/rbs/wasm/serialization_schema.rb.erb
 create mode 100644 templates/src/serialize.c.erb
 create mode 100644 test/rbs/wasm/serialization_test.rb

diff --git a/Rakefile b/Rakefile
index 8fe05a9d7..9cf0e66cc 100644
--- a/Rakefile
+++ b/Rakefile
@@ -160,6 +160,10 @@ task :templates do
   sh "#{ruby} templates/template.rb include/rbs/ast.h"
   sh "#{ruby} templates/template.rb src/ast.c"
 
+  sh "#{ruby} templates/template.rb include/rbs/serialize.h"
+  sh "#{ruby} templates/template.rb src/serialize.c"
+  sh "#{ruby} templates/template.rb lib/rbs/wasm/serialization_schema.rb"
+
   # Format the generated files
   Rake::Task["format:c"].invoke
 end
diff --git a/docs/wasm_serialization.md b/docs/wasm_serialization.md
new file mode 100644
index 000000000..adfa16200
--- /dev/null
+++ b/docs/wasm_serialization.md
@@ -0,0 +1,80 @@
+# RBS AST binary serialization
+
+This document describes the binary format used to move a parsed RBS AST out of
+the parser and into Ruby objects without going through the Ruby C API. It exists
+so that RBS can run on Ruby implementations that cannot load the C extension
+(notably JRuby): the parser runs inside WebAssembly, serializes the result with
+this format, and the host rebuilds `RBS::AST` objects in pure Ruby.
+
+The encoder (`rbs_serialize_node`, `src/serialize.c`) and the schema that drives
+the decoder (`RBS::WASM::SerializationSchema`, `lib/rbs/wasm/serialization_schema.rb`)
+are both generated from `config.yml`, so they always agree. The decoder itself
+is `RBS::WASM::Deserializer`.
+
+## Conventions
+
+- All multi-byte integers are **little-endian**.
+- `u8`, `u32` are unsigned; `i32` is signed.
+- `str` is a `u32` byte length followed by that many raw bytes (no terminator).
+- A value is reconstructed to mirror exactly what `ast_translation.c` produces,
+  including string encodings: string/integer literal nodes are UTF-8, while
+  comments, annotations and symbols use the source buffer's encoding.
+
+## Nodes
+
+Every node begins with a `u8` **tag**:
+
+- `0` — a NULL node (`nil` on the Ruby side).
+- `1..N` — a node type, in the order they appear in `SerializationSchema::SCHEMA`.
+- `SYMBOL_TAG` (`N + 1`) — an interned symbol, followed by `str` (the symbol's
+  bytes). Decoded with `String#to_sym`.
+
+A few node types are encoded specially, matching their bespoke handling in
+`ast_translation.c`:
+
+| Node | Payload after tag | Decoded as |
+| --- | --- | --- |
+| `RBS::AST::Bool` | `u8` | `true` / `false` |
+| `RBS::AST::Integer` | `str` | `String#to_i` |
+| `RBS::AST::String` | `str` | the string (UTF-8) |
+| `RBS::Types::Record::FieldType` | node, then `u8` | `[type, required]` |
+| `RBS::Signature` | node-list, then node-list | `[directives, declarations]` |
+| `RBS::Namespace` | node-list, then `u8` | `RBS::Namespace[path, absolute]` |
+| `RBS::TypeName` | node, then node | `RBS::TypeName[namespace, name]` |
+
+Every other node is encoded generically:
+
+1. If the node exposes a location, its **base location** is written (see below),
+   followed by one location range per declared child, in order.
+2. Each field is written in declaration order, encoded by its type (see below).
+
+The decoder constructs `Klass.new(location:, **fields)` (omitting `location:`
+for nodes that do not expose one). For `Class`, `Module`, `Interface`,
+`TypeAlias` and `MethodType`, `RBS::AST::TypeParam.resolve_variables` is applied
+to `type_params` first, exactly as the C translation does.
+
+## Fields
+
+| Field type | Encoding |
+| --- | --- |
+| node (`rbs_node`, `rbs_type_name`, `rbs_ast_comment`, `rbs_ast_symbol`, ...) | a node (recursive; NULL allowed) |
+| `rbs_node_list` | `u32` count, then that many nodes |
+| `rbs_hash` | `u32` count, then count × (key node, value node) |
+| `rbs_string` | `str` (source encoding) |
+| `bool` | `u8` |
+| enum | `u8` index into the enum's values (see `SCHEMA`) |
+| `rbs_location_range` | a location range |
+| `rbs_location_range_list` | `u32` count, then that many location ranges |
+| `rbs_attr_ivar_name` | `u8` tag: `0` → `nil`, `1` → `false`, `2` → `str` → symbol |
+
+## Location ranges
+
+A location range is a `u8` presence flag:
+
+- `0` — null range (`nil`, or a node with no location).
+- `1` — followed by `i32` start and `i32` end **character** positions.
+
+The base location and child ranges together let the decoder rebuild an
+`RBS::Location` (with its required/optional children) through the public
+`RBS::Location` API, so the same decoder works whether `RBS::Location` is backed
+by the C extension or a pure-Ruby implementation.
diff --git a/ext/rbs_extension/main.c b/ext/rbs_extension/main.c
index bcbb73386..ef845fc82 100644
--- a/ext/rbs_extension/main.c
+++ b/ext/rbs_extension/main.c
@@ -2,6 +2,7 @@
 #include "rbs/util/rbs_assert.h"
 #include "rbs/util/rbs_allocator.h"
 #include "rbs/util/rbs_constant_pool.h"
+#include "rbs/serialize.h"
 #include "ast_translation.h"
 #include "legacy_location.h"
 #include "rbs_string_bridging.h"
@@ -290,6 +291,132 @@ static VALUE rbsparser_parse_signature(VALUE self, VALUE buffer, VALUE start_pos
     return result;
 }
 
+// Serialize a parsed node into a binary Ruby string using the same encoder the
+// WebAssembly build uses. These `_*_to_bytes` entry points exist so the
+// round-trip (parse -> serialize -> deserialize) can be exercised on CRuby,
+// where it can be compared against the direct C -> Ruby translation.
+static VALUE serialized_node_to_string(rbs_parser_t *parser, rbs_node_t *node) {
+    rbs_string_t bytes = rbs_serialize_node(parser->allocator, &parser->constant_pool, node);
+    return rb_str_new(bytes.start, (long) rbs_string_len(bytes));
+}
+
+static VALUE parse_type_to_bytes_try(VALUE a) {
+    struct parse_type_arg *arg = (struct parse_type_arg *) a;
+    rbs_parser_t *parser = arg->parser;
+
+    if (parser->next_token.type == pEOF) {
+        return Qnil;
+    }
+
+    rbs_node_t *type;
+    rbs_parse_type(parser, &type, RTEST(arg->void_allowed), RTEST(arg->self_allowed), RTEST(arg->classish_allowed));
+
+    raise_error_if_any(parser, arg->buffer);
+
+    if (RB_TEST(arg->require_eof)) {
+        rbs_parser_advance(parser);
+        if (parser->current_token.type != pEOF) {
+            rbs_parser_set_error(parser, parser->current_token, true, "expected a token `%s`", rbs_token_type_str(pEOF));
+            raise_error(parser->error, arg->buffer);
+        }
+    }
+
+    return serialized_node_to_string(parser, type);
+}
+
+static VALUE rbsparser_parse_type_to_bytes(VALUE self, VALUE buffer, VALUE start_pos, VALUE end_pos, VALUE variables, VALUE require_eof, VALUE void_allowed, VALUE self_allowed, VALUE classish_allowed) {
+    VALUE string = rb_funcall(buffer, rb_intern("content"), 0);
+    StringValue(string);
+    rb_encoding *encoding = rb_enc_get(string);
+
+    rbs_parser_t *parser = alloc_parser_from_buffer(buffer, FIX2INT(start_pos), FIX2INT(end_pos));
+    declare_type_variables(parser, variables, buffer);
+    struct parse_type_arg arg = {
+        .buffer = buffer,
+        .encoding = encoding,
+        .parser = parser,
+        .require_eof = require_eof,
+        .void_allowed = void_allowed,
+        .self_allowed = self_allowed,
+        .classish_allowed = classish_allowed
+    };
+
+    VALUE result = rb_ensure(parse_type_to_bytes_try, (VALUE) &arg, ensure_free_parser, (VALUE) parser);
+
+    RB_GC_GUARD(string);
+
+    return result;
+}
+
+static VALUE parse_method_type_to_bytes_try(VALUE a) {
+    struct parse_method_type_arg *arg = (struct parse_method_type_arg *) a;
+    rbs_parser_t *parser = arg->parser;
+
+    if (parser->next_token.type == pEOF) {
+        return Qnil;
+    }
+
+    rbs_method_type_t *method_type = NULL;
+    rbs_parse_method_type(parser, &method_type, RB_TEST(arg->require_eof), true);
+
+    raise_error_if_any(parser, arg->buffer);
+
+    return serialized_node_to_string(parser, (rbs_node_t *) method_type);
+}
+
+static VALUE rbsparser_parse_method_type_to_bytes(VALUE self, VALUE buffer, VALUE start_pos, VALUE end_pos, VALUE variables, VALUE require_eof) {
+    VALUE string = rb_funcall(buffer, rb_intern("content"), 0);
+    StringValue(string);
+    rb_encoding *encoding = rb_enc_get(string);
+
+    rbs_parser_t *parser = alloc_parser_from_buffer(buffer, FIX2INT(start_pos), FIX2INT(end_pos));
+    declare_type_variables(parser, variables, buffer);
+    struct parse_method_type_arg arg = {
+        .buffer = buffer,
+        .encoding = encoding,
+        .parser = parser,
+        .require_eof = require_eof
+    };
+
+    VALUE result = rb_ensure(parse_method_type_to_bytes_try, (VALUE) &arg, ensure_free_parser, (VALUE) parser);
+
+    RB_GC_GUARD(string);
+
+    return result;
+}
+
+static VALUE parse_signature_to_bytes_try(VALUE a) {
+    struct parse_signature_arg *arg = (struct parse_signature_arg *) a;
+    rbs_parser_t *parser = arg->parser;
+
+    rbs_signature_t *signature = NULL;
+    rbs_parse_signature(parser, &signature);
+
+    raise_error_if_any(parser, arg->buffer);
+
+    return serialized_node_to_string(parser, (rbs_node_t *) signature);
+}
+
+static VALUE rbsparser_parse_signature_to_bytes(VALUE self, VALUE buffer, VALUE start_pos, VALUE end_pos) {
+    VALUE string = rb_funcall(buffer, rb_intern("content"), 0);
+    StringValue(string);
+    rb_encoding *encoding = rb_enc_get(string);
+
+    rbs_parser_t *parser = alloc_parser_from_buffer(buffer, FIX2INT(start_pos), FIX2INT(end_pos));
+    struct parse_signature_arg arg = {
+        .buffer = buffer,
+        .encoding = encoding,
+        .parser = parser,
+        .require_eof = false
+    };
+
+    VALUE result = rb_ensure(parse_signature_to_bytes_try, (VALUE) &arg, ensure_free_parser, (VALUE) parser);
+
+    RB_GC_GUARD(string);
+
+    return result;
+}
+
 struct parse_type_params_arg {
     VALUE buffer;
     rb_encoding *encoding;
@@ -462,6 +589,9 @@ void rbs__init_parser(void) {
     rb_define_singleton_method(RBS_Parser, "_parse_type", rbsparser_parse_type, 8);
     rb_define_singleton_method(RBS_Parser, "_parse_method_type", rbsparser_parse_method_type, 5);
     rb_define_singleton_method(RBS_Parser, "_parse_signature", rbsparser_parse_signature, 3);
+    rb_define_singleton_method(RBS_Parser, "_parse_type_to_bytes", rbsparser_parse_type_to_bytes, 8);
+    rb_define_singleton_method(RBS_Parser, "_parse_method_type_to_bytes", rbsparser_parse_method_type_to_bytes, 5);
+    rb_define_singleton_method(RBS_Parser, "_parse_signature_to_bytes", rbsparser_parse_signature_to_bytes, 3);
     rb_define_singleton_method(RBS_Parser, "_parse_type_params", rbsparser_parse_type_params, 4);
     rb_define_singleton_method(RBS_Parser, "_parse_inline_leading_annotation", rbsparser_parse_inline_leading_annotation, 4);
     rb_define_singleton_method(RBS_Parser, "_parse_inline_trailing_annotation", rbsparser_parse_inline_trailing_annotation, 4);
diff --git a/include/rbs/serialize.h b/include/rbs/serialize.h
new file mode 100644
index 000000000..e55b7b64c
--- /dev/null
+++ b/include/rbs/serialize.h
@@ -0,0 +1,33 @@
+/*----------------------------------------------------------------------------*/
+/* This file is generated by the templates/template.rb script and should not */
+/* be modified manually.                                                      */
+/* To change the template see                                                 */
+/* templates/include/rbs/serialize.h.erb                                      */
+/*----------------------------------------------------------------------------*/
+
+#ifndef RBS__SERIALIZE_H
+#define RBS__SERIALIZE_H
+
+#include "rbs/ast.h"
+#include "rbs/string.h"
+#include "rbs/util/rbs_allocator.h"
+#include "rbs/util/rbs_constant_pool.h"
+
+/**
+ * Serialize a parsed AST node into a compact, portable binary buffer.
+ *
+ * The format is consumed by RBS::WASM::Deserializer on the Ruby side, which
+ * rebuilds the same `RBS::AST` objects that the C extension would have built
+ * directly. This is what lets RBS run on Ruby implementations that cannot load
+ * the C extension (notably JRuby): the parser runs inside WebAssembly, produces
+ * this buffer, and the host reconstructs the tree in pure Ruby.
+ *
+ * The buffer is allocated from `allocator`, so its lifetime is tied to that
+ * allocator. `constant_pool` must be the pool the node was parsed with; it is
+ * used to resolve interned symbol/identifier ids back into their bytes.
+ *
+ * See `docs/wasm_serialization.md` for the wire format.
+ */
+rbs_string_t rbs_serialize_node(rbs_allocator_t *allocator, rbs_constant_pool_t *constant_pool, rbs_node_t *node);
+
+#endif
diff --git a/lib/rbs/wasm/deserializer.rb b/lib/rbs/wasm/deserializer.rb
new file mode 100644
index 000000000..4b19331a1
--- /dev/null
+++ b/lib/rbs/wasm/deserializer.rb
@@ -0,0 +1,188 @@
+# frozen_string_literal: true
+
+require_relative "serialization_schema"
+
+module RBS
+  module WASM
+    # Rebuilds RBS::AST objects from the binary buffer produced by
+    # `rbs_serialize_node` (src/serialize.c), driven by the generated
+    # SerializationSchema. This is the pure-Ruby counterpart of the C extension's
+    # ast_translation.c, used when the parser runs inside WebAssembly (JRuby).
+    #
+    # All locations are reconstructed through the public RBS::Location API, so the
+    # same code works whether RBS::Location is backed by the C extension (CRuby)
+    # or by a pure-Ruby implementation (JRuby).
+    class Deserializer
+      # Deserialize a buffer produced for a whole signature, returning
+      # `[directives, declarations]` to match RBS::Parser._parse_signature.
+      def self.deserialize(bytes, buffer)
+        new(bytes, buffer).read_node
+      end
+
+      def initialize(bytes, buffer)
+        @bytes = bytes
+        @buffer = buffer
+        # Symbols and rbs_string fields (comments, annotations) inherit the
+        # source encoding, matching ast_translation.c. String/Integer literal
+        # nodes are always UTF-8 (see read_node).
+        @encoding = buffer.content.encoding
+        @pos = 0
+        @class_cache = {} #: Hash[String, untyped]
+      end
+
+      def read_node
+        tag = read_u8
+        return nil if tag == 0
+        return read_string(@encoding).to_sym if tag == SerializationSchema::SYMBOL_TAG
+
+        entry = SerializationSchema::SCHEMA[tag] or raise "Unknown node tag: #{tag}"
+
+        case entry[0]
+        when :node then read_struct(entry)
+        when :bool then read_u8 != 0
+        when :integer then read_string(Encoding::UTF_8).to_i
+        when :string_value then read_string(Encoding::UTF_8)
+        when :record_field then [read_node, read_u8 != 0]
+        when :signature then [read_node_list, read_node_list]
+        when :namespace then RBS::Namespace[read_node_list, read_u8 != 0]
+        when :type_name then RBS::TypeName[read_node, read_node]
+        else raise "Unknown schema entry kind: #{entry[0].inspect}"
+        end
+      end
+
+      private
+
+      def read_struct(entry)
+        _, class_name, expose_location, loc_children, fields, resolve_type_params = entry
+
+        location = read_location(loc_children) if expose_location
+
+        kwargs = {} #: Hash[Symbol, untyped]
+        (fields || []).each do |name, reader|
+          kwargs[name] = read_field(reader)
+        end
+
+        RBS::AST::TypeParam.resolve_variables(kwargs[:type_params]) if resolve_type_params
+
+        klass = class_for(class_name)
+        if expose_location
+          klass.new(location: location, **kwargs)
+        else
+          klass.new(**kwargs)
+        end
+      end
+
+      def read_field(reader)
+        case reader
+        when :node then read_node
+        when :node_list then read_node_list
+        when :hash then read_hash
+        when :string then read_string(@encoding)
+        when :bool then read_u8 != 0
+        when :location_range then read_location_value
+        when :location_range_list then read_location_value_list
+        when :attr_ivar_name then read_attr_ivar_name
+        else # [:enum, [value_or_nil, ...]]
+          reader[1][read_u8]
+        end
+      end
+
+      def read_node_list
+        Array.new(read_count) { read_node }
+      end
+
+      def read_hash
+        hash = {} #: Hash[untyped, untyped]
+        read_count.times do
+          key = read_node
+          hash[key] = read_node
+        end
+        hash
+      end
+
+      # A count of nested items. Each item is at least one byte, so a count that
+      # exceeds the bytes remaining signals the cursor has drifted out of sync.
+      def read_count
+        count = read_u32
+        if count > @bytes.bytesize - @pos
+          raise "Corrupt buffer: count #{count} exceeds #{@bytes.bytesize - @pos} remaining bytes at offset #{@pos}"
+        end
+        count
+      end
+
+      # The base location of a node, followed by its named child ranges.
+      def read_location(loc_children)
+        base = read_range
+        children = (loc_children || []).map { |name, required| [name, required, read_range] }
+
+        return nil unless base
+
+        location = RBS::Location.new(@buffer, base[0], base[1])
+        children.each do |name, required, range|
+          if required
+            location.add_required_child(name, range[0]...range[1]) if range
+          else
+            location.add_optional_child(name, range ? (range[0]...range[1]) : nil)
+          end
+        end
+        location
+      end
+
+      # A standalone location range field: nil or an RBS::Location without children.
+      def read_location_value
+        range = read_range
+        range && RBS::Location.new(@buffer, range[0], range[1])
+      end
+
+      def read_location_value_list
+        Array.new(read_count) { read_location_value }
+      end
+
+      def read_attr_ivar_name
+        case read_u8
+        when 0 then nil # inferred instance variable
+        when 1 then false # no instance variable
+        else read_string(@encoding).to_sym
+        end
+      end
+
+      # Reads a presence byte and, when present, the start/end character positions.
+      def read_range
+        return nil if read_u8 == 0
+
+        start_char = read_i32
+        end_char = read_i32
+        [start_char, end_char]
+      end
+
+      def read_u8
+        byte = @bytes.getbyte(@pos) or raise "Unexpected end of buffer"
+        @pos += 1
+        byte
+      end
+
+      def read_u32
+        value = @bytes.unpack1("L<", offset: @pos) #: Integer
+        @pos += 4
+        value
+      end
+
+      def read_i32
+        value = @bytes.unpack1("l<", offset: @pos) #: Integer
+        @pos += 4
+        value
+      end
+
+      def read_string(encoding)
+        length = read_u32
+        string = @bytes.byteslice(@pos, length) or raise "Unexpected end of buffer"
+        @pos += length
+        string.force_encoding(encoding)
+      end
+
+      def class_for(name)
+        @class_cache[name] ||= Object.const_get(name)
+      end
+    end
+  end
+end
diff --git a/lib/rbs/wasm/serialization_schema.rb b/lib/rbs/wasm/serialization_schema.rb
new file mode 100644
index 000000000..018be8562
--- /dev/null
+++ b/lib/rbs/wasm/serialization_schema.rb
@@ -0,0 +1,110 @@
+# frozen_string_literal: true
+#
+# This file is generated by the templates/template.rb script and should not be
+# modified manually. To change the template see
+# templates/lib/rbs/wasm/serialization_schema.rb.erb
+
+module RBS
+  module WASM
+    # Describes how to decode the binary buffer produced by `rbs_serialize_node`
+    # (src/serialize.c) back into RBS::AST objects. RBS::WASM::Deserializer walks
+    # this table; the matching encoder is generated from the same config.yml, so
+    # the two stay in sync.
+    #
+    # SCHEMA is indexed by node tag: tag 0 is NULL and SYMBOL_TAG is the
+    # interned-symbol tag. Each remaining entry is one of:
+    #
+    #   [:node, class_name, expose_location, loc_children, fields, resolve_type_params]
+    #   [:bool] / [:integer] / [:string_value] / [:record_field]
+    #   [:signature] / [:namespace] / [:type_name]
+    #
+    # where loc_children is [[name, required?], ...] and fields is
+    # [[name, reader], ...] with reader one of :node, :node_list, :hash, :string,
+    # :bool, :location_range, :location_range_list, :attr_ivar_name, or
+    # [:enum, [value_or_nil, ...]].
+    module SerializationSchema
+      SYMBOL_TAG = 78
+
+      SCHEMA = [
+        nil, # tag 0 is reserved for NULL
+        [:node, "RBS::AST::Annotation", true, nil, [[:string, :string]], false],
+        [:bool],
+        [:node, "RBS::AST::Comment", true, nil, [[:string, :string]], false],
+        [:node, "RBS::AST::Declarations::Class", true, [[:keyword, true], [:name, true], [:end, true], [:type_params, false], [:lt, false]], [[:name, :node], [:type_params, :node_list], [:super_class, :node], [:members, :node_list], [:annotations, :node_list], [:comment, :node]], true],
+        [:node, "RBS::AST::Declarations::Class::Super", true, [[:name, true], [:args, false]], [[:name, :node], [:args, :node_list]], false],
+        [:node, "RBS::AST::Declarations::ClassAlias", true, [[:keyword, true], [:new_name, true], [:eq, true], [:old_name, true]], [[:new_name, :node], [:old_name, :node], [:comment, :node], [:annotations, :node_list]], false],
+        [:node, "RBS::AST::Declarations::Constant", true, [[:name, true], [:colon, true]], [[:name, :node], [:type, :node], [:comment, :node], [:annotations, :node_list]], false],
+        [:node, "RBS::AST::Declarations::Global", true, [[:name, true], [:colon, true]], [[:name, :node], [:type, :node], [:comment, :node], [:annotations, :node_list]], false],
+        [:node, "RBS::AST::Declarations::Interface", true, [[:keyword, true], [:name, true], [:end, true], [:type_params, false]], [[:name, :node], [:type_params, :node_list], [:members, :node_list], [:annotations, :node_list], [:comment, :node]], true],
+        [:node, "RBS::AST::Declarations::Module", true, [[:keyword, true], [:name, true], [:end, true], [:type_params, false], [:colon, false], [:self_types, false]], [[:name, :node], [:type_params, :node_list], [:self_types, :node_list], [:members, :node_list], [:annotations, :node_list], [:comment, :node]], true],
+        [:node, "RBS::AST::Declarations::Module::Self", true, [[:name, true], [:args, false]], [[:name, :node], [:args, :node_list]], false],
+        [:node, "RBS::AST::Declarations::ModuleAlias", true, [[:keyword, true], [:new_name, true], [:eq, true], [:old_name, true]], [[:new_name, :node], [:old_name, :node], [:comment, :node], [:annotations, :node_list]], false],
+        [:node, "RBS::AST::Declarations::TypeAlias", true, [[:keyword, true], [:name, true], [:eq, true], [:type_params, false]], [[:name, :node], [:type_params, :node_list], [:type, :node], [:annotations, :node_list], [:comment, :node]], true],
+        [:node, "RBS::AST::Directives::Use", true, [[:keyword, true]], [[:clauses, :node_list]], false],
+        [:node, "RBS::AST::Directives::Use::SingleClause", true, [[:type_name, true], [:keyword, false], [:new_name, false]], [[:type_name, :node], [:new_name, :node]], false],
+        [:node, "RBS::AST::Directives::Use::WildcardClause", true, [[:namespace, true], [:star, true]], [[:namespace, :node]], false],
+        [:integer],
+        [:node, "RBS::AST::Members::Alias", true, [[:keyword, true], [:new_name, true], [:old_name, true], [:new_kind, false], [:old_kind, false]], [[:new_name, :node], [:old_name, :node], [:kind, [:enum, [:instance, :singleton]]], [:annotations, :node_list], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::AttrAccessor", true, [[:keyword, true], [:name, true], [:colon, true], [:kind, false], [:ivar, false], [:ivar_name, false], [:visibility, false]], [[:name, :node], [:type, :node], [:ivar_name, :attr_ivar_name], [:kind, [:enum, [:instance, :singleton]]], [:annotations, :node_list], [:comment, :node], [:visibility, [:enum, [nil, :public, :private]]]], false],
+        [:node, "RBS::AST::Members::AttrReader", true, [[:keyword, true], [:name, true], [:colon, true], [:kind, false], [:ivar, false], [:ivar_name, false], [:visibility, false]], [[:name, :node], [:type, :node], [:ivar_name, :attr_ivar_name], [:kind, [:enum, [:instance, :singleton]]], [:annotations, :node_list], [:comment, :node], [:visibility, [:enum, [nil, :public, :private]]]], false],
+        [:node, "RBS::AST::Members::AttrWriter", true, [[:keyword, true], [:name, true], [:colon, true], [:kind, false], [:ivar, false], [:ivar_name, false], [:visibility, false]], [[:name, :node], [:type, :node], [:ivar_name, :attr_ivar_name], [:kind, [:enum, [:instance, :singleton]]], [:annotations, :node_list], [:comment, :node], [:visibility, [:enum, [nil, :public, :private]]]], false],
+        [:node, "RBS::AST::Members::ClassInstanceVariable", true, [[:name, true], [:colon, true], [:kind, false]], [[:name, :node], [:type, :node], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::ClassVariable", true, [[:name, true], [:colon, true], [:kind, false]], [[:name, :node], [:type, :node], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::Extend", true, [[:name, true], [:keyword, true], [:args, false]], [[:name, :node], [:args, :node_list], [:annotations, :node_list], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::Include", true, [[:name, true], [:keyword, true], [:args, false]], [[:name, :node], [:args, :node_list], [:annotations, :node_list], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::InstanceVariable", true, [[:name, true], [:colon, true], [:kind, false]], [[:name, :node], [:type, :node], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::MethodDefinition", true, [[:keyword, true], [:name, true], [:kind, false], [:overloading, false], [:visibility, false]], [[:name, :node], [:kind, [:enum, [:instance, :singleton, :singleton_instance]]], [:overloads, :node_list], [:annotations, :node_list], [:comment, :node], [:overloading, :bool], [:visibility, [:enum, [nil, :public, :private]]]], false],
+        [:node, "RBS::AST::Members::MethodDefinition::Overload", false, nil, [[:annotations, :node_list], [:method_type, :node]], false],
+        [:node, "RBS::AST::Members::Prepend", true, [[:name, true], [:keyword, true], [:args, false]], [[:name, :node], [:args, :node_list], [:annotations, :node_list], [:comment, :node]], false],
+        [:node, "RBS::AST::Members::Private", true, nil, nil, false],
+        [:node, "RBS::AST::Members::Public", true, nil, nil, false],
+        [:node, "RBS::AST::Ruby::Annotations::BlockParamTypeAnnotation", true, nil, [[:prefix_location, :location_range], [:ampersand_location, :location_range], [:name_location, :location_range], [:colon_location, :location_range], [:question_location, :location_range], [:type_location, :location_range], [:type, :node], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::ClassAliasAnnotation", true, nil, [[:prefix_location, :location_range], [:keyword_location, :location_range], [:type_name, :node], [:type_name_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::ColonMethodTypeAnnotation", true, nil, [[:prefix_location, :location_range], [:annotations, :node_list], [:method_type, :node]], false],
+        [:node, "RBS::AST::Ruby::Annotations::DoubleSplatParamTypeAnnotation", true, nil, [[:prefix_location, :location_range], [:star2_location, :location_range], [:name_location, :location_range], [:colon_location, :location_range], [:param_type, :node], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::InstanceVariableAnnotation", true, nil, [[:prefix_location, :location_range], [:ivar_name, :node], [:ivar_name_location, :location_range], [:colon_location, :location_range], [:type, :node], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::MethodTypesAnnotation", true, nil, [[:prefix_location, :location_range], [:overloads, :node_list], [:vertical_bar_locations, :location_range_list], [:dot3_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::ModuleAliasAnnotation", true, nil, [[:prefix_location, :location_range], [:keyword_location, :location_range], [:type_name, :node], [:type_name_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::ModuleSelfAnnotation", true, nil, [[:prefix_location, :location_range], [:keyword_location, :location_range], [:colon_location, :location_range], [:name, :node], [:args, :node_list], [:open_bracket_location, :location_range], [:close_bracket_location, :location_range], [:args_comma_locations, :location_range_list], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::NodeTypeAssertion", true, nil, [[:prefix_location, :location_range], [:type, :node]], false],
+        [:node, "RBS::AST::Ruby::Annotations::ParamTypeAnnotation", true, nil, [[:prefix_location, :location_range], [:name_location, :location_range], [:colon_location, :location_range], [:param_type, :node], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::ReturnTypeAnnotation", true, nil, [[:prefix_location, :location_range], [:return_location, :location_range], [:colon_location, :location_range], [:return_type, :node], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::SkipAnnotation", true, nil, [[:prefix_location, :location_range], [:skip_location, :location_range], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::SplatParamTypeAnnotation", true, nil, [[:prefix_location, :location_range], [:star_location, :location_range], [:name_location, :location_range], [:colon_location, :location_range], [:param_type, :node], [:comment_location, :location_range]], false],
+        [:node, "RBS::AST::Ruby::Annotations::TypeApplicationAnnotation", true, nil, [[:prefix_location, :location_range], [:type_args, :node_list], [:close_bracket_location, :location_range], [:comma_locations, :location_range_list]], false],
+        [:string_value],
+        [:node, "RBS::AST::TypeParam", true, [[:name, true], [:variance, false], [:unchecked, false], [:upper_bound, false], [:lower_bound, false], [:default, false]], [[:name, :node], [:variance, [:enum, [:invariant, :covariant, :contravariant]]], [:upper_bound, :node], [:lower_bound, :node], [:default_type, :node], [:unchecked, :bool]], false],
+        [:node, "RBS::MethodType", true, [[:type, true], [:type_params, false]], [[:type_params, :node_list], [:type, :node], [:block, :node]], true],
+        [:namespace],
+        [:signature],
+        [:type_name],
+        [:node, "RBS::Types::Alias", true, [[:name, true], [:args, false]], [[:name, :node], [:args, :node_list]], false],
+        [:node, "RBS::Types::Bases::Any", true, nil, [[:todo, :bool]], false],
+        [:node, "RBS::Types::Bases::Bool", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Bottom", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Class", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Instance", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Nil", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Self", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Top", true, nil, nil, false],
+        [:node, "RBS::Types::Bases::Void", true, nil, nil, false],
+        [:node, "RBS::Types::Block", true, nil, [[:type, :node], [:required, :bool], [:self_type, :node]], false],
+        [:node, "RBS::Types::ClassInstance", true, [[:name, true], [:args, false]], [[:name, :node], [:args, :node_list]], false],
+        [:node, "RBS::Types::ClassSingleton", true, [[:name, true], [:args, false]], [[:name, :node], [:args, :node_list]], false],
+        [:node, "RBS::Types::Function", false, nil, [[:required_positionals, :node_list], [:optional_positionals, :node_list], [:rest_positionals, :node], [:trailing_positionals, :node_list], [:required_keywords, :hash], [:optional_keywords, :hash], [:rest_keywords, :node], [:return_type, :node]], false],
+        [:node, "RBS::Types::Function::Param", true, [[:name, false]], [[:type, :node], [:name, :node]], false],
+        [:node, "RBS::Types::Interface", true, [[:name, true], [:args, false]], [[:name, :node], [:args, :node_list]], false],
+        [:node, "RBS::Types::Intersection", true, nil, [[:types, :node_list]], false],
+        [:node, "RBS::Types::Literal", true, nil, [[:literal, :node]], false],
+        [:node, "RBS::Types::Optional", true, nil, [[:type, :node]], false],
+        [:node, "RBS::Types::Proc", true, nil, [[:type, :node], [:block, :node], [:self_type, :node]], false],
+        [:node, "RBS::Types::Record", true, nil, [[:all_fields, :hash]], false],
+        [:record_field],
+        [:node, "RBS::Types::Tuple", true, nil, [[:types, :node_list]], false],
+        [:node, "RBS::Types::Union", true, nil, [[:types, :node_list]], false],
+        [:node, "RBS::Types::UntypedFunction", false, nil, [[:return_type, :node]], false],
+        [:node, "RBS::Types::Variable", true, nil, [[:name, :node]], false],
+      ].freeze
+    end
+  end
+end
diff --git a/sig/wasm/deserializer.rbs b/sig/wasm/deserializer.rbs
new file mode 100644
index 000000000..4e04e43c2
--- /dev/null
+++ b/sig/wasm/deserializer.rbs
@@ -0,0 +1,58 @@
+module RBS
+  module WASM
+    # Rebuilds RBS::AST objects from the binary buffer produced by
+    # `rbs_serialize_node` (src/serialize.c), driven by SerializationSchema.
+    class Deserializer
+      @bytes: String
+
+      @buffer: Buffer
+
+      @encoding: Encoding
+
+      @pos: Integer
+
+      @class_cache: Hash[String, untyped]
+
+      # Deserialize a buffer produced for a whole signature, returning
+      # `[directives, declarations]` to match RBS::Parser._parse_signature.
+      def self.deserialize: (String bytes, Buffer buffer) -> untyped
+
+      def initialize: (String bytes, Buffer buffer) -> void
+
+      # Reads the next node and returns the reconstructed Ruby value.
+ def read_node: () -> untyped + + private + + def read_struct: (Array[untyped] entry) -> untyped + + def read_field: (untyped reader) -> untyped + + def read_node_list: () -> Array[untyped] + + def read_hash: () -> Hash[untyped, untyped] + + def read_count: () -> Integer + + def read_location: (Array[untyped]? loc_children) -> Location? + + def read_location_value: () -> Location? + + def read_location_value_list: () -> Array[Location?] + + def read_attr_ivar_name: () -> (Symbol | false | nil) + + def read_range: () -> [ Integer, Integer ]? + + def read_u8: () -> Integer + + def read_u32: () -> Integer + + def read_i32: () -> Integer + + def read_string: (Encoding encoding) -> String + + def class_for: (String name) -> untyped + end + end +end diff --git a/sig/wasm/serialization_schema.rbs b/sig/wasm/serialization_schema.rbs new file mode 100644 index 000000000..81d1525ea --- /dev/null +++ b/sig/wasm/serialization_schema.rbs @@ -0,0 +1,13 @@ +module RBS + module WASM + # Generated from config.yml. See `templates/lib/rbs/wasm/serialization_schema.rb.erb`. + module SerializationSchema + # The node tag used for interned symbols (`rbs_ast_symbol`). + SYMBOL_TAG: Integer + + # Indexed by node tag. Each entry describes how to decode one node; see the + # generated file for the exact shape. + SCHEMA: Array[Array[untyped]?] + end + end +end diff --git a/src/serialize.c b/src/serialize.c new file mode 100644 index 000000000..77162f074 --- /dev/null +++ b/src/serialize.c @@ -0,0 +1,946 @@ +/*----------------------------------------------------------------------------*/ +/* This file is generated by the templates/template.rb script and should not */ +/* be modified manually. 
*/ +/* To change the template see */ +/* templates/src/serialize.c.erb */ +/*----------------------------------------------------------------------------*/ + +#include "rbs/serialize.h" + +#include "rbs/location.h" +#include "rbs/util/rbs_assert.h" +#include "rbs/util/rbs_buffer.h" + +#include + +/** + * State threaded through the recursive serializer: the arena the output buffer + * grows in, the constant pool used to resolve interned ids, and the buffer + * itself. + */ +typedef struct { + rbs_allocator_t *allocator; + rbs_constant_pool_t *constant_pool; + rbs_buffer_t buffer; +} rbs_serialize_state; + +/* All multi-byte integers are written little-endian. */ + +static void w_bytes(rbs_serialize_state *state, const char *value, size_t length) { + if (length > 0) { + rbs_buffer_append_string(state->allocator, &state->buffer, value, length); + } +} + +static void w_u8(rbs_serialize_state *state, uint8_t value) { + w_bytes(state, (const char *) &value, 1); +} + +static void w_u32(rbs_serialize_state *state, uint32_t value) { + unsigned char bytes[4] = { + (unsigned char) (value & 0xff), + (unsigned char) ((value >> 8) & 0xff), + (unsigned char) ((value >> 16) & 0xff), + (unsigned char) ((value >> 24) & 0xff), + }; + w_bytes(state, (const char *) bytes, 4); +} + +static void w_i32(rbs_serialize_state *state, int32_t value) { + w_u32(state, (uint32_t) value); +} + +static void w_string(rbs_serialize_state *state, rbs_string_t string) { + size_t length = rbs_string_len(string); + w_u32(state, (uint32_t) length); + w_bytes(state, string.start, length); +} + +static void w_constant(rbs_serialize_state *state, rbs_constant_id_t id) { + rbs_constant_t *constant = rbs_constant_pool_id_to_constant(state->constant_pool, id); + RBS_ASSERT(constant != NULL, "constant is NULL"); + w_u32(state, (uint32_t) constant->length); + w_bytes(state, (const char *) constant->start, constant->length); +} + +// A location range is encoded as a presence byte, followed by the start/end +// 
character positions when present. A null range encodes as a single 0 byte +// (it becomes `nil` on the Ruby side). +static void w_loc_range(rbs_serialize_state *state, rbs_location_range range) { + if (RBS_LOCATION_NULL_RANGE_P(range)) { + w_u8(state, 0); + } else { + w_u8(state, 1); + w_i32(state, range.start_char); + w_i32(state, range.end_char); + } +} + +static void w_loc_range_list(rbs_serialize_state *state, rbs_location_range_list_t *list) { + if (list == NULL) { + w_u32(state, 0); + return; + } + + w_u32(state, (uint32_t) list->length); + for (rbs_location_range_list_node_t *n = list->head; n != NULL; n = n->next) { + w_loc_range(state, n->range); + } +} + +static void w_attr_ivar_name(rbs_serialize_state *state, rbs_attr_ivar_name_t ivar_name) { + w_u8(state, (uint8_t) ivar_name.tag); + if (ivar_name.tag == RBS_ATTR_IVAR_NAME_TAG_NAME) { + w_constant(state, ivar_name.name); + } +} + +static void serialize_node(rbs_serialize_state *state, rbs_node_t *instance); + +static void w_node_list(rbs_serialize_state *state, rbs_node_list_t *list) { + if (list == NULL) { + w_u32(state, 0); + return; + } + + w_u32(state, (uint32_t) list->length); + for (rbs_node_list_node_t *n = list->head; n != NULL; n = n->next) { + serialize_node(state, n->node); + } +} + +static void w_hash(rbs_serialize_state *state, rbs_hash_t *hash) { + if (hash == NULL) { + w_u32(state, 0); + return; + } + + // rbs_hash_t does not maintain its `length` field (unlike rbs_node_list_t), + // so count the entries by walking the list. + uint32_t count = 0; + for (rbs_hash_node_t *n = hash->head; n != NULL; n = n->next) { + count++; + } + w_u32(state, count); + + for (rbs_hash_node_t *n = hash->head; n != NULL; n = n->next) { + serialize_node(state, n->key); + serialize_node(state, n->value); + } +} + +// The node tag written ahead of every node. 
Tag 0 is reserved for NULL, tags +// 1..N are the nodes below (in the same order the Ruby schema is generated), +// and the final tag is `rbs_ast_symbol`, which is not a config.yml node. +#define RBS_SERIALIZE_TAG_SYMBOL 78 + +static void serialize_node(rbs_serialize_state *state, rbs_node_t *instance) { + if (instance == NULL) { + w_u8(state, 0); + return; + } + + switch (instance->type) { + case RBS_AST_ANNOTATION: { + w_u8(state, 1); + rbs_ast_annotation_t *node = (rbs_ast_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_string(state, node->string); + return; + } + case RBS_AST_BOOL: { + w_u8(state, 2); + w_u8(state, ((rbs_ast_bool_t *) instance)->value ? 1 : 0); + return; + } + case RBS_AST_COMMENT: { + w_u8(state, 3); + rbs_ast_comment_t *node = (rbs_ast_comment_t *) instance; + w_loc_range(state, node->base.location); + w_string(state, node->string); + return; + } + case RBS_AST_DECLARATIONS_CLASS: { + w_u8(state, 4); + rbs_ast_declarations_class_t *node = (rbs_ast_declarations_class_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->end_range); + w_loc_range(state, node->type_params_range); + w_loc_range(state, node->lt_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->type_params); + serialize_node(state, (rbs_node_t *) node->super_class); + w_node_list(state, node->members); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_DECLARATIONS_CLASS_SUPER: { + w_u8(state, 5); + rbs_ast_declarations_class_super_t *node = (rbs_ast_declarations_class_super_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + return; + } + case 
RBS_AST_DECLARATIONS_CLASS_ALIAS: { + w_u8(state, 6); + rbs_ast_declarations_class_alias_t *node = (rbs_ast_declarations_class_alias_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->new_name_range); + w_loc_range(state, node->eq_range); + w_loc_range(state, node->old_name_range); + serialize_node(state, (rbs_node_t *) node->new_name); + serialize_node(state, (rbs_node_t *) node->old_name); + serialize_node(state, (rbs_node_t *) node->comment); + w_node_list(state, node->annotations); + return; + } + case RBS_AST_DECLARATIONS_CONSTANT: { + w_u8(state, 7); + rbs_ast_declarations_constant_t *node = (rbs_ast_declarations_constant_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->comment); + w_node_list(state, node->annotations); + return; + } + case RBS_AST_DECLARATIONS_GLOBAL: { + w_u8(state, 8); + rbs_ast_declarations_global_t *node = (rbs_ast_declarations_global_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->comment); + w_node_list(state, node->annotations); + return; + } + case RBS_AST_DECLARATIONS_INTERFACE: { + w_u8(state, 9); + rbs_ast_declarations_interface_t *node = (rbs_ast_declarations_interface_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->end_range); + w_loc_range(state, node->type_params_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->type_params); + 
w_node_list(state, node->members); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_DECLARATIONS_MODULE: { + w_u8(state, 10); + rbs_ast_declarations_module_t *node = (rbs_ast_declarations_module_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->end_range); + w_loc_range(state, node->type_params_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->self_types_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->type_params); + w_node_list(state, node->self_types); + w_node_list(state, node->members); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_DECLARATIONS_MODULE_SELF: { + w_u8(state, 11); + rbs_ast_declarations_module_self_t *node = (rbs_ast_declarations_module_self_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + return; + } + case RBS_AST_DECLARATIONS_MODULE_ALIAS: { + w_u8(state, 12); + rbs_ast_declarations_module_alias_t *node = (rbs_ast_declarations_module_alias_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->new_name_range); + w_loc_range(state, node->eq_range); + w_loc_range(state, node->old_name_range); + serialize_node(state, (rbs_node_t *) node->new_name); + serialize_node(state, (rbs_node_t *) node->old_name); + serialize_node(state, (rbs_node_t *) node->comment); + w_node_list(state, node->annotations); + return; + } + case RBS_AST_DECLARATIONS_TYPE_ALIAS: { + w_u8(state, 13); + rbs_ast_declarations_type_alias_t *node = (rbs_ast_declarations_type_alias_t *) instance; + 
w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->eq_range); + w_loc_range(state, node->type_params_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->type_params); + serialize_node(state, (rbs_node_t *) node->type); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_DIRECTIVES_USE: { + w_u8(state, 14); + rbs_ast_directives_use_t *node = (rbs_ast_directives_use_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_node_list(state, node->clauses); + return; + } + case RBS_AST_DIRECTIVES_USE_SINGLE_CLAUSE: { + w_u8(state, 15); + rbs_ast_directives_use_single_clause_t *node = (rbs_ast_directives_use_single_clause_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->type_name_range); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->new_name_range); + serialize_node(state, (rbs_node_t *) node->type_name); + serialize_node(state, (rbs_node_t *) node->new_name); + return; + } + case RBS_AST_DIRECTIVES_USE_WILDCARD_CLAUSE: { + w_u8(state, 16); + rbs_ast_directives_use_wildcard_clause_t *node = (rbs_ast_directives_use_wildcard_clause_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->namespace_range); + w_loc_range(state, node->star_range); + serialize_node(state, (rbs_node_t *) node->rbs_namespace); + return; + } + case RBS_AST_INTEGER: { + w_u8(state, 17); + w_string(state, ((rbs_ast_integer_t *) instance)->string_representation); + return; + } + case RBS_AST_MEMBERS_ALIAS: { + w_u8(state, 18); + rbs_ast_members_alias_t *node = (rbs_ast_members_alias_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->new_name_range); + w_loc_range(state, 
node->old_name_range); + w_loc_range(state, node->new_kind_range); + w_loc_range(state, node->old_kind_range); + serialize_node(state, (rbs_node_t *) node->new_name); + serialize_node(state, (rbs_node_t *) node->old_name); + w_u8(state, (uint8_t) node->kind); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_ATTR_ACCESSOR: { + w_u8(state, 19); + rbs_ast_members_attr_accessor_t *node = (rbs_ast_members_attr_accessor_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->kind_range); + w_loc_range(state, node->ivar_range); + w_loc_range(state, node->ivar_name_range); + w_loc_range(state, node->visibility_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + w_attr_ivar_name(state, node->ivar_name); + w_u8(state, (uint8_t) node->kind); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + w_u8(state, (uint8_t) node->visibility); + return; + } + case RBS_AST_MEMBERS_ATTR_READER: { + w_u8(state, 20); + rbs_ast_members_attr_reader_t *node = (rbs_ast_members_attr_reader_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->kind_range); + w_loc_range(state, node->ivar_range); + w_loc_range(state, node->ivar_name_range); + w_loc_range(state, node->visibility_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + w_attr_ivar_name(state, node->ivar_name); + w_u8(state, (uint8_t) node->kind); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + w_u8(state, (uint8_t) node->visibility); + 
return; + } + case RBS_AST_MEMBERS_ATTR_WRITER: { + w_u8(state, 21); + rbs_ast_members_attr_writer_t *node = (rbs_ast_members_attr_writer_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->kind_range); + w_loc_range(state, node->ivar_range); + w_loc_range(state, node->ivar_name_range); + w_loc_range(state, node->visibility_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + w_attr_ivar_name(state, node->ivar_name); + w_u8(state, (uint8_t) node->kind); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + w_u8(state, (uint8_t) node->visibility); + return; + } + case RBS_AST_MEMBERS_CLASS_INSTANCE_VARIABLE: { + w_u8(state, 22); + rbs_ast_members_class_instance_variable_t *node = (rbs_ast_members_class_instance_variable_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->kind_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_CLASS_VARIABLE: { + w_u8(state, 23); + rbs_ast_members_class_variable_t *node = (rbs_ast_members_class_variable_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->kind_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_EXTEND: { + w_u8(state, 24); + rbs_ast_members_extend_t *node = (rbs_ast_members_extend_t *) instance; + w_loc_range(state, node->base.location); 
+ w_loc_range(state, node->name_range); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_INCLUDE: { + w_u8(state, 25); + rbs_ast_members_include_t *node = (rbs_ast_members_include_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_INSTANCE_VARIABLE: { + w_u8(state, 26); + rbs_ast_members_instance_variable_t *node = (rbs_ast_members_instance_variable_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->colon_range); + w_loc_range(state, node->kind_range); + serialize_node(state, (rbs_node_t *) node->name); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_METHOD_DEFINITION: { + w_u8(state, 27); + rbs_ast_members_method_definition_t *node = (rbs_ast_members_method_definition_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->name_range); + w_loc_range(state, node->kind_range); + w_loc_range(state, node->overloading_range); + w_loc_range(state, node->visibility_range); + serialize_node(state, (rbs_node_t *) node->name); + w_u8(state, (uint8_t) node->kind); + w_node_list(state, node->overloads); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + w_u8(state, node->overloading ? 
1 : 0); + w_u8(state, (uint8_t) node->visibility); + return; + } + case RBS_AST_MEMBERS_METHOD_DEFINITION_OVERLOAD: { + w_u8(state, 28); + rbs_ast_members_method_definition_overload_t *node = (rbs_ast_members_method_definition_overload_t *) instance; + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->method_type); + return; + } + case RBS_AST_MEMBERS_PREPEND: { + w_u8(state, 29); + rbs_ast_members_prepend_t *node = (rbs_ast_members_prepend_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->keyword_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->comment); + return; + } + case RBS_AST_MEMBERS_PRIVATE: { + w_u8(state, 30); + rbs_ast_members_private_t *node = (rbs_ast_members_private_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_AST_MEMBERS_PUBLIC: { + w_u8(state, 31); + rbs_ast_members_public_t *node = (rbs_ast_members_public_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_BLOCK_PARAM_TYPE_ANNOTATION: { + w_u8(state, 32); + rbs_ast_ruby_annotations_block_param_type_annotation_t *node = (rbs_ast_ruby_annotations_block_param_type_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->ampersand_location); + w_loc_range(state, node->name_location); + w_loc_range(state, node->colon_location); + w_loc_range(state, node->question_location); + w_loc_range(state, node->type_location); + serialize_node(state, (rbs_node_t *) node->type_); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_CLASS_ALIAS_ANNOTATION: { + w_u8(state, 33); + rbs_ast_ruby_annotations_class_alias_annotation_t 
*node = (rbs_ast_ruby_annotations_class_alias_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->keyword_location); + serialize_node(state, (rbs_node_t *) node->type_name); + w_loc_range(state, node->type_name_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_COLON_METHOD_TYPE_ANNOTATION: { + w_u8(state, 34); + rbs_ast_ruby_annotations_colon_method_type_annotation_t *node = (rbs_ast_ruby_annotations_colon_method_type_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_node_list(state, node->annotations); + serialize_node(state, (rbs_node_t *) node->method_type); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_DOUBLE_SPLAT_PARAM_TYPE_ANNOTATION: { + w_u8(state, 35); + rbs_ast_ruby_annotations_double_splat_param_type_annotation_t *node = (rbs_ast_ruby_annotations_double_splat_param_type_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->star2_location); + w_loc_range(state, node->name_location); + w_loc_range(state, node->colon_location); + serialize_node(state, (rbs_node_t *) node->param_type); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_INSTANCE_VARIABLE_ANNOTATION: { + w_u8(state, 36); + rbs_ast_ruby_annotations_instance_variable_annotation_t *node = (rbs_ast_ruby_annotations_instance_variable_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + serialize_node(state, (rbs_node_t *) node->ivar_name); + w_loc_range(state, node->ivar_name_location); + w_loc_range(state, node->colon_location); + serialize_node(state, (rbs_node_t *) node->type); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_METHOD_TYPES_ANNOTATION: { + w_u8(state, 37); + 
rbs_ast_ruby_annotations_method_types_annotation_t *node = (rbs_ast_ruby_annotations_method_types_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_node_list(state, node->overloads); + w_loc_range_list(state, node->vertical_bar_locations); + w_loc_range(state, node->dot3_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_MODULE_ALIAS_ANNOTATION: { + w_u8(state, 38); + rbs_ast_ruby_annotations_module_alias_annotation_t *node = (rbs_ast_ruby_annotations_module_alias_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->keyword_location); + serialize_node(state, (rbs_node_t *) node->type_name); + w_loc_range(state, node->type_name_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_MODULE_SELF_ANNOTATION: { + w_u8(state, 39); + rbs_ast_ruby_annotations_module_self_annotation_t *node = (rbs_ast_ruby_annotations_module_self_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->keyword_location); + w_loc_range(state, node->colon_location); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + w_loc_range(state, node->open_bracket_location); + w_loc_range(state, node->close_bracket_location); + w_loc_range_list(state, node->args_comma_locations); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_NODE_TYPE_ASSERTION: { + w_u8(state, 40); + rbs_ast_ruby_annotations_node_type_assertion_t *node = (rbs_ast_ruby_annotations_node_type_assertion_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + serialize_node(state, (rbs_node_t *) node->type); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_PARAM_TYPE_ANNOTATION: { + w_u8(state, 41); + rbs_ast_ruby_annotations_param_type_annotation_t *node = 
(rbs_ast_ruby_annotations_param_type_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->name_location); + w_loc_range(state, node->colon_location); + serialize_node(state, (rbs_node_t *) node->param_type); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_RETURN_TYPE_ANNOTATION: { + w_u8(state, 42); + rbs_ast_ruby_annotations_return_type_annotation_t *node = (rbs_ast_ruby_annotations_return_type_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->return_location); + w_loc_range(state, node->colon_location); + serialize_node(state, (rbs_node_t *) node->return_type); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_SKIP_ANNOTATION: { + w_u8(state, 43); + rbs_ast_ruby_annotations_skip_annotation_t *node = (rbs_ast_ruby_annotations_skip_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->skip_location); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_SPLAT_PARAM_TYPE_ANNOTATION: { + w_u8(state, 44); + rbs_ast_ruby_annotations_splat_param_type_annotation_t *node = (rbs_ast_ruby_annotations_splat_param_type_annotation_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_loc_range(state, node->star_location); + w_loc_range(state, node->name_location); + w_loc_range(state, node->colon_location); + serialize_node(state, (rbs_node_t *) node->param_type); + w_loc_range(state, node->comment_location); + return; + } + case RBS_AST_RUBY_ANNOTATIONS_TYPE_APPLICATION_ANNOTATION: { + w_u8(state, 45); + rbs_ast_ruby_annotations_type_application_annotation_t *node = (rbs_ast_ruby_annotations_type_application_annotation_t *) instance; + 
w_loc_range(state, node->base.location); + w_loc_range(state, node->prefix_location); + w_node_list(state, node->type_args); + w_loc_range(state, node->close_bracket_location); + w_loc_range_list(state, node->comma_locations); + return; + } + case RBS_AST_STRING: { + w_u8(state, 46); + w_string(state, ((rbs_ast_string_t *) instance)->string); + return; + } + case RBS_AST_TYPE_PARAM: { + w_u8(state, 47); + rbs_ast_type_param_t *node = (rbs_ast_type_param_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->variance_range); + w_loc_range(state, node->unchecked_range); + w_loc_range(state, node->upper_bound_range); + w_loc_range(state, node->lower_bound_range); + w_loc_range(state, node->default_range); + serialize_node(state, (rbs_node_t *) node->name); + w_u8(state, (uint8_t) node->variance); + serialize_node(state, (rbs_node_t *) node->upper_bound); + serialize_node(state, (rbs_node_t *) node->lower_bound); + serialize_node(state, (rbs_node_t *) node->default_type); + w_u8(state, node->unchecked ? 1 : 0); + return; + } + case RBS_METHOD_TYPE: { + w_u8(state, 48); + rbs_method_type_t *node = (rbs_method_type_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->type_range); + w_loc_range(state, node->type_params_range); + w_node_list(state, node->type_params); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->block); + return; + } + case RBS_NAMESPACE: { + w_u8(state, 49); + rbs_namespace_t *node = (rbs_namespace_t *) instance; + w_node_list(state, node->path); + w_u8(state, node->absolute ? 
1 : 0); + return; + } + case RBS_SIGNATURE: { + w_u8(state, 50); + rbs_signature_t *node = (rbs_signature_t *) instance; + w_node_list(state, node->directives); + w_node_list(state, node->declarations); + return; + } + case RBS_TYPE_NAME: { + w_u8(state, 51); + rbs_type_name_t *node = (rbs_type_name_t *) instance; + serialize_node(state, (rbs_node_t *) node->rbs_namespace); + serialize_node(state, (rbs_node_t *) node->name); + return; + } + case RBS_TYPES_ALIAS: { + w_u8(state, 52); + rbs_types_alias_t *node = (rbs_types_alias_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + return; + } + case RBS_TYPES_BASES_ANY: { + w_u8(state, 53); + rbs_types_bases_any_t *node = (rbs_types_bases_any_t *) instance; + w_loc_range(state, node->base.location); + w_u8(state, node->todo ? 1 : 0); + return; + } + case RBS_TYPES_BASES_BOOL: { + w_u8(state, 54); + rbs_types_bases_bool_t *node = (rbs_types_bases_bool_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_BOTTOM: { + w_u8(state, 55); + rbs_types_bases_bottom_t *node = (rbs_types_bases_bottom_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_CLASS: { + w_u8(state, 56); + rbs_types_bases_class_t *node = (rbs_types_bases_class_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_INSTANCE: { + w_u8(state, 57); + rbs_types_bases_instance_t *node = (rbs_types_bases_instance_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_NIL: { + w_u8(state, 58); + rbs_types_bases_nil_t *node = (rbs_types_bases_nil_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_SELF: { + w_u8(state, 59); + rbs_types_bases_self_t *node = (rbs_types_bases_self_t *) 
instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_TOP: { + w_u8(state, 60); + rbs_types_bases_top_t *node = (rbs_types_bases_top_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BASES_VOID: { + w_u8(state, 61); + rbs_types_bases_void_t *node = (rbs_types_bases_void_t *) instance; + w_loc_range(state, node->base.location); + return; + } + case RBS_TYPES_BLOCK: { + w_u8(state, 62); + rbs_types_block_t *node = (rbs_types_block_t *) instance; + w_loc_range(state, node->base.location); + serialize_node(state, (rbs_node_t *) node->type); + w_u8(state, node->required ? 1 : 0); + serialize_node(state, (rbs_node_t *) node->self_type); + return; + } + case RBS_TYPES_CLASS_INSTANCE: { + w_u8(state, 63); + rbs_types_class_instance_t *node = (rbs_types_class_instance_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + return; + } + case RBS_TYPES_CLASS_SINGLETON: { + w_u8(state, 64); + rbs_types_class_singleton_t *node = (rbs_types_class_singleton_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + return; + } + case RBS_TYPES_FUNCTION: { + w_u8(state, 65); + rbs_types_function_t *node = (rbs_types_function_t *) instance; + w_node_list(state, node->required_positionals); + w_node_list(state, node->optional_positionals); + serialize_node(state, (rbs_node_t *) node->rest_positionals); + w_node_list(state, node->trailing_positionals); + w_hash(state, node->required_keywords); + w_hash(state, node->optional_keywords); + serialize_node(state, (rbs_node_t *) node->rest_keywords); + serialize_node(state, (rbs_node_t *) node->return_type); + return; + } + case 
RBS_TYPES_FUNCTION_PARAM: { + w_u8(state, 66); + rbs_types_function_param_t *node = (rbs_types_function_param_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->name); + return; + } + case RBS_TYPES_INTERFACE: { + w_u8(state, 67); + rbs_types_interface_t *node = (rbs_types_interface_t *) instance; + w_loc_range(state, node->base.location); + w_loc_range(state, node->name_range); + w_loc_range(state, node->args_range); + serialize_node(state, (rbs_node_t *) node->name); + w_node_list(state, node->args); + return; + } + case RBS_TYPES_INTERSECTION: { + w_u8(state, 68); + rbs_types_intersection_t *node = (rbs_types_intersection_t *) instance; + w_loc_range(state, node->base.location); + w_node_list(state, node->types); + return; + } + case RBS_TYPES_LITERAL: { + w_u8(state, 69); + rbs_types_literal_t *node = (rbs_types_literal_t *) instance; + w_loc_range(state, node->base.location); + serialize_node(state, (rbs_node_t *) node->literal); + return; + } + case RBS_TYPES_OPTIONAL: { + w_u8(state, 70); + rbs_types_optional_t *node = (rbs_types_optional_t *) instance; + w_loc_range(state, node->base.location); + serialize_node(state, (rbs_node_t *) node->type); + return; + } + case RBS_TYPES_PROC: { + w_u8(state, 71); + rbs_types_proc_t *node = (rbs_types_proc_t *) instance; + w_loc_range(state, node->base.location); + serialize_node(state, (rbs_node_t *) node->type); + serialize_node(state, (rbs_node_t *) node->block); + serialize_node(state, (rbs_node_t *) node->self_type); + return; + } + case RBS_TYPES_RECORD: { + w_u8(state, 72); + rbs_types_record_t *node = (rbs_types_record_t *) instance; + w_loc_range(state, node->base.location); + w_hash(state, node->all_fields); + return; + } + case RBS_TYPES_RECORD_FIELD_TYPE: { + w_u8(state, 73); + rbs_types_record_field_type_t *node = (rbs_types_record_field_type_t *) instance; + 
serialize_node(state, node->type); + w_u8(state, node->required ? 1 : 0); + return; + } + case RBS_TYPES_TUPLE: { + w_u8(state, 74); + rbs_types_tuple_t *node = (rbs_types_tuple_t *) instance; + w_loc_range(state, node->base.location); + w_node_list(state, node->types); + return; + } + case RBS_TYPES_UNION: { + w_u8(state, 75); + rbs_types_union_t *node = (rbs_types_union_t *) instance; + w_loc_range(state, node->base.location); + w_node_list(state, node->types); + return; + } + case RBS_TYPES_UNTYPED_FUNCTION: { + w_u8(state, 76); + rbs_types_untyped_function_t *node = (rbs_types_untyped_function_t *) instance; + serialize_node(state, (rbs_node_t *) node->return_type); + return; + } + case RBS_TYPES_VARIABLE: { + w_u8(state, 77); + rbs_types_variable_t *node = (rbs_types_variable_t *) instance; + w_loc_range(state, node->base.location); + serialize_node(state, (rbs_node_t *) node->name); + return; + } + case RBS_AST_SYMBOL: { + w_u8(state, RBS_SERIALIZE_TAG_SYMBOL); + w_constant(state, ((rbs_ast_symbol_t *) instance)->constant_id); + return; + } + } + + RBS_ASSERT(false, "rbs_serialize_node: unknown node type: %d", instance->type); +} + +rbs_string_t rbs_serialize_node(rbs_allocator_t *allocator, rbs_constant_pool_t *constant_pool, rbs_node_t *node) { + rbs_serialize_state state = { + .allocator = allocator, + .constant_pool = constant_pool, + }; + rbs_buffer_init(allocator, &state.buffer); + + serialize_node(&state, node); + + return rbs_buffer_to_string(&state.buffer); +} diff --git a/templates/include/rbs/serialize.h.erb b/templates/include/rbs/serialize.h.erb new file mode 100644 index 000000000..c9794f62e --- /dev/null +++ b/templates/include/rbs/serialize.h.erb @@ -0,0 +1,26 @@ +#ifndef RBS__SERIALIZE_H +#define RBS__SERIALIZE_H + +#include "rbs/ast.h" +#include "rbs/string.h" +#include "rbs/util/rbs_allocator.h" +#include "rbs/util/rbs_constant_pool.h" + +/** + * Serialize a parsed AST node into a compact, portable binary buffer. 
+ * + * The format is consumed by RBS::WASM::Deserializer on the Ruby side, which + * rebuilds the same `RBS::AST` objects that the C extension would have built + * directly. This is what lets RBS run on Ruby implementations that cannot load + * the C extension (notably JRuby): the parser runs inside WebAssembly, produces + * this buffer, and the host reconstructs the tree in pure Ruby. + * + * The buffer is allocated from `allocator`, so its lifetime is tied to that + * allocator. `constant_pool` must be the pool the node was parsed with; it is + * used to resolve interned symbol/identifier ids back into their bytes. + * + * See `docs/wasm_serialization.md` for the wire format. + */ +rbs_string_t rbs_serialize_node(rbs_allocator_t *allocator, rbs_constant_pool_t *constant_pool, rbs_node_t *node); + +#endif diff --git a/templates/lib/rbs/wasm/serialization_schema.rb.erb b/templates/lib/rbs/wasm/serialization_schema.rb.erb new file mode 100644 index 000000000..df96adcd7 --- /dev/null +++ b/templates/lib/rbs/wasm/serialization_schema.rb.erb @@ -0,0 +1,82 @@ +<%- + # Node types that need `RBS::AST::TypeParam.resolve_variables` applied to + # their `type_params` before construction (matches ast_translation.c.erb). + resolve_type_params = %w[ + RBS::AST::Declarations::Class + RBS::AST::Declarations::Module + RBS::AST::Declarations::Interface + RBS::AST::Declarations::TypeAlias + RBS::MethodType + ] + + # How the deserializer should read a single field, derived from its c_type. 
+ reader_for = ->(field) { + case field.type.name + when "rbs_node_list" then ":node_list" + when "rbs_hash" then ":hash" + when "rbs_string" then ":string" + when "bool" then ":bool" + when "rbs_location_range" then ":location_range" + when "rbs_location_range_list" then ":location_range_list" + when "rbs_attr_ivar_name" then ":attr_ivar_name" + else + if field.type.is_a?(RBS::Template::EnumType) + symbols = [] + field.type.descr.each_symbol { |_sym, _const, ruby_value| symbols << (ruby_value.nil? ? "nil" : ":#{ruby_value}") } + "[:enum, [#{symbols.join(", ")}]]" + else + ":node" + end + end + } + + entry_for = ->(node) { + case node.ruby_full_name + when "RBS::AST::Bool" then "[:bool]" + when "RBS::AST::Integer" then "[:integer]" + when "RBS::AST::String" then "[:string_value]" + when "RBS::Types::Record::FieldType" then "[:record_field]" + when "RBS::Signature" then "[:signature]" + when "RBS::Namespace" then "[:namespace]" + when "RBS::TypeName" then "[:type_name]" + else + children = (node.locations || []).map { |loc| "[:#{loc.name}, #{loc.required?}]" } + fields = node.fields.map { |field| "[:#{field.name}, #{reader_for.call(field)}]" } + # Emit `nil` rather than `[]` for empty lists, so the generated file has no + # unannotated empty collections for Steep to flag. + children_literal = children.empty? ? "nil" : "[#{children.join(", ")}]" + fields_literal = fields.empty? ? "nil" : "[#{fields.join(", ")}]" + "[:node, #{node.ruby_full_name.inspect}, #{node.expose_location?}, #{children_literal}, #{fields_literal}, #{resolve_type_params.include?(node.ruby_full_name)}]" + end + } +-%> +module RBS + module WASM + # Describes how to decode the binary buffer produced by `rbs_serialize_node` + # (src/serialize.c) back into RBS::AST objects. RBS::WASM::Deserializer walks + # this table; the matching encoder is generated from the same config.yml, so + # the two stay in sync. 
+ # + # SCHEMA is indexed by node tag: tag 0 is NULL and SYMBOL_TAG is the + # interned-symbol tag. Each remaining entry is one of: + # + # [:node, class_name, expose_location, loc_children, fields, resolve_type_params] + # [:bool] / [:integer] / [:string_value] / [:record_field] + # [:signature] / [:namespace] / [:type_name] + # + # where loc_children is [[name, required?], ...] and fields is + # [[name, reader], ...] with reader one of :node, :node_list, :hash, :string, + # :bool, :location_range, :location_range_list, :attr_ivar_name, or + # [:enum, [value_or_nil, ...]]. + module SerializationSchema + SYMBOL_TAG = <%= nodes.size + 1 %> + + SCHEMA = [ + nil, # tag 0 is reserved for NULL + <%- nodes.each do |node| -%> + <%= entry_for.call(node) %>, + <%- end -%> + ].freeze + end + end +end diff --git a/templates/src/serialize.c.erb b/templates/src/serialize.c.erb new file mode 100644 index 000000000..b0c4aceaf --- /dev/null +++ b/templates/src/serialize.c.erb @@ -0,0 +1,221 @@ +#include "rbs/serialize.h" + +#include "rbs/location.h" +#include "rbs/util/rbs_assert.h" +#include "rbs/util/rbs_buffer.h" + +#include + +/** + * State threaded through the recursive serializer: the arena the output buffer + * grows in, the constant pool used to resolve interned ids, and the buffer + * itself. + */ +typedef struct { + rbs_allocator_t *allocator; + rbs_constant_pool_t *constant_pool; + rbs_buffer_t buffer; +} rbs_serialize_state; + +/* All multi-byte integers are written little-endian. 
*/ + +static void w_bytes(rbs_serialize_state *state, const char *value, size_t length) { + if (length > 0) { + rbs_buffer_append_string(state->allocator, &state->buffer, value, length); + } +} + +static void w_u8(rbs_serialize_state *state, uint8_t value) { + w_bytes(state, (const char *) &value, 1); +} + +static void w_u32(rbs_serialize_state *state, uint32_t value) { + unsigned char bytes[4] = { + (unsigned char) (value & 0xff), + (unsigned char) ((value >> 8) & 0xff), + (unsigned char) ((value >> 16) & 0xff), + (unsigned char) ((value >> 24) & 0xff), + }; + w_bytes(state, (const char *) bytes, 4); +} + +static void w_i32(rbs_serialize_state *state, int32_t value) { + w_u32(state, (uint32_t) value); +} + +static void w_string(rbs_serialize_state *state, rbs_string_t string) { + size_t length = rbs_string_len(string); + w_u32(state, (uint32_t) length); + w_bytes(state, string.start, length); +} + +static void w_constant(rbs_serialize_state *state, rbs_constant_id_t id) { + rbs_constant_t *constant = rbs_constant_pool_id_to_constant(state->constant_pool, id); + RBS_ASSERT(constant != NULL, "constant is NULL"); + w_u32(state, (uint32_t) constant->length); + w_bytes(state, (const char *) constant->start, constant->length); +} + +// A location range is encoded as a presence byte, followed by the start/end +// character positions when present. A null range encodes as a single 0 byte +// (it becomes `nil` on the Ruby side). 
+static void w_loc_range(rbs_serialize_state *state, rbs_location_range range) { + if (RBS_LOCATION_NULL_RANGE_P(range)) { + w_u8(state, 0); + } else { + w_u8(state, 1); + w_i32(state, range.start_char); + w_i32(state, range.end_char); + } +} + +static void w_loc_range_list(rbs_serialize_state *state, rbs_location_range_list_t *list) { + if (list == NULL) { + w_u32(state, 0); + return; + } + + w_u32(state, (uint32_t) list->length); + for (rbs_location_range_list_node_t *n = list->head; n != NULL; n = n->next) { + w_loc_range(state, n->range); + } +} + +static void w_attr_ivar_name(rbs_serialize_state *state, rbs_attr_ivar_name_t ivar_name) { + w_u8(state, (uint8_t) ivar_name.tag); + if (ivar_name.tag == RBS_ATTR_IVAR_NAME_TAG_NAME) { + w_constant(state, ivar_name.name); + } +} + +static void serialize_node(rbs_serialize_state *state, rbs_node_t *instance); + +static void w_node_list(rbs_serialize_state *state, rbs_node_list_t *list) { + if (list == NULL) { + w_u32(state, 0); + return; + } + + w_u32(state, (uint32_t) list->length); + for (rbs_node_list_node_t *n = list->head; n != NULL; n = n->next) { + serialize_node(state, n->node); + } +} + +static void w_hash(rbs_serialize_state *state, rbs_hash_t *hash) { + if (hash == NULL) { + w_u32(state, 0); + return; + } + + // rbs_hash_t does not maintain its `length` field (unlike rbs_node_list_t), + // so count the entries by walking the list. + uint32_t count = 0; + for (rbs_hash_node_t *n = hash->head; n != NULL; n = n->next) { + count++; + } + w_u32(state, count); + + for (rbs_hash_node_t *n = hash->head; n != NULL; n = n->next) { + serialize_node(state, n->key); + serialize_node(state, n->value); + } +} + +// The node tag written ahead of every node. Tag 0 is reserved for NULL, tags +// 1..N are the nodes below (in the same order the Ruby schema is generated), +// and the final tag is `rbs_ast_symbol`, which is not a config.yml node. 
+#define RBS_SERIALIZE_TAG_SYMBOL <%= nodes.size + 1 %> + +static void serialize_node(rbs_serialize_state *state, rbs_node_t *instance) { + if (instance == NULL) { + w_u8(state, 0); + return; + } + + switch (instance->type) { + <%- nodes.each_with_index do |node, index| -%> + case <%= node.c_node_enum_name %>: { + w_u8(state, <%= index + 1 %>); + <%- case node.ruby_full_name -%> + <%- when "RBS::AST::Bool" -%> + w_u8(state, ((rbs_ast_bool_t *) instance)->value ? 1 : 0); + <%- when "RBS::AST::Integer" -%> + w_string(state, ((rbs_ast_integer_t *) instance)->string_representation); + <%- when "RBS::AST::String" -%> + w_string(state, ((rbs_ast_string_t *) instance)->string); + <%- when "RBS::Types::Record::FieldType" -%> + rbs_types_record_field_type_t *node = (rbs_types_record_field_type_t *) instance; + serialize_node(state, node->type); + w_u8(state, node->required ? 1 : 0); + <%- when "RBS::Signature" -%> + rbs_signature_t *node = (rbs_signature_t *) instance; + w_node_list(state, node->directives); + w_node_list(state, node->declarations); + <%- when "RBS::Namespace" -%> + rbs_namespace_t *node = (rbs_namespace_t *) instance; + w_node_list(state, node->path); + w_u8(state, node->absolute ? 1 : 0); + <%- when "RBS::TypeName" -%> + rbs_type_name_t *node = (rbs_type_name_t *) instance; + serialize_node(state, (rbs_node_t *) node->rbs_namespace); + serialize_node(state, (rbs_node_t *) node->name); + <%- else -%> + <%= node.c_type_name %> *node = (<%= node.c_type_name %> *) instance; + <%- if node.expose_location? 
-%> + w_loc_range(state, node->base.location); + <%- if node.locations -%> + <%- node.locations.each do |location_field| -%> + w_loc_range(state, node-><%= location_field.attribute_name %>); + <%- end -%> + <%- end -%> + <%- end -%> + <%- node.fields.each do |field| -%> + <%- case field.type.name -%> + <%- when "rbs_node_list" -%> + w_node_list(state, node-><%= field.c_name %>); + <%- when "rbs_hash" -%> + w_hash(state, node-><%= field.c_name %>); + <%- when "rbs_string" -%> + w_string(state, node-><%= field.c_name %>); + <%- when "bool" -%> + w_u8(state, node-><%= field.c_name %> ? 1 : 0); + <%- when "rbs_location_range" -%> + w_loc_range(state, node-><%= field.c_name %>); + <%- when "rbs_location_range_list" -%> + w_loc_range_list(state, node-><%= field.c_name %>); + <%- when "rbs_attr_ivar_name" -%> + w_attr_ivar_name(state, node-><%= field.c_name %>); + <%- else -%> + <%- if field.type.is_a?(RBS::Template::EnumType) -%> + w_u8(state, (uint8_t) node-><%= field.c_name %>); + <%- else -%> + serialize_node(state, (rbs_node_t *) node-><%= field.c_name %>); + <%- end -%> + <%- end -%> + <%- end -%> + <%- end -%> + return; + } + <%- end -%> + case RBS_AST_SYMBOL: { + w_u8(state, RBS_SERIALIZE_TAG_SYMBOL); + w_constant(state, ((rbs_ast_symbol_t *) instance)->constant_id); + return; + } + } + + RBS_ASSERT(false, "rbs_serialize_node: unknown node type: %d", instance->type); +} + +rbs_string_t rbs_serialize_node(rbs_allocator_t *allocator, rbs_constant_pool_t *constant_pool, rbs_node_t *node) { + rbs_serialize_state state = { + .allocator = allocator, + .constant_pool = constant_pool, + }; + rbs_buffer_init(allocator, &state.buffer); + + serialize_node(&state, node); + + return rbs_buffer_to_string(&state.buffer); +} diff --git a/templates/template.rb b/templates/template.rb index b581ddbd7..c97052872 100644 --- a/templates/template.rb +++ b/templates/template.rb @@ -281,14 +281,25 @@ def render(out_file) erb = read_template(template) extension = 
File.extname(filepath.gsub(".erb", "")) - heading = <<~HEADING - /*----------------------------------------------------------------------------*/ - /* This file is generated by the templates/template.rb script and should not */ - /* be modified manually. */ - /* To change the template see */ - /* #{filepath + " " * (74 - filepath.size) } */ - /*----------------------------------------------------------------------------*/ - HEADING + heading = + if extension == ".rb" + <<~HEADING + # frozen_string_literal: true + # + # This file is generated by the templates/template.rb script and should not be + # modified manually. To change the template see + # #{filepath} + HEADING + else + <<~HEADING + /*----------------------------------------------------------------------------*/ + /* This file is generated by the templates/template.rb script and should not */ + /* be modified manually. */ + /* To change the template see */ + /* #{filepath + " " * (74 - filepath.size) } */ + /*----------------------------------------------------------------------------*/ + HEADING + end write_to = File.expand_path("../#{out_file}", __dir__) contents = heading + "\n" + erb.result_with_hash(locals) diff --git a/test/rbs/wasm/serialization_test.rb b/test/rbs/wasm/serialization_test.rb new file mode 100644 index 000000000..ac2ab10ae --- /dev/null +++ b/test/rbs/wasm/serialization_test.rb @@ -0,0 +1,181 @@ +# frozen_string_literal: true + +require "test_helper" +require "rbs/wasm/deserializer" + +# Verifies that the binary serialization produced by `rbs_serialize_node` +# (src/serialize.c) round-trips back into exactly the same AST objects that the +# C extension builds directly via ast_translation.c. +# +# The parser is run twice over the same buffer: once through the normal +# C -> Ruby translation, and once through serialize -> deserialize. The two +# results must be deeply identical, down to locations and string encodings. 
This +# is what gives us confidence that the same format, produced inside WebAssembly, +# will rebuild correct objects on JRuby. +class RBS::WASM::SerializationTest < Test::Unit::TestCase + ROOT = File.expand_path("../../..", __dir__) + + def buffer(source) + RBS::Buffer.new(content: source, name: "test.rbs") + end + + def assert_round_trips(buf) + directives, decls = RBS::Parser._parse_signature(buf, 0, buf.content.bytesize) + bytes = RBS::Parser._parse_signature_to_bytes(buf, 0, buf.content.bytesize) + actual = RBS::WASM::Deserializer.deserialize(bytes, buf) + + diff = ast_diff([directives, decls], actual) + assert_nil diff, "round-trip mismatch in #{buf.name}: #{diff}" + end + + def test_signature_round_trip_for_bundled_rbs + paths = Dir.glob(File.join(ROOT, "{core,stdlib,sig}/**/*.rbs")).sort + assert_operator paths.size, :>, 0, "expected to find bundled RBS files" + + paths.each do |path| + source = File.read(path) + assert_round_trips(RBS::Buffer.new(content: source, name: path)) + end + end + + def test_signature_round_trip_for_features + sources = [ + "class Foo end", + "class Foo < Bar end", + "class Foo[A, out B, unchecked in C < Comparable[A]] end", + "module M : Comparable, _Each[Integer] end", + <<~RBS, + # A documented class. + class Account + @balance: Integer + self.@registry: Hash[Symbol, Account] + @@count: Integer + + attr_reader name: String + attr_accessor age (@years): Integer + attr_writer secret (): String + + public + def deposit: (Integer amount) -> void + | (Float) -> void + private + def self.find: (Symbol) -> Account? + alias credit deposit + include Comparable + extend ClassMethods + prepend Logging + end + RBS + "type t[T] = [T, t[T]?] 
| { value: T, ?next: t[T] } | ^(T) { () -> void } -> bool", + "type lit = 1 | -2 | :sym | \"str\" | true | false | nil", + "interface _Each[A] def each: () { (A) -> void } -> void end", + "$global: Integer\nCONST: String\nFoo::Bar: bool", + "class A = B\nmodule M = N", + "use Foo::Bar, Baz::*, Qux::Quux as Q\nclass A end", + "# resolve-type-names: false\nclass A end", + ] + + sources.each { |source| assert_round_trips(buffer(source)) } + end + + def test_type_round_trip + types = [ + "Integer", "::Foo::Bar::Baz", "Array[Integer]", "Integer | String | nil", + "(Integer & Comparable)", "Integer?", "[Integer, String, bool]", + "{ name: String, ?age: Integer }", "^(Integer, ?String) { () -> void } -> bool", + "^() [self: Foo] -> void", "singleton(String)", "self", "instance", "class", + "void", "untyped", "bool", "top", "bot", "nil", + "1", "-42", ":symbol", '"string"', "true", "false", + "123456789012345678901234567890", "Hash[Symbol, Array[Integer]]", "_Each[String]", + ] + + types.each do |source| + buf = buffer(source) + expected = RBS::Parser._parse_type(buf, 0, source.bytesize, nil, true, true, true, true) + bytes = RBS::Parser._parse_type_to_bytes(buf, 0, source.bytesize, nil, true, true, true, true) + actual = RBS::WASM::Deserializer.deserialize(bytes, buf) + + assert_nil ast_diff(expected, actual), "type round-trip mismatch for #{source.inspect}" + end + end + + def test_method_type_round_trip + method_types = [ + "() -> void", + "(Integer) -> String", + "[T] (T) -> T", + "(Integer, ?String, *Symbol, foo: bool, ?bar: Integer, **untyped) -> void", + "() { (Integer) -> void } -> bool", + "() ?{ () -> void } -> void", + "[A, B < Comparable[A]] (A) -> B", + ] + + method_types.each do |source| + buf = buffer(source) + expected = RBS::Parser._parse_method_type(buf, 0, source.bytesize, nil, true) + bytes = RBS::Parser._parse_method_type_to_bytes(buf, 0, source.bytesize, nil, true) + actual = RBS::WASM::Deserializer.deserialize(bytes, buf) + + assert_nil 
ast_diff(expected, actual), "method type round-trip mismatch for #{source.inspect}" + end + end + + private + + # Returns nil when the two trees are deeply identical, or a String describing + # the first difference found. This is stricter than RBS object `==` (which + # ignores locations and comments) and also checks string encodings, so it + # catches anything the serialization could get subtly wrong. + def ast_diff(a, b, path = "") + return nil if a.equal?(b) + + case a + when nil, true, false, Symbol, Integer, Float + a == b ? nil : "#{path}: #{a.inspect} != #{b.inspect}" + when String + if a == b && a.encoding == b.encoding + nil + else + "#{path}: #{a.inspect} (#{a.encoding}) != #{b.inspect} (#{b.encoding})" + end + when Array + return "#{path}: expected Array, got #{b.class}" unless b.is_a?(Array) + return "#{path}: size #{a.size} != #{b.size}" unless a.size == b.size + + a.each_index do |i| + diff = ast_diff(a[i], b[i], "#{path}[#{i}]") + return diff if diff + end + nil + when Hash + return "#{path}: expected Hash, got #{b.class}" unless b.is_a?(Hash) + return "#{path}: size #{a.size} != #{b.size}" unless a.size == b.size + + a.each do |key, value| + return "#{path}: missing key #{key.inspect}" unless b.key?(key) + + diff = ast_diff(value, b[key], "#{path}{#{key.inspect}}") + return diff if diff + end + nil + when RBS::Location + return "#{path}: expected Location, got #{b.class}" unless b.is_a?(RBS::Location) + a == b ? nil : "#{path}: location #{a} != #{b}" + when RBS::Buffer + a.name == b.name ? 
nil : "#{path}: buffer #{a.name.inspect} != #{b.name.inspect}" + else + return "#{path}: class #{a.class} != #{b.class}" unless a.class == b.class + + a_ivars = a.instance_variables.sort + unless a_ivars == b.instance_variables.sort + return "#{path}: ivars #{a_ivars} != #{b.instance_variables.sort}" + end + + a_ivars.each do |ivar| + diff = ast_diff(a.instance_variable_get(ivar), b.instance_variable_get(ivar), "#{path}.#{ivar}") + return diff if diff + end + nil + end + end +end
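
Reviewer note: the primitive encodings written by the `w_*` helpers in templates/src/serialize.c.erb (one-byte tags, little-endian u32/i32, length-prefixed strings, presence-byte location ranges) can be read back in Ruby roughly as sketched below. This is a minimal illustration only. `WireReader` and its method names are hypothetical and are not the actual RBS::WASM::Deserializer, whose code is not shown in this part of the patch.

```ruby
# Hypothetical reader for the primitive encodings emitted by the w_* helpers.
# Not the real RBS::WASM::Deserializer; all names here are illustrative.
class WireReader
  def initialize(bytes)
    @bytes = bytes.b # work on a binary copy of the buffer
    @pos = 0
  end

  # Counterpart of w_u8: one unsigned byte.
  def u8
    value = @bytes.getbyte(@pos) or raise EOFError, "unexpected end of buffer"
    @pos += 1
    value
  end

  # Counterpart of w_u32: four bytes, little-endian, unsigned.
  def u32
    take(4).unpack1("V")
  end

  # Counterpart of w_i32: the same four bytes, read as a signed value.
  def i32
    take(4).unpack1("l<")
  end

  # Counterpart of w_string / w_constant: u32 byte length, then the raw bytes.
  def string
    take(u32)
  end

  # Counterpart of w_loc_range: presence byte, then start/end char positions.
  def loc_range
    return nil if u8 == 0
    [i32, i32]
  end

  private

  # Returns the next `length` bytes and advances the cursor.
  def take(length)
    raise EOFError, "unexpected end of buffer" if @pos + length > @bytes.bytesize
    chunk = @bytes.byteslice(@pos, length)
    @pos += length
    chunk
  end
end

# Hand-built buffer: a present range (3, 8), a null range, then the string "Foo".
bytes = [1, 3, 8].pack("Cl<l<") + [0].pack("C") + [3].pack("V") + "Foo"
reader = WireReader.new(bytes)
first_range = reader.loc_range
null_range = reader.loc_range
name = reader.string
```

The sketch returns binary-encoded strings and bare `[start, end]` pairs. A real decoder would also need to restore string encodings and build `RBS::Location` objects, which the round-trip test in this patch checks.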