Merged
2 changes: 2 additions & 0 deletions .agents/docs-and-formatting.md
@@ -31,6 +31,8 @@ Load this file when changing documentation, public APIs, protocol specs, benchma
## Formatting Commands

- Markdown: `prettier --write <file>`
- Do not format Markdown under `tasks/`, including task design, plan, progress, state, history,
and lessons files. These files are agent working state rather than repository documentation.
- Python code, including `compiler/`, `benchmarks/`, `integration_tests/`, and `python/`:
`python -m ruff format <changed-python-files>` and
`python -m ruff check --fix <changed-python-files>`
3 changes: 3 additions & 0 deletions .agents/languages/java.md
@@ -22,6 +22,9 @@ Load this file when changing anything under `java/` or when Java drives a cross-
- Do not add normal-JVM process-global caches keyed by user classes, generated classes, serializer classes, classloaders, or class-bound method handles. Prefer per-runtime state, immutable shared metadata, or build-time-only template data.
- Concrete serializers may opt into sharing only after auditing retained fields. Treat serializers retaining `TypeResolver`, `RefResolver`, mutable scratch buffers, runtime state, or classloader-sensitive state as non-shareable unless that state is externalized.
- Resolver and serializer hot paths should keep the fast-path/null-slow-path shape obvious. Hoist repeated buffer or cache-state access into locals for multi-step operations and keep rebuild/restoration logic cold.
- In Java codec hot paths, avoid `Preconditions.checkArgument` for attacker-controlled primitive
validation. Use direct primitive branches and throw on the cold error path to preserve inlining and
avoid varargs/helper overhead.
- Keep GraalVM feature code as a thin metadata/registration layer. Build time should publish metadata needed for runtime reconstruction, not retain concrete generated or user serializer instances in the image heap.
- If changes touch GraalVM bootstrap, serializer retention, native-image metadata, or `ObjectStreamSerializer` GraalVM behavior, verify the native-image build and run the produced binary; a plain Java compile is insufficient.
- Put latest-JDK or virtual-thread tests in the latest-JDK test modules with the matching compiler/profile floor, and centralize runtime-version probing in existing compatibility utilities.
1 change: 1 addition & 0 deletions .gitignore
@@ -133,6 +133,7 @@ test.md

benchmarks/dart/profile_output
**/*.fory.dart
integration_tests/idl_tests/dart/.dart_tool/
**/pubspec.lock

**/tmp/*
1 change: 1 addition & 0 deletions AGENTS.md
@@ -104,6 +104,7 @@ This is the entry point for AI guidance in Apache Fory. Read this file first, th
## Shared Validation Expectations

- Run the relevant tests for every touched language or subsystem before finishing.
- When multiple independent language test suites are required, run them concurrently when the environment has enough resources instead of running them one by one; keep each language's logs and results separate, and rerun any failed suite with focused diagnostics.
- Run applicable test commands in a subagent with a thinking budget one level lower than the main task budget, using medium when the current budget is unclear, unless the change is docs-only or the user explicitly asks to run them locally.
- Reuse the same test subagent for repeated runs within one task and subsystem so it keeps failure context; create a fresh subagent when switching unrelated subsystems or when prior context may be stale or misleading.
- Use `integration_tests/` for cross-language compatibility validation when behavior crosses runtimes.
1 change: 1 addition & 0 deletions BUILD
@@ -52,6 +52,7 @@ pyx_library(
"//cpp/fory/type:fory_type",
"//python/pyfory/cpp:_pyfory",
"//cpp/fory/thirdparty:flat_hash_map",
"//cpp/fory/thirdparty:libmmh3",
],
)

6 changes: 3 additions & 3 deletions README.md
@@ -118,7 +118,7 @@ For more detailed benchmarks and methodology, see [Go Benchmark](benchmarks/go).
<img src="docs/benchmarks/python/throughput.png" width="95%">
</p>

For more detailed benchmarks and methodology, see [Pythonk](benchmarks/python).
For more detailed benchmarks and methodology, see [Python](benchmarks/python).

### JavaScript/NodeJS Serialization Performance

@@ -744,15 +744,15 @@ Apache Fory™ supports class schema forward/backward compatibility across **Jav

### Binary Compatibility

**Current Status**: Binary compatibility is **not guaranteed** between Fory major releases as the protocol continues to evolve. However, compatibility **is guaranteed** between minor versions (e.g., 0.13.x).
**Current Status**: Binary compatibility is **not guaranteed** between Fory major releases as the protocol continues to evolve. Compatibility **is guaranteed** between minor versions (for example, 0.13.x).

**Recommendations**:

- Version your serialized data by Fory major version
- Plan migration strategies when upgrading major versions
- See [upgrade guide](docs/guide/java) for details

**Future**: Binary compatibility will be guaranteed starting from Fory 1.0 release.
Major-version compatibility is the boundary for stable serialized data.

## Security

1 change: 1 addition & 0 deletions cpp/fory/serialization/BUILD
@@ -51,6 +51,7 @@ cc_test(
srcs = ["serialization_test.cc"],
deps = [
":fory_serialization",
"//cpp/fory/thirdparty:libmmh3",
"@googletest//:gtest",
"@googletest//:gtest_main",
],
40 changes: 10 additions & 30 deletions cpp/fory/serialization/context.cc
@@ -500,48 +500,28 @@ Result<const TypeInfo *, Error> ReadContext::read_type_meta() {

// Check if we already parsed this type meta (cache lookup by header)
if (has_last_meta_header_ && meta_header == last_meta_header_) {
// Fast path: same header as last parsed
// Header-cache hits intentionally skip without rehashing. Entries reach
// this cache only after a successful TypeMeta parse and 52-bit body-hash
// validation.
const TypeInfo *cached = last_meta_type_info_;
reading_type_infos_.push_back(cached);
if (cached && !cached->type_def.empty()) {
const size_t type_def_size = cached->type_def.size();
if (type_def_size >= sizeof(int64_t) &&
type_def_size <= std::numeric_limits<uint32_t>::max()) {
Error skip_error;
buffer_->skip(static_cast<uint32_t>(type_def_size - sizeof(int64_t)),
skip_error);
if (FORY_PREDICT_FALSE(!skip_error.ok())) {
return Unexpected(std::move(skip_error));
}
return cached;
}
}
FORY_RETURN_NOT_OK(TypeMeta::skip_bytes(*buffer_, meta_header));
FORY_RETURN_NOT_OK(
TypeMeta::skip_bytes_for_validated_header(*buffer_, meta_header));
return cached;
}

auto *cache_entry = parsed_type_infos_.find(meta_header);
if (cache_entry != nullptr) {
// Found in cache - reuse and skip the bytes
// Header-cache hits intentionally skip without rehashing. Entries reach
// this cache only after a successful TypeMeta parse and 52-bit body-hash
// validation.
const TypeInfo *cached = cache_entry->second;
reading_type_infos_.push_back(cached);
has_last_meta_header_ = true;
last_meta_header_ = meta_header;
last_meta_type_info_ = cached;
if (cached && !cached->type_def.empty()) {
const size_t type_def_size = cached->type_def.size();
if (type_def_size >= sizeof(int64_t) &&
type_def_size <= std::numeric_limits<uint32_t>::max()) {
Error skip_error;
buffer_->skip(static_cast<uint32_t>(type_def_size - sizeof(int64_t)),
skip_error);
if (FORY_PREDICT_FALSE(!skip_error.ok())) {
return Unexpected(std::move(skip_error));
}
return cached;
}
}
FORY_RETURN_NOT_OK(TypeMeta::skip_bytes(*buffer_, meta_header));
FORY_RETURN_NOT_OK(
TypeMeta::skip_bytes_for_validated_header(*buffer_, meta_header));
return cached;
}

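The two cache tiers in this hunk (a single "last header" slot checked before the header-keyed map) can be sketched in isolation. This is a minimal illustration, not the Fory implementation: `TypeInfoSketch`, `MetaHeaderCache`, and `insert_validated` are hypothetical names, `std::unordered_map` stands in for whatever map type the runtime uses, and updating the last-header slot on insert is an assumption. It relies on the same invariant the new comments state: entries are inserted only after a full parse and hash validation, so a hit can return without rehashing.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the runtime's TypeInfo.
struct TypeInfoSketch {
  int id;
};

// Two-tier lookup: a one-entry "last header" slot, then a header-keyed map.
class MetaHeaderCache {
public:
  // Returns the cached entry for `header`, or nullptr on a miss.
  const TypeInfoSketch *find(std::int64_t header) {
    // Fast path: same header as the previous lookup.
    if (has_last_ && header == last_header_) {
      return last_info_;
    }
    auto it = parsed_.find(header);
    if (it == parsed_.end()) {
      return nullptr; // Caller must parse and validate the body.
    }
    remember(header, it->second);
    return it->second;
  }

  // Assumption: callers invoke this only after the meta body has been
  // parsed and its hash validated, so hits never need to rehash.
  void insert_validated(std::int64_t header, const TypeInfoSketch *info) {
    parsed_[header] = info;
    remember(header, info);
  }

private:
  void remember(std::int64_t header, const TypeInfoSketch *info) {
    has_last_ = true;
    last_header_ = header;
    last_info_ = info;
  }

  bool has_last_ = false;
  std::int64_t last_header_ = 0;
  const TypeInfoSketch *last_info_ = nullptr;
  std::unordered_map<std::int64_t, const TypeInfoSketch *> parsed_;
};
```

On a hit the real code still has to advance the buffer past the meta body; the sketch omits that step because it depends on the header's size encoding, which this hunk does not show.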
51 changes: 30 additions & 21 deletions cpp/fory/serialization/fory.h
@@ -629,15 +629,13 @@ class Fory : public BaseFory {
Buffer buffer(const_cast<uint8_t *>(data), static_cast<uint32_t>(size),
false);

FORY_TRY(header, read_header(buffer));
if (header.is_null) {
return Unexpected(Error::invalid_data("Cannot deserialize null object"));
Error header_error;
const uint8_t header = buffer.read_uint8(header_error);
if (FORY_PREDICT_FALSE(!header_error.ok())) {
return Unexpected(std::move(header_error));
}
if (FORY_PREDICT_FALSE(header.is_xlang != config_.xlang)) {
return Unexpected(Error::invalid_data(
"Protocol mismatch: payload xlang=" +
std::string(header.is_xlang ? "true" : "false") +
", local xlang=" + std::string(config_.xlang ? "true" : "false")));
if (FORY_PREDICT_FALSE(header != precomputed_header_)) {
return Unexpected(invalid_root_header(header));
}

read_ctx_->attach(buffer);
@@ -668,19 +666,13 @@ class Fory : public BaseFory {
if (FORY_PREDICT_FALSE(!finalized_)) {
ensure_finalized();
}
auto header_result = read_header(buffer);
if (FORY_PREDICT_FALSE(!header_result.ok())) {
return Unexpected(std::move(header_result).error());
Error header_error;
const uint8_t header = buffer.read_uint8(header_error);
if (FORY_PREDICT_FALSE(!header_error.ok())) {
return Unexpected(std::move(header_error));
}
auto header = std::move(header_result).value();
if (header.is_null) {
return Unexpected(Error::invalid_data("Cannot deserialize null object"));
}
if (FORY_PREDICT_FALSE(header.is_xlang != config_.xlang)) {
return Unexpected(Error::invalid_data(
"Protocol mismatch: payload xlang=" +
std::string(header.is_xlang ? "true" : "false") +
", local xlang=" + std::string(config_.xlang ? "true" : "false")));
if (FORY_PREDICT_FALSE(header != precomputed_header_)) {
return Unexpected(invalid_root_header(header));
}

read_ctx_->attach(buffer);
@@ -775,11 +767,28 @@ class Fory : public BaseFory {
static uint8_t compute_header(bool xlang) {
uint8_t flags = 0;
if (xlang) {
flags |= (1 << 1); // bit 1: xlang flag
flags |= (1 << 0);
}
return flags;
}

FORY_NOINLINE Error invalid_root_header(uint8_t header) const {
constexpr uint8_t xlang_flag = 1 << 0;
constexpr uint8_t oob_flag = 1 << 1;
constexpr uint8_t known_flags = xlang_flag | oob_flag;
if ((header & ~known_flags) != 0) {
return Error::invalid_data("Unsupported root header bitmap");
}
if ((header & oob_flag) != 0) {
return Error::invalid_data("Out-of-band mode is not supported");
}
const bool payload_xlang = (header & xlang_flag) != 0;
return Error::invalid_data(
"Protocol mismatch: payload xlang=" +
std::string(payload_xlang ? "true" : "false") +
", local xlang=" + std::string(config_.xlang ? "true" : "false"));
}

/// Core serialization implementation.
/// TypeMeta is written inline using streaming protocol (no deferred writing).
template <typename T>
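The header check in this file compares the single root byte against a precomputed expected value and only decodes individual bits on the cold error path. A standalone sketch of that shape follows. The bit positions (bit 0 for xlang, bit 1 for out-of-band) and the error strings are taken from the diff; `expected_header`, `describe_mismatch`, and `check_root_header` are hypothetical names, and returning an empty string for "accepted" is a simplification of the `Result`/`Error` types used in the real code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit layout as shown in the diff: bit 0 = xlang, bit 1 = out-of-band.
constexpr std::uint8_t kXlangFlag = 1 << 0;
constexpr std::uint8_t kOobFlag = 1 << 1;
constexpr std::uint8_t kKnownFlags = kXlangFlag | kOobFlag;

// The only header byte a reader with this configuration accepts.
inline std::uint8_t expected_header(bool local_xlang) {
  return local_xlang ? kXlangFlag : std::uint8_t{0};
}

// Cold path: classify why the byte was rejected, most specific cause first.
std::string describe_mismatch(std::uint8_t header, bool local_xlang) {
  assert(header != expected_header(local_xlang));
  if ((header & ~kKnownFlags) != 0) {
    return "Unsupported root header bitmap";
  }
  if ((header & kOobFlag) != 0) {
    return "Out-of-band mode is not supported";
  }
  const bool payload_xlang = (header & kXlangFlag) != 0;
  return std::string("Protocol mismatch: payload xlang=") +
         (payload_xlang ? "true" : "false") +
         ", local xlang=" + (local_xlang ? "true" : "false");
}

// Hot path: one byte comparison. Returns an empty string when accepted.
std::string check_root_header(std::uint8_t header, bool local_xlang) {
  if (header == expected_header(local_xlang)) {
    return std::string();
  }
  return describe_mismatch(header, local_xlang);
}
```

For example, `check_root_header(0x03, true)` reports the out-of-band error, while `check_root_header(0x01, false)` reports the xlang mismatch. Keeping the classification in a separate function mirrors the `FORY_NOINLINE` helper in the diff, so the common case stays a single comparison.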