chore(rivetkit): implement new hibernating ws protocol #3464
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
PR Review: Hibernating WebSocket Protocol Implementation
Overview
This PR implements a new hibernating WebSocket protocol in RivetKit. The changes refactor WebSocket lifecycle management and message acknowledgment handling. Overall the code quality is good with clear separation of concerns, but there are several issues to address before merging.
Critical Issues
1. Incomplete WebSocket Adapter Constructor Call (tunnel.ts:139-193)
The adapter is constructed with only five arguments:
// tunnel.ts:148-186
const adapter = new WebSocketTunnelAdapter(
requestIdStr,
hibernatable,
request,
(data: ArrayBuffer | string, isBinary: boolean) => { /* ... */ },
(code?: number, reason?: string, retry: boolean = false) => { /* ... */ },
);
But the constructor signature (websocket-tunnel-adapter.ts:49-73) requires:
constructor(
tunnel: Tunnel, // MISSING
actorId: string, // MISSING
requestId: string, // PROVIDED
hibernatable: boolean, // PROVIDED
request: Request, // PROVIDED
messageIndex: number, // MISSING
persistHibernatableWebSocketMessageIndex: (messageIndex: number) => void, // MISSING
sendCallback: (data: ArrayBuffer | string, isBinary: boolean) => void, // PROVIDED
closeCallback: (code?: number, reason?: string, retry?: boolean) => void, // PROVIDED
)
This will cause a runtime error. The parameters are misaligned.
2. Loop Variable Declaration Error (tunnel.ts:116)
for (const i = 0; i < requestIds.length; i++) {
Should be (a const binding cannot be reassigned by i++):
for (let i = 0; i < requestIds.length; i++) {
3. Inconsistent Property Access (websocket-tunnel-adapter.ts:97)
The adapter references
4. Missing Request Parameter (tunnel.ts:120-126)
When calling
this.#createWebSocket(
actorId,
requestId,
idToStr(requestId),
true,
messageIndex,
); // Missing 'request' parameter
The method signature requires a request parameter.
High Priority Issues
5. Breaking API Change Without Migration Path (mod.ts:60-66)
The websocket callback signature changed:
// Before
websocket?: (runner, actorId, ws, requestId, request) => Promise<void>;
// After
websocket: (runner, actorId, ws, requestId, request) => void;
This is a breaking change that will affect all existing consumers. Consider:
6. Unused Symbols Imported (tunnel.ts:12-16)
import {
SET_CAN_HIBERNATE_SYMBOL,
SET_REQUEST_SYMBOL,
WebSocketTunnelAdapter,
} from "./websocket-tunnel-adapter";
7. Missing Validation (tunnel.ts:104-134)
The
8. Incomplete TODO Comments (tunnel.ts:192, websocket-tunnel-adapter.ts:517)
Several critical TODOs are left in the code:
These should either be implemented or documented with follow-up issue numbers.
Medium Priority Issues
9. Error Handling in Message Sequence Logic (websocket-tunnel-adapter.ts:103-157)
The message sequencing logic is well-designed, but consider edge cases:
10. Race Condition Potential (tunnel.ts:662-674)
There's a check for duplicate WebSockets, but in a concurrent scenario:
const existingAdapter = this.#actorWebSockets.get(requestIdStr);
if (existingAdapter) {
existingAdapter.__closeWithoutCallback(1000, "ws.duplicate_open");
}
If two open messages arrive simultaneously, both could pass the check before either is added to the map. Consider using a lock or atomic operation.
11. Memory Leak Risk (tunnel.ts:97)
The check closes only non-hibernatable WebSockets:
if (!ws.canHibernate) {
ws.__closeWithoutCallback(1000, "ws.tunnel_shutdown");
}
But they're immediately cleared from the map on line 101. This means the WebSocket objects remain in memory without being properly cleaned up. Should hibernatable WebSockets be persisted to storage here?
Minor Issues / Code Quality
12. Logging Inconsistencies
13. Magic Numbers (websocket-tunnel-adapter.ts:154)
this.close(1008, closeReason);
Consider defining WebSocket close codes as named constants:
const WS_CLOSE_POLICY_VIOLATION = 1008;
14. Type Safety (tunnel.ts:145-146)
Parameters
#createWebSocket(options: {
actorId: string;
requestId: RequestId;
requestIdStr: string;
hibernatable: boolean;
messageIndex: number;
request: Request;
}): WebSocketTunnelAdapter
15. Commented Out Code (actor-driver.ts:81-87)
Large blocks of commented code should be removed:
// // WebSocket message acknowledgment debouncing for hibernatable websockets
// #hibernatableWebSocketAckQueue: Map<...> = new Map();
// #wsAckFlushInterval?: NodeJS.Timeout;
If this code might be needed, reference the commit SHA in a comment instead.
16. OpenAPI Changes Not Reviewed (rivetkit-openapi/openapi.json)
The PR removes 277 lines from the OpenAPI spec. Ensure:
Performance Considerations
17. UUID String Conversion (tunnel.ts:807-809)
18. Missing Index Validation Before Persistence (websocket-tunnel-adapter.ts:160-163)
Before persisting the message index, validate it's in the acceptable range:
if (messageIndex < 0 || messageIndex > 65535) {
throw new Error(`Invalid message index: ${messageIndex}`);
}
this.messageIndex = messageIndex;
Testing Recommendations
Security Considerations
19. Message Index Validation
The message index sequence validation is good, but ensure:
Documentation
20. Missing JSDoc
Key methods lack documentation:
Positive Aspects
✅ Good separation of concerns between tunnel and adapter
Recommendations
Before merging:
After merging:
Summary
The refactoring improves the WebSocket lifecycle management significantly, but critical bugs prevent this from being merged as-is. The constructor parameter mismatch (Issue #1) will cause immediate runtime failures. After addressing the critical issues, this will be a solid improvement to the codebase.
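Issues 1, 4 and 14 in this review all trace back to long positional argument lists, where one missing argument shifts every later one. A minimal sketch of the options-object alternative follows. `AdapterOptions` and `SketchAdapter` are hypothetical names used for illustration, not the PR's classes, and the field list mirrors only part of the signature quoted in Issue 1.

```typescript
// Hypothetical options bag; field names echo the constructor quoted in Issue 1.
interface AdapterOptions {
  actorId: string;
  requestId: string;
  hibernatable: boolean;
  messageIndex: number;
  sendCallback: (data: ArrayBuffer | string, isBinary: boolean) => void;
  closeCallback: (code?: number, reason?: string, retry?: boolean) => void;
}

// Illustrative stand-in for WebSocketTunnelAdapter, not the real class.
class SketchAdapter {
  readonly options: AdapterOptions;

  constructor(options: AdapterOptions) {
    // Named fields cannot shift position, so leaving one out is a type error
    // at the call site rather than a silently misaligned argument.
    this.options = options;
  }

  send(data: ArrayBuffer | string): void {
    this.options.sendCallback(data, typeof data !== "string");
  }
}

const sent: Array<{ data: ArrayBuffer | string; isBinary: boolean }> = [];
const adapter = new SketchAdapter({
  actorId: "actor-1",
  requestId: "req-1",
  hibernatable: true,
  messageIndex: 0,
  sendCallback: (data, isBinary) => {
    sent.push({ data, isBinary });
  },
  closeCallback: () => {},
});
adapter.send("hello");
```

With this shape, omitting `messageIndex` or `closeCallback` is rejected by the type checker at the call site, which is the failure mode the review describes for the positional form.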
f45c96f to
769c3f8
Compare
dbe19ca to
426a752
Compare
Code Review: Hibernating WebSocket Protocol Implementation
This PR implements a new hibernating WebSocket protocol that significantly refactors how WebSocket connections are managed, particularly around hibernation and message acknowledgment. Here's my comprehensive review:
🎯 Overall Assessment
Strengths:
Areas of Concern:
🐛 Critical Issues
1. Compilation Error in tunnel.ts:116-126
for (const i = 0; i < requestIds.length; i++) {
This should be let i = 0, since a const binding cannot be incremented.
Location:
2. Missing Parameters in #createWebSocket
The
// Line 120-126 - Missing 'request' parameter
this.#createWebSocket(
actorId,
requestId,
idToStr(requestId),
true,
messageIndex,
);
Location:
3. API Signature Changes Without Migration
The
Old:
Location:
4. Incomplete WebSocketTunnelAdapter Constructor
The constructor signature in
Location:
5. Deleted Assert Function Not Replaced
In
Location:
| Category | Rating | Notes |
|---|---|---|
| Architecture | ⭐⭐⭐⭐ | Good separation of concerns |
| Code Quality | ⭐⭐⭐ | Some dead code and TODOs |
| Type Safety | ⭐⭐ | Missing parameters, incomplete refactoring |
| Security | ⭐⭐⭐⭐ | Good validation of message indices |
| Performance | ⭐⭐⭐⭐ | Efficient message handling |
| Test Coverage | ⭐⭐ | Missing tests for critical paths |
Overall: Needs Work
This PR has good architectural improvements but requires fixing critical issues before merging:
- Compilation errors
- Missing required callback implementations
- Incomplete API refactoring
- Dead code cleanup
Once these issues are addressed, this will be a solid improvement to the WebSocket hibernation system.
PR Review: chore(rivetkit): implement new hibernating ws protocol
Overall, this is a solid refactor that improves the hibernating WebSocket protocol implementation. The changes simplify the architecture and improve reliability. Below are my detailed findings:
✅ Strengths
🐛 Potential Bugs
|
769c3f8 to
fc8f1fc
Compare
426a752 to
04be372
Compare
04be372 to
9d6c4c8
Compare
fc8f1fc to
e5ca471
Compare
PR Review: Hibernating WebSocket Protocol Implementation
Critical Issues
1. Import Error in
|
Code Review: Hibernating WebSocket Protocol Implementation
Summary
This PR implements a new hibernating WebSocket protocol for RivetKit. The changes refactor how WebSocket connections are managed, adding support for hibernatable connections that can survive actor restarts by persisting and restoring connection state.
Positive Observations
Code Quality
Implementation Strengths
Issues & Recommendations
Critical Issues
1. Bug: Missing import in tunnel.ts ❌
Location: tunnel.ts:7
import { wrappingLte } from "./utils";
Issue: The import references
2. Bug: Unused imports in tunnel.ts
|
9d6c4c8 to
449ac28
Compare
e5ca471 to
5b5466a
Compare
PR Review: Implement New Hibernating WebSocket Protocol
I've reviewed the changes and found several critical bugs that will prevent compilation, along with some recommendations for improvement.
Critical Bugs (Must Fix)
1. Import Name Mismatch in websocket-tunnel-adapter.ts
Location:
import { wrappingLte } from "./utils";
Issue: The function is named wrappingLteU16, not wrappingLte.
Fix: Change the import to:
import { wrappingLteU16 } from "./utils";
And update the usage on line 124:
if (wrappingLteU16(messageIndex, previousIndex)) {
2. Missing Request Parameter in tunnel.ts
Location:
for (const i = 0; i < requestIds.length; i++) {
const requestId = requestIds[i];
const messageIndex = messageIndices[i];
this.#createWebSocket(
actorId,
requestId,
idToStr(requestId),
true,
messageIndex,
); // Missing 6th parameter: request
}
Issue:
Fix: You need to either:
3. Commented Out Field Declarations Still Being Used
Location:
// // WebSocket message acknowledgment debouncing for hibernatable websockets
// #hibernatableWebSocketAckQueue: Map<
// string,
// { requestIdBuf: ArrayBuffer; messageIndex: number }
// > = new Map();
// #wsAckFlushInterval?: NodeJS.Timeout;
Issue: These fields are commented out but are being referenced throughout the file (lines 316, 339, 344, 348, 649, 653, 770, 834-836). This will cause compilation errors.
Fix: Uncomment these field declarations.
4. Invalid for Loop Declaration
Location:
for (const i = 0; i < requestIds.length; i++) {
Issue: Using const for a loop counter that i++ reassigns.
Fix: Change to:
for (let i = 0; i < requestIds.length; i++) {
5. Missing Implementation in actor-driver.ts
Location:
Issue: The PR changes
But the implementation in
Fix: The actor-driver.ts needs to be updated to implement the new API instead of the old one.
Code Quality Issues
6. Removed Return Statement
Location:
// TODO: assign request to ws so we can pass this through
Issue: The
Fix: Add a return statement:
// Store adapter
this.#actorWebSockets.set(requestIdStr, adapter);
return adapter;
7. WebSocket Constructor Signature Mismatch
Location:
The old constructor signature appears to be:
new WebSocketTunnelAdapter(
requestIdStr,
hibernatable,
request,
// ... callbacks
)
But based on the new code in websocket-tunnel-adapter.ts (lines 50-74), the signature should be:
new WebSocketTunnelAdapter(
tunnel,
actorId,
requestId,
hibernatable,
request,
messageIndex,
persistHibernatableWebSocketMessageIndex,
sendCallback,
closeCallback,
)
This needs to be verified and fixed.
Design Concerns
8. Message Index Persistence Strategy
Location:
The
9. Async to Sync Change Without Clear Justification
Location:
websocket: (
runner: Runner,
actorId: string,
ws: any,
requestId: protocol.RequestId,
request: Request,
) => void; // Changed from Promise<void> to void
Concern: This breaking change from async to sync might cause issues if implementations need to perform async operations during WebSocket setup. Consider whether this is intentional and document the reasoning.
Positive Aspects
Recommendations
Summary
This PR implements an important feature but has several compilation-blocking bugs that need to be fixed before it can be merged. The core logic appears sound, but the implementation is incomplete in several areas.
Recommendation: Address the 5 critical bugs before merging. The code quality issues should also be fixed but are less critical.
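The review names `wrappingLteU16` but the page does not show its body. The sketch below is one common way to compare 16-bit counters that wrap around (a serial-number style comparison). It is an assumption about the intent, not the implementation in `./utils`, and `isReplay` is a hypothetical wrapper that mirrors the check quoted in bug 1. Inputs are assumed to already lie in the range 0 to 65535.

```typescript
const U16_MODULUS = 0x10000; // number of distinct u16 values
const U16_HALF = 0x8000; // distances below this count as "ahead"

// Assumed semantics: true when `a` equals `b`, or when `b` is less than half
// the u16 range ahead of `a`. Under this rule 65535 counts as <= 0.
function wrappingLteU16(a: number, b: number): boolean {
  const forward = (b - a + U16_MODULUS) % U16_MODULUS;
  return forward < U16_HALF;
}

// An incoming index that is not ahead of the previous one is treated as a
// replayed or duplicate message.
function isReplay(messageIndex: number, previousIndex: number): boolean {
  return wrappingLteU16(messageIndex, previousIndex);
}
```

Under these assumed semantics, an index of 0 arriving after 65535 is accepted as new, while 65535 arriving after 0 is flagged as a replay.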
PR Review: chore(rivetkit): implement new hibernating ws protocol
Summary
This PR refactors the WebSocket tunnel implementation to support a new hibernating WebSocket (HWS) protocol. The changes affect the TypeScript runner SDK and RivetKit engine driver, implementing message sequencing, hibernation support, and improved lifecycle management.
Code Quality & Best Practices
✅ Strengths
|
449ac28 to
7fadc23
Compare
PR Review: Hibernatable WebSocket Protocol Implementation
Overview
This is a substantial and well-architected PR that implements a new hibernation protocol for WebSocket connections. The implementation spans both Rust (pegboard-gateway) and TypeScript (runner SDK, RivetKit) codebases, introducing the ability for WebSocket connections to persist across actor sleep/wake cycles.
Code Quality: ✅ Strong
Strengths
Potential Issues & Concerns
🔴 Critical Issues
🟡 High Priority Issues
🟢 Medium Priority Issues
Performance Considerations
✅ Good Practices
|
6d04024 to
d4f63d9
Compare
f50891c to
7410ff0
Compare
PR Review: Hibernating WebSocket Protocol Implementation
Overview
This is a substantial PR implementing a new hibernating WebSocket protocol for RivetKit. The changes enable WebSocket connections to persist while actors sleep/hibernate, which is a significant architectural improvement. Overall, the implementation is well-structured with good documentation and comprehensive test coverage.
Strengths ✅
1. Excellent Documentation
2. Strong Test Coverage
3. Good Error Handling
4. Architectural Improvements
Issues & Concerns 🔴
1. Critical: Race Condition Risk in Hibernation Flow
// engine/sdks/typescript/runner/src/tunnel.ts:397
// TODO: Switch this with runner WS
There's a TODO comment suggesting the current implementation has a known issue. This needs to be addressed before merging.
Location:
2. Memory Leak Potential
// rivetkit-typescript/packages/rivetkit/src/actor/instance/connection-manager.ts
The new
Recommendation: Add cleanup in
3. Error Handling: Unhandled Promise Rejection
// engine/sdks/typescript/runner/src/mod.ts:915
async #handleCommandStartActor(commandWrapper: protocol.CommandWrapper) {
// ...
try {
await this.#config.onActorStart(actorId, generation, actorConfig);
await this.#tunnel.restoreHibernatingRequests(
actorId,
startCommand.hibernatingRequestIds,
);
} catch (err) {
// error handling
}
}
The previous implementation spawned a background promise, but now it's awaited. This is actually better, but it changes the behavior. Make sure this doesn't cause timeouts or block other command processing.
4. Timeout Constants Need Review
const WEBSOCKET_OPEN_TIMEOUT: Duration = Duration::from_secs(15);
const TUNNEL_ACK_TIMEOUT: Duration = Duration::from_secs(5);
The WebSocket open timeout increased from 2s to 15s. While this might be necessary for hibernation, it could impact user experience if connections fail. Consider:
Location:
5. BARE Schema Breaking Change
The BARE schema changed from v2 to v3. Verify there's proper migration logic for existing persisted actors, or document that this is a breaking change requiring actor recreation.
6. Excessive Debug Logging
Multiple new debug logs were added throughout the codebase:
While helpful for development, consider:
Code Quality Issues
|
7410ff0 to
f50891c
Compare
Pull Request Review: Hibernating WebSocket Protocol Implementation
This is a substantial PR implementing hibernating WebSocket support, a complex feature that allows WebSocket connections to persist while actors sleep. I've reviewed the changes across Rust backend, TypeScript runner SDK, and RivetKit packages. Overall, the implementation is solid with good test coverage, though I have some observations and recommendations.
Positive Highlights
1. Excellent Documentation
2. Comprehensive Test Coverage
3. Robust Error Handling
4. Clean Architecture
Code Quality Issues
1. Timeout Constants Need Consistency Review
|
PR Review: Hibernating WebSocket Protocol Implementation
This is a substantial and well-architected PR that implements hibernatable WebSocket connections. I've reviewed the code across both Rust (engine/gateway) and TypeScript (runner/RivetKit) components. Overall, the implementation is solid with good attention to detail. Below are my findings:
✅ Strengths
Architecture & Design
Error Handling
Observability
Test Coverage
| Category | Rating | Notes |
|---|---|---|
| Architecture | ⭐⭐⭐⭐⭐ | Excellent separation and design |
| Error Handling | ⭐⭐⭐⭐ | Robust, but has critical bugs to fix |
| Testing | ⭐⭐⭐⭐ | Good coverage, could use more edge cases |
| Documentation | ⭐⭐⭐⭐⭐ | Excellent sequence diagrams |
| Performance | ⭐⭐⭐⭐ | Solid, watch for ack batching opportunity |
| Security | ⭐⭐⭐⭐ | Good limits in place |
🚨 Action Required Before Merge
- Fix critical timeout logic bug in shared_state.rs (Issue #1)
- Fix actor check logic in tunnel.ts (Issue #2)
- Review async command handling for race conditions (Issue #3)
- Fix memory leak in state-manager.ts (Issue #4)
- Remove or gate debug code in state-manager.ts (Issue #5)
✨ Overall Assessment
This is a high-quality PR with sophisticated distributed systems patterns. The core architecture is excellent, but there are critical bugs that must be fixed before merging. The hibernation protocol implementation is well-thought-out, and the test coverage demonstrates good understanding of the edge cases.
Recommendation: Request changes to fix critical issues, then approve.
Great work on tackling this complex feature! 🎉
7410ff0 to
f50891c
Compare
Code Review: Hibernating WebSocket Protocol Implementation
Overview
This PR implements a new hibernating WebSocket protocol that allows WebSocket connections to persist while actors go to sleep. This is a substantial change affecting both the Rust backend (pegboard-gateway, pegboard-runner) and TypeScript SDK (runner, rivetkit). The implementation includes comprehensive tests and documentation.
Strengths
Architecture & Design
Implementation Quality
Issues & Recommendations
Critical Issues
1. Missing Error Handling in Async Operations
if (commandWrapper.inner.tag === "CommandStartActor") {
// Spawn background promise
this.#handleCommandStartActor(commandWrapper);
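One way to keep the background start while still surfacing failures is to attach a rejection handler at the spawn site. The following is a sketch with hypothetical names (`spawnWithHandler`, `startActor`); it is not the runner's code.

```typescript
// Attach a rejection handler so a fire-and-forget task cannot produce an
// unhandled rejection. The returned promise resolves once the task settles
// and any error has been handed to `onError`.
function spawnWithHandler(
  task: Promise<void>,
  onError: (err: unknown) => void,
): Promise<void> {
  return task.catch(onError);
}

// Hypothetical stand-in for the actor start handler.
async function startActor(actorId: string): Promise<void> {
  if (actorId === "") {
    throw new Error("actorId must not be empty");
  }
}

const startErrors: string[] = [];
const pending = spawnWithHandler(startActor(""), (err) => {
  // Record the failure instead of letting it escape unobserved.
  startErrors.push(err instanceof Error ? err.message : String(err));
});
```

The caller can ignore `pending` in production code; it is returned here only so a test can wait for the handler to run.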
2. Timeout Not Enforced
// TODO: Add timeout to onActorStart
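A timeout around a start callback is commonly built with `Promise.race`. The sketch below assumes a generic helper named `withTimeout` (hypothetical, not present in the PR); it rejects after the deadline but does not cancel the underlying work.

```typescript
// Race `work` against a timer. The timer is cleared once either side settles
// so it cannot keep the process alive or fire later.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
  });
  return Promise.race([work, deadline]).finally(() => {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  });
}
```

A call site could then look like `await withTimeout(config.onActorStart(actorId, generation, actorConfig), 30_000, "onActorStart")`, where the 30-second figure is purely illustrative.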
3. Incomplete Crash Handling
// TODO: Mark as crashed
// Send stopped state update if start failed
await this.forceStopActor(actorId, generation);
Performance Concerns4. Unbounded Loop in Restoration
5. Message Buffering Without Backpressure
if hs.total_pending_ws_msgs_size > HWS_MAX_PENDING_MSGS_SIZE_PER_REQ
|| hs.pending_ws_msgs.len() >= u16::MAX as usize
Code Quality Issues6. Multiple Incomplete TODOs
7. HACK Comment Without Explanation
// HACK: Close socket in order to fix bug with Cloudflare leaving WS in closing state
8. Commented Out Code
Security & Correctness9. Request ID Collision Risk
/** Maps request IDs to actor IDs for lookup */
#requestToActor: Map<string, string> = new Map();
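The detail of this item did not survive on the page. As an assumption about the concern, the sketch below refuses to rebind a request ID that is already mapped to a different actor. `RequestRegistry` is a hypothetical class, not part of the runner SDK.

```typescript
// Hypothetical registry guarding the request-to-actor map against collisions.
class RequestRegistry {
  private readonly requestToActor: Map<string, string> = new Map();

  // Returns false when the request ID is already bound to another actor.
  // Re-registering the same pair is treated as a harmless no-op.
  register(requestId: string, actorId: string): boolean {
    const existing = this.requestToActor.get(requestId);
    if (existing !== undefined && existing !== actorId) {
      return false;
    }
    this.requestToActor.set(requestId, actorId);
    return true;
  }

  lookup(requestId: string): string | undefined {
    return this.requestToActor.get(requestId);
  }

  release(requestId: string): void {
    this.requestToActor.delete(requestId);
  }
}
```

Whether a collision should be rejected, logged, or should replace the older entry is a design decision the review text does not settle.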
10. Potential Race in State Persistence
Testing Gaps11. Missing Edge Case Tests
12. Test Timeout Values
await waitFor(driverTestConfig, HIBERNATION_SLEEP_TIMEOUT + 100);
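The detail of this item is missing from the page. If the concern is the fixed 100 ms margin, one common alternative is to poll for the expected condition up to a deadline. `waitUntil` is a hypothetical helper, not part of the driver test suite.

```typescript
// Poll `condition` until it returns true or the deadline passes.
async function waitUntil(
  condition: () => boolean,
  timeoutMs: number,
  intervalMs: number = 10,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!condition()) {
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    // Yield to the event loop between checks.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A test would then wait on an observable state (for example, a flag set when the actor reports it is asleep) with a generous upper bound, rather than sleeping for an exact duration.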
Documentation13. Missing Migration Guide
Minor Issues
Positive Patterns to Note
Overall Assessment
This is a well-architected and well-tested implementation of a complex feature. The hibernation protocol is thoughtfully designed with proper handling of restoration, message ordering, and connection lifecycle. The main concerns are around:
Recommendation: Request changes to address the critical issues (#1, #2, #3) and TODOs before merging. The other issues can be addressed in follow-up PRs if time is tight.
Code Quality: 8/10
Great work on this complex feature! 🚀
Code Review: Hibernating WebSocket Protocol Implementation
This PR implements a new hibernating WebSocket protocol that allows WebSocket connections to persist while actors sleep. This is a substantial and well-architected change with generally high code quality. Below are my findings:
✅ Strengths
1. Excellent Test Coverage
2. Well-Documented Architecture
3. Proper Error Handling Consolidation
4. Thorough Logging
|
Code Review: Hibernating WebSocket Protocol Implementation
This PR implements a new hibernating WebSocket protocol that allows WebSocket connections to persist while actors go to sleep. This is a substantial change with ~4,100 additions and ~3,200 deletions across 69 files. Overall, the implementation is well-architected with good test coverage and documentation. Below are my findings:
✅ Strengths
1. Excellent Documentation
2. Robust State Management
3. Comprehensive Test Coverage
4. Good Logging
5. Proper Error Handling
|
No description provided.