fix: [SDK-4388] subscription state permanently stuck at "Never Subscribed" when login() is called before requestPermission()#2627
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes SDK-4388 where calling OneSignal.login() before requestPermission() could leave the backend push subscription permanently stuck in a “Never Subscribed / Permission Not Granted” state by ensuring “create with non-local subscriptionId” is executed as an update (PATCH) rather than a create (POST).
Changes:
- Route
CreateSubscriptionOperationwith a non-localsubscriptionIdthrough a new “update existing subscription” path (PATCH) instead of POSTing/subscriptions. - Stop including
subscriptionIdin the create-subscription request payload (always POST withid = null) for the local-id create path. - Update/extend unit tests to assert PATCH behavior for non-local create, including the grouped create+update repro.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/operations/impl/executors/SubscriptionOperationExecutor.kt |
Adds branching + helper to PATCH when CreateSubscriptionOperation carries a non-local subscription id; ensures POST create path doesn’t send an id. |
OneSignalSDK/onesignal/core/src/test/java/com/onesignal/user/internal/operations/SubscriptionOperationExecutorTests.kt |
Updates existing test expectations and adds a repro test for grouped create+update to ensure a single PATCH and no POST. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Adds three tests covering branches in `updateExistingSubscriptionFromCreate` that the SDK-4388 fix introduced but didn't fully exercise: 1. Create + Delete with non-local id is a successful no-op (locks in the delete short-circuit so a future refactor can't accidentally route to PATCH and resurrect a deleted subscription). 2. Network-retryable backend errors on the PATCH path propagate as FAIL_RETRY with retryAfterSeconds (matches the local-id Create behavior and ensures errors aren't swallowed). 3. Multiple batched updates with a non-local-id Create collapse into a single PATCH carrying the *last* update's values (locks in last-wins semantics; mirrors the existing local-id POST test). These complement the two existing non-local-id tests and bring coverage for the new path to parity with the local-id Create path. Made-with: Cursor
When updateSubscription gets a 404 outside the missing-retry window we re-enqueue a CreateSubscriptionOperation so OperationRepo can rebuild the subscription. Reusing the original non-local subscriptionId on that recovery op routes it back through execute() -> updateExistingSubscriptionFromCreate -> PATCH, which immediately 404s again and re-enqueues another non-local Create, looping indefinitely without backoff (observed at ~4-6 PATCH/sec until the backend rate-limits with 429s). Generate a fresh local id via IDManager.createLocalId() for the recovery Create. A local id forces execute() down the POST/rebuild-user path in createSubscription, which is the one that can actually re-create the subscription on the backend.
Locks in the recovery shape after an update-subscription 404 outside the missing-retry window: one CreateSubscriptionOperation in the response operations, with a fresh local subscriptionId (not the original non-local one) so the next execute() dispatches it through the POST / rebuild-user path instead of looping back into PATCH -> 404. All other fields (appId, onesignalId, externalId, type, enabled, address, status) must propagate from the failed update so the recovered subscription matches the desired state.
24a3623 to
37a793c
Compare
📊 Diff Coverage ReportDiff Coverage Report (Changed Lines Only)Gate: aggregate coverage on changed executable lines must be ≥ 80% (JaCoCo line data for lines touched in the diff). Changed Files Coverage
Overall (aggregate gate)0/27 touched executable lines covered (0.0% — requires ≥ 80%) Per-file detail (informational; gate is aggregate above):
❌ Coverage Check FailedAggregate coverage on touched lines is 0.0% (minimum 80%). |
Strip "SDK-4388" prefixes from comments in SubscriptionOperationExecutor and its test class. The descriptive context that followed the ticket id already explains the why; the ticket id itself is noise for future maintainers and is preserved in git history / commit messages.
720e7ea to
3117820
Compare
…l id
When PATCH /subscriptions/{id} returns 404 outside the missing-retry
window, updateSubscription() already mints a fresh local id for the
recovery CreateSubscriptionOperation to break the PATCH -> 404 loop.
However, the cached SubscriptionModel and ConfigModel.pushSubscriptionId
still hold the stale non-local id, so the rebuild-user path
(LoginUserOperationExecutor.getRebuildOperationsIfCurrentUser) emits a
second CreateSubscriptionOperation carrying that stale id. Both Creates
end up in the POST /users payload and the backend creates two
subscription rows for the same device.
Rewrite the cached SubscriptionModel id and pushSubscriptionId to the
new local id (HYDRATE so no listener reacts) before returning the
recovery op. The rebuild path then emits a Create with the same local
id, dedupe collapses the two ops, and POST /users carries a single
subscription as it does on the un-fixed path.
… on PATCH 404 Extend the PATCH-404 recovery regression test to assert that the cached SubscriptionModel id and ConfigModel.pushSubscriptionId are rewritten to the same fresh local id used on the recovery CreateSubscriptionOperation. Without this realignment the rebuild-user path would emit a duplicate Create with the stale non-local id and LoginUserOperationExecutor's POST /users would create two subscription rows on the backend for the same device.
Summary
Fixes SDK-4388. On a fresh install, when
OneSignal.login(externalId)is called afterinitWithContextbut beforerequestPermission(), the server-side subscription gets permanently stuck atenabled:false, notification_types:0("Never Subscribed / Permission Not Granted") even though the device is fully subscribed locally (optedIn == true).Cherry-picked from @fadii925's
fadi/sdk-4388-...branch (which is based on5.7-main) ontomain. Authorship preserved.Repro (from the ticket)
initWithContextruns, SDK POSTs/apps/.../users→ server responds 201 with a server-assignedsubscriptionId.update-subscription(NO_PERMISSION).OneSignal.login(externalId)→ SDK runscreateAndSwitchToNewUser(), which enqueuescreate-subscriptionreusing the server-assignedsubscriptionId+login-userwithexistingOnesignalId.requestPermission()→ user grants → SDK enqueuesupdate-subscription(SUBSCRIBED).update-subscription(NO_PERMISSION)→PATCH /subscriptions/{id}→ 200. Server:enabled:false, notification_types:0.login-user→PATCH .../identity→SUCCESS_STARTING_ONLY. Remaining ops translated from local user onto server user.create-subscription+update-subscription(SUBSCRIBED)batched into onePOST .../subscriptionswith{ id:"<server-uuid>", enabled:true, notification_types:1 }→200 {}(no-op, subscription already exists). Theenabled:trueis silently discarded.enabled:false, notification_types:0permanently.Root cause
SubscriptionOperationExecutor.createSubscriptionwas always issuing aPOST /subscriptionsforCreateSubscriptionOperation, regardless of whether the carriedsubscriptionIdwas local or already a server-assigned UUID. When the id is already known to the backend, POST is silently treated as a no-op and the mergedenabled/address/statuspayload is dropped.This happens specifically on
login()becauseUserSwitcher.createAndSwitchToNewUserreuses the existing push subscription's id under the new local user, andSubscriptionModelStoreListener.getAddOperationmechanically converts the resulting model add into aCreateSubscriptionOperationcarrying that real id. Combined withGroupComparisonType.ALTERgrouping it with the follow-upupdate-subscription(SUBSCRIBED), the executor ends up POSTing a doomed batch.Fix
In
SubscriptionOperationExecutor.execute(), branch onIDManager.isLocalId(startingOp.subscriptionId)forCreateSubscriptionOperation:createSubscriptionpath →POST /subscriptions.updateExistingSubscriptionFromCreatehelper which collapses the merged final state (enabled/address/statusfrom the lastUpdateSubscriptionOperationif present, otherwise from the create op) into anUpdateSubscriptionOperationand dispatches viaupdateSubscription→PATCH /subscriptions/{id}.This implements option (a) from the ticket. Option (b) (drop the create op) was rejected because it would no-op when there's no follow-up
UpdateSubscriptionOperationin the batch.The delete short-circuit (
if (operations.any { it is DeleteSubscriptionOperation }) return SUCCESS) is preserved in the new helper.Customer impact
e198f6f8-f233-4073-8380-4a3c79ae64e7requestPermission()beforelogin(), or REST-API correct stuck subscriptions viaPATCH /apps/{app_id}/subscriptions/{subscription_id}with{"subscription": {"notification_types": 1}}.Test plan
"create subscription with non-local id PATCHes instead of POSTing when grouped with update"reproduces the SDK-4388 scenario (Create(NO_PERMISSION) + Update(SUBSCRIBED) with non-local id) and asserts a single PATCH withenabled:true, notificationTypes=SUBSCRIBED, zero POSTs."create subscription includes the subscription ID if non-local"rewritten to assert PATCH (the previous "POST with existing id" behavior is no longer correct)../gradlew :OneSignal:core:testReleaseUnitTestgreen.enabled:true, notification_types:1.Followups (not in this PR)
The root modeling issue is that
UserSwitcher.createAndSwitchToNewUserre-attaches an existing server-assigned subscription model under a new local user, which the model-store listener interprets as a brand-new subscription needing creation. A follow-up should consider either:TransferSubscriptionOperation(or no-op) instead ofCreateSubscriptionOperationin the listener whenIDManager.isLocalId(model.id)is false, orNO_PROPOGATEincreateAndSwitchToNewUserand explicitly enqueuing a transfer op.There is also a sibling code path in
LoginUserOperationExecutor.createSubscriptionsFromOperation(CreateSubscriptionOperation, ...)that does the sameif (!isLocalId) operation.subscriptionId else nulltrick when sending to/users(the non-SUCCESS_STARTING_ONLYpath), and is not addressed here.Backport
This fix should also land on
5.7-mainfor the next 5.7.x patch release. @fadii925's original branch is already based on5.7-mainand can be PR'd there directly: https://github.com/OneSignal/OneSignal-Android-SDK/tree/fadi/sdk-4388-android-sdk-subscription-state-permanently-stuck-at-never