Skip to content

[SDK-4725] Preserve user identity across iOS prewarm launches#1666

Open
nan-li wants to merge 8 commits into
mainfrom
nan/sdk-4725
Open

[SDK-4725] Preserve user identity across iOS prewarm launches#1666
nan-li wants to merge 8 commits into
mainfrom
nan/sdk-4725

Conversation

@nan-li
Copy link
Copy Markdown
Contributor

@nan-li nan-li commented May 29, 2026

READ AND DELETE THIS SECTION BEFORE SUBMITTING PR

  • Fill out each REQUIRED section
  • Fill out OPTIONAL sections, remove section if it doesn't apply to your PR
  • Read and fill out each of the checklists below
  • Remove this section after reading

Description

One Line Summary

REQUIRED - Very short description that summaries the changes in this PR.

Details

Motivation

REQUIRED - Why is this code change being made? Or what is the goal of this PR? Examples: Fixes a specific bug, provides additional logging to debug future issues, feature to allow X.

Scope

RECOMMEND - OPTIONAL - What is intended to be effected. What is known not to change. Example: Notifications are grouped when parameter X is set, not enabled by default.

OPTIONAL - Other

OPTIONAL - Feel free to add any other sections or sub-sections that can explain your PR better.

Testing

Unit testing

OPTIONAL - Explain unit tests added, if not clear in the code.

Manual testing

RECOMMEND - OPTIONAL - Explain what scenarios were tested and the environment.
Example: Tested opening a notification while the app was foregrounded, app build with Android Studio 2020.3 with a fresh install of the OneSignal example app on a Pixel 6 with Android 12.

Affected code checklist

  • Notifications
    • Display
    • Open
    • Push Processing
    • Confirm Deliveries
  • Outcomes
  • Sessions
  • In-App Messaging
  • REST API requests
  • Public API changes

Checklist

Overview

  • I have filled out all REQUIRED sections above
  • PR does one thing
    • If it is hard to explain how any codes changes are related to each other then it most likely needs to be more than one PR
  • Any Public API changes are explained in the PR details and conform to existing APIs

Testing

  • I have included test coverage for these changes, or explained why they are not needed
  • All automated tests pass, or I explained why that is not possible
  • I have personally tested this on my device, or explained why that is not possible

Final pass

  • Code is as readable as possible.
    • Simplify with less code, followed by splitting up code into well named functions and variables, followed by adding comments to the code.
  • I have reviewed this PR myself, ensuring it meets each checklist item
    • WIP (Work In Progress) is ok, but explain what is still in progress and what you would like feedback on. Start the PR title with "WIP" to indicate this.

@nan-li
Copy link
Copy Markdown
Contributor Author

nan-li commented May 29, 2026

@claude review

Comment on lines +269 to +286
} else if !hasCachedUser,
_user == nil,
let cachedSubscriptionId = OSResilientStorage.string(forKey: OSResilientStorage.keySubscriptionId),
!cachedSubscriptionId.isEmpty
{
// TODO: In this state, we don't want to persist anything to UserDefaults and overwrite correct data with half data.
// TODO: However, in this state where we can't read UserDefaults, does writing to it fail automatically?
// Path 2.5. Model stores loaded empty but the resilient cache shows we've
// initialized before: the prewarm-before-first-unlock case. Recover identity
// locally; missing properties/aliases will rehydrate on next fetchUser.
let cachedOnesignalId = OSResilientStorage.string(forKey: OSResilientStorage.keyOneSignalId)
OneSignalLog.onesignalLog(.LL_DEBUG, message: "OneSignalUserManager: recovering user from cache subscription_id=\(cachedSubscriptionId) onesignal_id=\(cachedOnesignalId ?? "(nil)")")

let pushSubscriptionModel = createDefaultPushSubscription(subscriptionId: cachedSubscriptionId)
let recoveredUser = setNewInternalUser(externalId: nil, pushSubscriptionModel: pushSubscriptionModel)
if let onesignalId = cachedOnesignalId, !onesignalId.isEmpty {
recoveredUser.identityModel.hydrate([OS_ONESIGNAL_ID: onesignalId])
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Path 2.5 recovery (OneSignalUserManagerImpl.swift lines 269-286) introduces a data-loss risk: setNewInternalUser() triggers identityModelStore.add()/propertiesModelStore.add()/pushSubscriptionModelStore.add(), each of which writes the freshly-created stub models to shared UserDefaults via saveCodeableData. But Path 2.5 only fires when those same UserDefaults reads returned nil — the prewarm-before-first-unlock case — and per Apple/cfprefsd behavior the answer to the inline TODO ("does writing to it fail automatically?") is unfortunately no: writes are typically buffered in memory and flushed to disk after first unlock, replacing the real cached models with stubs that have only subscription_id/onesignal_id. As a secondary concern at the same site, setNewInternalUser(externalId: nil, ...) makes a previously-logged-in user transiently appear anonymous (user.isAnonymous == true), so an app calling OneSignal.login(savedExternalId) during the window hits the identifyUser path at line 328 instead of being a no-op, firing prepareForNewUser(), notifying OS_ON_USER_WILL_CHANGE observers, and sending an unnecessary Identify request. Consider gating Path 2.5 behind UIApplication.isProtectedDataAvailable (or waiting for UIApplicationProtectedDataDidBecomeAvailable), keeping the recovered models in-memory only until protected data is readable, or skipping the store.add() writes in the recovery path.

Extended reasoning...

What the bug is

Path 2.5 (OneSignalUserManagerImpl.swift:269-286) is reached precisely when shared-UserDefaults reads for the cached identity/properties/push-subscription models returned empty — the prewarm-before-first-unlock case under NSFileProtectionCompleteUntilFirstUserAuthentication. In that state the SDK builds a stub user via setNewInternalUser(externalId: nil, pushSubscriptionModel: ...) and hydrates only OS_ONESIGNAL_ID. Two separable problems follow.

Problem 1 — Stub writes can clobber the real cached models on disk

setNewInternalUser invokes identityModelStore.add(), propertiesModelStore.add(), and pushSubscriptionModelStore.add(). Each OSModelStore.add() calls OneSignalUserDefaults.initShared().saveCodeableData(forKey: storeKey, withValue: self.models), which in OneSignalUserDefaults.m performs setObject:[NSKeyedArchiver archivedDataWithRootObject:value] forKey:key followed by synchronize. So Path 2.5 unconditionally writes freshly-created stub models (only subscription_id + onesignal_id populated; fresh modelIds; nil external_id, aliases, language, tags, timezone, token, deviceModel, appVersion, notificationTypes) to the shared App Group UserDefaults plist.

The author flagged this exact concern inline:

// TODO: In this state, we don't want to persist anything to UserDefaults and overwrite correct data with half data.
// TODO: However, in this state where we can't read UserDefaults, does writing to it fail automatically?

The answer to the second TODO is no. Under NSFileProtectionCompleteUntilFirstUserAuthentication, NSUserDefaults / cfprefsd buffers writes in memory while the file is locked and flushes them to disk after first unlock. The write is not silently dropped — it just gets deferred. When the device unlocks, the stub overwrites the real on-disk plist.

Problem 2 — Recovered user looks anonymous to login()

setNewInternalUser is called with externalId: nil because external_id is intentionally not mirrored to OSResilientStorage (PII; see comment in OSIdentityModel.swift). So during the recovery window:

  • OSUserInternalImpl.isAnonymous (identityModel.externalId == nil) returns true.
  • login() at line 328 branches on user.isAnonymous: an app calling OneSignal.login(savedExternalId) (a very common per-launch pattern) takes the identifyUser path instead of being a no-op.
  • identifyUser calls prepareForNewUser() (posts OS_ON_USER_WILL_CHANGE, clears the subscription model store) and dispatches an OSRequestIdentifyUser network call.
  • fireUserStateChanged (via hydrate([OS_ONESIGNAL_ID: onesignalId])) notifies OSUserStateObservers with newExternalId=nil, even though the original user had one.

Step-by-step proof

  1. Launch 0 (normal): User logged in with externalId "alice". Cached: identity model with aliases={onesignal_id: "abc", external_id: "alice"}; properties with language="en", tags={tier: "gold"}; push sub with token="xyz...", subscription_id="sub-123". OSResilientStorage holds {subscription_id: "sub-123", onesignal_id: "abc"}.
  2. Launch 1 (prewarm before first unlock): App is prewarmed by iOS. Shared UserDefaults plist is unreadable → identityModelStore.getModels()[OS_IDENTITY_MODEL_KEY] returns nil → hasCachedUser=false. OSResilientStorage (file-protection .none) IS readable → cachedSubscriptionId="sub-123" → Path 2.5 entered.
  3. Path 2.5 executes setNewInternalUser(externalId: nil, pushSubscriptionModel: createDefaultPushSubscription(subscriptionId: "sub-123")). This calls identityModelStore.add(...), propertiesModelStore.add(...), pushSubscriptionModelStore.add(...). Each invokes OneSignalUserDefaults.initShared().saveCodeableData. These writes are buffered by cfprefsd in memory (file still locked).
  4. App invokes OneSignal.login("alice") from its launch routine. _user.isAnonymous == true (externalId is nil) → branches into identifyUserprepareForNewUser() posts OS_ON_USER_WILL_CHANGE and clears subscriptionModelStore → an OSRequestIdentifyUser is sent for a user that already has that external_id (should have been a no-op).
  5. First unlock. cfprefsd flushes buffered writes to disk. The on-disk identity/properties/pushSubscription plists are now stub models. external_id="alice", language="en", tags={tier: "gold"}, token="xyz..." are gone from disk. Only subscription_id and onesignal_id survive (in OSResilientStorage).
  6. Launch 2. Shared UserDefaults is readable; the SDK loads the stub models from cache. External_id of "alice" is no longer available locally. It will rehydrate eventually via fetchUser, but tags/language/timezone deltas that were queued but not yet sent are permanently lost — and any in-flight executor request keyed on the original modelId is now disconnected from the new model.

Addressing the bug_004 refutation

One verifier argued bug_004 is a known design constraint (external_id intentionally not cached for PII) and largely overlaps with bug_001. That's correct that the two share the same code site and the fix likely addresses both — which is why this is filed as a single merged comment. But the two angles aren't redundant: bug_001 is the on-disk overwrite risk; bug_004 is the in-memory misclassification that misfires regardless of whether writes are buffered or dropped. The author's TODOs flag the write concern as unresolved (not as accepted), and isAnonymous returning the wrong answer for a previously-identified user is a real behavioral regression independent of the on-disk question. Mentioning both in one comment lets the author choose a single fix that covers both (e.g., gating Path 2.5 behind isProtectedDataAvailable).

How to fix

Pick one (or combine):

  • Gate Path 2.5 on UIApplication.shared.isProtectedDataAvailable. If protected data is unavailable, do not enter the recovery branch; subscribe to UIApplicationProtectedDataDidBecomeAvailable and re-run start() then. This eliminates both the buffered-write risk and the anonymous-misclassification window because the SDK has no "recovered" user until UserDefaults is genuinely readable.
  • Keep recovered models in-memory only. Build the stub user but do NOT call identityModelStore.add()/propertiesModelStore.add()/pushSubscriptionModelStore.add() until first unlock has fired. Persist on transition to readable state, after re-reading the real cache to merge.
  • Skip Path 2.5 entirely until protected data is available and stay in the no-user state for the prewarm process — most NSE/prewarm work doesn't need the user object at all, only subscription_id and app_id (already covered separately by the NSE-side OSResilientStorage fallback in this PR).

Comment on lines +562 to 568
// Drop the cached identifiers: a real app-id change invalidates them.
[OSResilientStorage setStrings:@{
OSResilientStorage.keySubscriptionId: @"",
OSResilientStorage.keyOneSignalId: @"",
}];

// Clear all cached data, does not start User Module nor call logout.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 In handleAppIdChange, the setStrings: call only clears keySubscriptionId and keyOneSignalId from OSResilientStorage, but keyReceiveReceiptsEnabled is left stale. Since receive_receipts_enabled is a per-app remote parameter (fetched by downloadIOSParamsWithAppId), the previous app's cached value persists until the new app's iOS params download completes. Fix: add OSResilientStorage.keyReceiveReceiptsEnabled: @"" to the existing setStrings dictionary.

Extended reasoning...

What the bug is

This PR introduces a mirror of the receive_receipts_enabled flag in OSResilientStorage (in downloadIOSParamsWithAppId) and adds a fallback read in OneSignalReceiveReceiptsController.isReceiveReceiptsEnabled so the NSE can read it while the device is locked. However, handleAppIdChange in OneSignal.m (lines 562-566) only invalidates a subset of cached keys on app id change:

[OSResilientStorage setStrings:@{
    OSResilientStorage.keySubscriptionId: @"",
    OSResilientStorage.keyOneSignalId: @"",
}];

keyReceiveReceiptsEnabled is not included, so the previous app's value sticks around.

Why existing code doesn't prevent it

receive_receipts_enabled is per-app — downloadIOSParamsWithAppId fetches it for the current appId and writes it to both shared UserDefaults and OSResilientStorage. It's only overwritten when the new app's iOS params download completes. Between app-id change and that download, the old value is what isReceiveReceiptsEnabled will see via the new fallback path.

Step-by-step proof

  1. Developer/tester runs the app with App A (receive_receipts_enabled=YES). downloadIOSParamsWithAppId writes "1" to OSResilientStorage.keyReceiveReceiptsEnabled.
  2. Developer switches the SDK to App B (receive_receipts_enabled=NO).
  3. App launches → handleAppIdChange runs. It clears keySubscriptionId and keyOneSignalId from OSResilientStorage but leaves keyReceiveReceiptsEnabled = "1". appId is updated to App B.
  4. Before the new app's iOS params download completes, a notification arrives and the NSE is invoked while the device is in the prewarm/locked state (shared UserDefaults returns NO/nil).
  5. OneSignalReceiveReceiptsController.isReceiveReceiptsEnabled reads NO from UserDefaults, then falls back to OSResilientStorage.string(forKey: keyReceiveReceiptsEnabled) which returns "1" → returns YES.
  6. A confirmed delivery receipt is sent for App B even though App B's configuration disables it.

Pre-PR, the equivalent flow would have returned the safer NO default during the locked window, so this PR slightly worsens the staleness exposure for this specific flag.

Impact

Low. The trigger requires (a) an app-id change — overwhelmingly a developer/test scenario — AND (b) an NSE invocation in the booted-but-locked window AND (c) before the new app's iOS params download completes. The blast radius is a stray confirmed-delivery receipt (or a suppressed one in the opposite direction).

Fix

Trivial — include the key with an empty string in the existing setStrings dict in handleAppIdChange:

[OSResilientStorage setStrings:@{
    OSResilientStorage.keySubscriptionId: @"",
    OSResilientStorage.keyOneSignalId: @"",
    OSResilientStorage.keyReceiveReceiptsEnabled: @"",
}];

Note: a symmetric staleness exists in the shared UserDefaults copy (OSUD_RECEIVE_RECEIPTS_ENABLED) which this PR also doesn't clear, so the underlying inconsistency is partially pre-existing — but the PR adds a new code path (the resilient-storage fallback) that makes the cache value load-bearing during the lockup window, so it's worth tightening here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant