@bigbrett bigbrett commented Jan 22, 2026

Server thread safety

TL;DR: Makes the wolfHSM server safe to use in multithreaded scenarios.

Overview

This pull request implements thread-safe access to shared server resources in wolfHSM, specifically targeting the NVM (non-volatile memory) subsystem, whose lock also protects the global key cache. Crypto is left to a subsequent PR but is the likely next candidate.

Note that a server context itself still cannot be shared across threads without proper serialization by the caller. This PR adds the mechanisms such that, when multiple server contexts share an NVM instance (which includes the global keystore), access to those shared resources is properly serialized, allowing requests from multiple clients to be processed concurrently in separate threads.

Changes

  • Introduces a lock abstraction layer (wh_lock.{c,h}) with a callback-based design for platform independence
  • Adds an example POSIX lock implementation using pthread_mutex
  • Adds a server-level NVM locking API (wh_Server_NvmLock()/wh_Server_NvmUnlock()) with convenience macros WH_SERVER_NVM_LOCK()/WH_SERVER_NVM_UNLOCK()
  • All request handlers that access NVM or global keystore resources acquire the lock at the handler level before performing operations
  • Lower-level modules (NVM, keystore, counter, cert, etc.) remain lock-free; synchronization is the responsibility of the request handler layer
  • Thread-safe functionality is enabled with the WOLFHSM_CFG_THREADSAFE build option. When this option is NOT defined, all lock macros compile to no-ops with zero overhead
  • Adds a "thread safe stress test" to the test suite that attempts to flush out data races via a large number of contention cases; it is meant to be run under ThreadSanitizer
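
As a rough illustration of the callback-based design and the compile-time no-op macros described in the list above, here is a minimal sketch. Every name in it (DemoLockCb, DemoLock, DEMO_CFG_THREADSAFE, DEMO_LOCK, and so on) is hypothetical and chosen for this example only; it does not reproduce the actual wh_lock.h API or the WH_SERVER_NVM_LOCK() macros.

```c
#include <pthread.h>
#include <stddef.h>

/* All names below are hypothetical; they are not the wh_lock.h API. */

/* Platform callbacks: each returns 0 on success, nonzero on failure. */
typedef struct {
    int (*init)(void* ctx);
    int (*acquire)(void* ctx);
    int (*release)(void* ctx);
    int (*cleanup)(void* ctx);
} DemoLockCb;

/* A lock is a callback table plus an opaque platform context. */
typedef struct {
    const DemoLockCb* cb;
    void*             ctx;
} DemoLock;

/* POSIX backend: the context is a pthread_mutex_t. */
static int demoPosixInit(void* ctx)
{
    return pthread_mutex_init((pthread_mutex_t*)ctx, NULL);
}

static int demoPosixAcquire(void* ctx)
{
    return pthread_mutex_lock((pthread_mutex_t*)ctx);
}

static int demoPosixRelease(void* ctx)
{
    return pthread_mutex_unlock((pthread_mutex_t*)ctx);
}

static int demoPosixCleanup(void* ctx)
{
    return pthread_mutex_destroy((pthread_mutex_t*)ctx);
}

static const DemoLockCb demoPosixLockCb = {
    demoPosixInit, demoPosixAcquire, demoPosixRelease, demoPosixCleanup
};

/* Stand-in for the build option; remove the define to compile locking out. */
#define DEMO_CFG_THREADSAFE

#ifdef DEMO_CFG_THREADSAFE
#define DEMO_LOCK(lock)   ((lock)->cb->acquire((lock)->ctx))
#define DEMO_UNLOCK(lock) ((lock)->cb->release((lock)->ctx))
#else
/* No-ops: evaluate to success without touching the lock. */
#define DEMO_LOCK(lock)   ((void)(lock), 0)
#define DEMO_UNLOCK(lock) ((void)(lock), 0)
#endif
```

A caller would point a DemoLock at demoPosixLockCb and a pthread_mutex_t, call init once, and then bracket each critical section with DEMO_LOCK()/DEMO_UNLOCK(). A different platform would only need to supply its own callback table.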

Design Rationale

The locking strategy is intentionally simple: acquire the NVM lock at the start of a request handler, perform all operations (including any compound operations involving multiple NVM/cache accesses), then release the lock. This approach:

  1. Avoids TOCTOU issues - No risk of metadata becoming stale or objects being destroyed/replaced between checks
  2. Makes lock scope visible - Locking is explicit at the handler level rather than hidden in lower layers
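
The handler-level strategy above can be sketched with a toy in-memory object standing in for the shared NVM instance. The types and functions here (DemoNvm, demoNvmGetLen, demoNvmRead, demoHandleRead) are assumptions made for illustration and are not wolfHSM code. The point is only that the metadata check, the length clamp, and the read all happen inside one critical section owned by the handler.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a shared NVM instance and its lock. */
typedef struct {
    pthread_mutex_t lock;
    uint8_t         data[64];
    uint16_t        len; /* current object length, i.e. the "metadata" */
} DemoNvm;

/* Lower layer: lock-free by design; callers must hold nvm->lock. */
static uint16_t demoNvmGetLen(const DemoNvm* nvm)
{
    return nvm->len;
}

static void demoNvmRead(const DemoNvm* nvm, uint16_t off, uint16_t len,
                        uint8_t* out)
{
    memcpy(out, nvm->data + off, len);
}

/* Handler layer: returns the number of bytes read, or -1 on a bad offset.
 * The lock is held from the metadata check through the read, so the
 * length cannot change in between. */
static int demoHandleRead(DemoNvm* nvm, uint16_t off, uint16_t len,
                          uint8_t* out)
{
    int      rc;
    uint16_t objLen;

    pthread_mutex_lock(&nvm->lock);
    objLen = demoNvmGetLen(nvm);
    if (off >= objLen) {
        rc = -1;
    }
    else {
        /* Clamp the requested length to the object size. */
        if ((uint32_t)off + len > objLen) {
            len = (uint16_t)(objLen - off);
        }
        demoNvmRead(nvm, off, len, out);
        rc = (int)len;
    }
    pthread_mutex_unlock(&nvm->lock);
    return rc;
}
```

Because the lower-layer functions never lock, the handler is the single place a reviewer has to inspect to see what a request holds the lock across.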

Gaps/Future Work

  • Serializing access to global crypto state, specifically hardware crypto for ports. A bit of a tricky problem since offload is provided at the port level, and there isn't a good way for wolfHSM to know which algos will be accelerated and which won't. A naive implementation might consider simply locking the server crypto context, but this contains a mixture of local (CMAC) and quasi-global (RNG) elements and no abstraction for hardware. Locks also need to be synchronized with the wolfCrypt port mutex. We should refactor the server crypto context and perhaps split it into local and global structures, with the global supporting hardware state. Future work...

…ety,

serializing access to shared global resources like NVM and global keycache

@billphipps billphipps left a comment

Truly excellent! You solved this just the way I had hoped for!
My requested changes are very limited and not really functional. More just fleshing out the exact requirements for a real implementation and a few minor typos and renaming opportunities.

The stress testing framework is outstanding!

#include "wolfhsm/wh_lock.h"
#include "wolfhsm/wh_error.h"

#ifdef WOLFHSM_CFG_THREADSAFE
Contributor

Is this the best name? Consider the more mundane WOLFHSM_CFG_LOCKS. Threadsafe may imply more than just locks, like cancelability.

Contributor Author

yeah was kind of wishy washy on this. good point. Let me think on it.

Contributor

Consider adding posix to the name of this file, since it relies heavily on POSIX to provide any real functionality.

Member

Yeah it might be nice to organize our posix tests in one spot. maybe test/posix or port/posix/test/ so we can leave our wh_test_*.c stuff generic for all platforms

Contributor

I really like that solution. +1

Contributor Author

That is a good idea. Unfortunately, a lot of our generic test modules (e.g. wh_test_clientserver.c) contain both generic drivers and a POSIX harness (e.g. one that spins up the client and server threads). I think it might be best to push this out of scope of this PR and refactor the tests to better split generic test drivers (e.g. whTest_XXXClientCfg(whClientConfig*) and whTest_XXXClientCtx(whClientCtx*)) from the actual underlying test harness. I'd wager we could eliminate a lot of code that way, with one or two unified harnesses that the drivers just run on top of.

@rizlik rizlik left a comment

I didn't look into tests yet.
Great work.
Is this lock enough to properly synchronize client requests?
For example, in _HandleNvmRead:

    rc = wh_Nvm_GetMetadata(server->nvm, id, &meta);
    if (rc != WH_ERROR_OK) {
        return rc;
    }

    if (offset >= meta.len)
        return WH_ERROR_BADARGS;

    /* Clamp length to object size */
    if ((offset + len) > meta.len) {
        len = meta.len - offset;
    }

    rc = wh_Nvm_ReadChecked(server->nvm, id, offset, len, out_data);
    if (rc != WH_ERROR_OK)

The metadata can change between GetMetadata and ReadChecked.
Also, when handling a key request:

            /* get a new id if one wasn't provided */
            if (WH_KEYID_ISERASED(meta->id)) {
                ret     = wh_Server_KeystoreGetUniqueId(server, &meta->id);
                resp.rc = ret;
            }
            /* write the key */
            if (ret == WH_ERROR_OK) {
                ret     = wh_Server_KeystoreCacheKeyChecked(server, meta, in);
                resp.rc = ret;
            }

The id might no longer be unique by the time wh_Server_KeystoreCacheKeyChecked runs.

Would coarser-grained locking at the request level simplify the design?

API/Error handling:
- Add initialized flag to whLock structure to distinguish init states
- Enhance error handling: acquire/release check initialized flag
- Make wh_Lock_Cleanup zero structure for clear post-cleanup state
- Document init/cleanup must be single-threaded (no atomics)
- Document cleanup preconditions (no active contention required)
- Update all API docs with precise return codes and error conditions
- Change blocking acquire failure from ERROR_LOCKED to ERROR_ABORTED
- Add comment explaining why non-blocking acquire is not provided

POSIX port improvements:
- Enhanced errno mapping in posix_lock.c (EINVAL→BADARGS, etc)
- Trap PTHREAD_MUTEX_ERRORCHECK errors (EDEADLK, EPERM)

Test coverage:
- Add testUninitializedLock to validate error handling
- Enhance testLockLifecycle with post-cleanup validation tests

Misc:
- Apply consistent critical section style pattern in wh_nvm.c
- Update copyright years to 2026
- Rename stress test files to wh_test_posix_threadsafe_stress.*
@bigbrett
Contributor Author

@rizlik great catch, thanks. I thought I fixed all of those but clearly there are some non-atomic compound operations still lurking. I will make another pass to ensure I make them all atomic.

rizlik commented Jan 27, 2026

> @rizlik great catch, thanks. I thought I fixed all of those but clearly there are some non-atomic compound operations still lurking. I will make another pass to ensure I make them all atomic.

I wonder: if we are going to use a single lock, can't we just acquire it at the start of wh_Server_HandleKeyRequest and release it at the end (and the same for wh_Server_HandleNvmRequest)?

It's probably a tradeoff: we gain simplicity because we don't need locked vs. unlocked APIs, but there is a risk that other parts of the code misuse the NVM API and introduce races in the future.

@bigbrett
Contributor Author

> It's probably a tradeoff: we gain simplicity because we don't need locked vs. unlocked APIs, but there is a risk that other parts of the code misuse the NVM API and introduce races in the future.

@rizlik yep that is what I was worried about and why I didn't initially try it that way ¯\_(ツ)_/¯

I'm not 100% sold on which is better

…nter, img_mgr, and nvm modules

Adds proper thread-safety locking discipline to additional server modules that
perform compound NVM operations. This prevents TOCTOU (Time-Of-Check-Time-Of-Use)
issues where metadata could become stale between check and use/writeback.

Changes:
- wh_server_cert.c: Add NVM locking for atomic GetMetadata + Read operations in
  certificate read and export paths
- wh_server_counter.c: Add NVM locking for atomic read-modify-write counter
  increment operations
- wh_server_img_mgr.c: Add NVM locking for atomic signature load operations
- wh_server_keystore.c: Refactor to use unlocked internal variants for compound
  operations (GetUniqueId + CacheKey, policy check + erase, freshen + export).
  Add locking discipline documentation.
- wh_server_nvm.c: Add NVM locking for DMA read operations to ensure metadata
  remains valid throughout transfer. Add locking discipline documentation.
- wh_test_posix_threadsafe_stress.c: Add new stress test phases for counter
  concurrent increment, counter increment vs read, NVM read vs resize, NVM
  concurrent resize, and NVM read DMA vs resize. Add counter atomicity validation.

All compound operations now follow the pattern:
1. Acquire server->nvm->lock
2. Use only *Unlocked() variants internally
3. Keep lock held for entire operation including DMA
4. Release lock after all metadata-dependent operations complete
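
The four-step pattern above can be sketched for the counter read-modify-write case. This is an illustrative toy under stated assumptions: DemoNvm, the demo*Unlocked helpers, and the worker harness are all hypothetical names, and a plain integer stands in for the stored counter object.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_THREADS    4
#define DEMO_INCREMENTS 10000

/* Hypothetical shared NVM instance holding a single counter. */
typedef struct {
    pthread_mutex_t lock;
    uint32_t        counter;
} DemoNvm;

/* *Unlocked() variants: the caller must already hold nvm->lock. */
static uint32_t demoCounterReadUnlocked(const DemoNvm* nvm)
{
    return nvm->counter;
}

static void demoCounterWriteUnlocked(DemoNvm* nvm, uint32_t value)
{
    nvm->counter = value;
}

/* Compound operation: the whole read-modify-write is one critical section. */
static uint32_t demoCounterIncrement(DemoNvm* nvm)
{
    uint32_t value;

    pthread_mutex_lock(&nvm->lock);               /* 1. acquire           */
    value = demoCounterReadUnlocked(nvm) + 1;     /* 2. unlocked variants */
    demoCounterWriteUnlocked(nvm, value);         /* 3. lock still held   */
    pthread_mutex_unlock(&nvm->lock);             /* 4. release           */
    return value;
}

static void* demoWorker(void* arg)
{
    DemoNvm* nvm = (DemoNvm*)arg;
    int      i;

    for (i = 0; i < DEMO_INCREMENTS; i++) {
        (void)demoCounterIncrement(nvm);
    }
    return NULL;
}

/* Runs the workers concurrently and returns the final counter value. */
static uint32_t demoRunWorkers(void)
{
    DemoNvm   nvm;
    pthread_t threads[DEMO_THREADS];
    int       started = 0;
    int       i;

    nvm.counter = 0;
    pthread_mutex_init(&nvm.lock, NULL);
    for (i = 0; i < DEMO_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, demoWorker, &nvm) != 0) {
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    pthread_mutex_destroy(&nvm.lock);
    return nvm.counter;
}
```

If every thread starts, no increment is lost and the final value equals DEMO_THREADS * DEMO_INCREMENTS. Moving the lock inside the read and write helpers individually would let two threads read the same value and lose an update.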

@AlexLanzano AlexLanzano left a comment

Looks really good so far!

My main concern is the addition of the *Unlocked functions. I feel like there has to be a way to remove those and still use the top-level API functions, either by checking whether the current thread has already acquired the NVM lock or by creating a lock for both the keystore and the NVM.

…vel server module APIs (keystore, NVM, counter, etc.) and acquire lock in request handling functions (e.g. wh_Server_HandleXXXRequest())
@bigbrett bigbrett assigned bigbrett and unassigned AlexLanzano and billphipps Jan 28, 2026