Multi-node clustering — link NeuralDrive instances for combined compute #12

@eshork

Description

Idea

Allow multiple NeuralDrive instances on a network to discover each other and pool their compute resources. Adding more USB-booted nodes would scale inference capacity — more GPUs, more VRAM, more concurrent requests.

Open Questions

  • Discovery: mDNS/Avahi auto-discovery vs manual node registration? (A rough discovery sketch follows this list.)
  • Load balancing vs distributed inference: Simple request-level load balancing (route whole requests to nodes with capacity) is straightforward. Distributed inference across nodes (tensor parallelism over the network) is much harder and may not be practical over commodity networking.
  • Model placement: Which node holds which model? Replicate popular models across nodes, or dedicate nodes to specific models?
  • Coordination: Does one node act as a primary/coordinator, or is it fully peer-to-peer?
  • API surface: Should the cluster present a single unified API endpoint, or does each node remain independently addressable?
  • State: Shared model registry? Synchronized config? Or fully independent nodes behind a load balancer?
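
To make the discovery option above more concrete, here is a minimal sketch assuming the third-party python-zeroconf package and a made-up `_neuraldrive._tcp` service type; none of this is decided, it just shows the shape of mDNS auto-discovery: each node advertises itself and keeps a live set of peers it has browsed.

```python
# Sketch only: assumes the python-zeroconf package and a hypothetical
# "_neuraldrive._tcp.local." service type; nothing here is settled.
import socket
from zeroconf import Zeroconf, ServiceInfo, ServiceBrowser, ServiceListener

SERVICE_TYPE = "_neuraldrive._tcp.local."

class NodeListener(ServiceListener):
    """Collects peer NeuralDrive nodes as they appear on the LAN."""
    def __init__(self):
        self.peers = {}  # service name -> (ip, port)

    def add_service(self, zc, type_, name):
        info = zc.get_service_info(type_, name)
        if info and info.addresses:
            addr = socket.inet_ntoa(info.addresses[0])
            self.peers[name] = (addr, info.port)

    def remove_service(self, zc, type_, name):
        self.peers.pop(name, None)

    def update_service(self, zc, type_, name):
        pass  # required by the listener interface; nothing to do here

def advertise(zc, hostname, ip, port):
    """Announce this node so other NeuralDrive instances can find it."""
    info = ServiceInfo(
        SERVICE_TYPE,
        f"{hostname}.{SERVICE_TYPE}",
        addresses=[socket.inet_aton(ip)],
        port=port,
        properties={"vram_gb": "24"},  # hypothetical capability hint
    )
    zc.register_service(info)
    return info

if __name__ == "__main__":
    zc = Zeroconf()
    listener = NodeListener()
    advertise(zc, socket.gethostname(), "192.168.1.10", 8080)  # placeholder IP/port
    ServiceBrowser(zc, SERVICE_TYPE, listener)
    # listener.peers now fills in as other nodes register themselves
```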

Initial Thought

A simple load-balancer approach (Caddy upstream pool, round-robin or least-connections across discovered nodes) would be the fastest path to something useful. More sophisticated distributed inference could come later.
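
As a rough illustration of that approach, a Caddyfile sketch of the upstream-pool idea; the node hostnames, port 8080, and the /health endpoint are placeholders, and the upstream list is static here rather than populated by discovery:

```
# Sketch only: node names, port, and health path are placeholders.
:80 {
    reverse_proxy node-a:8080 node-b:8080 node-c:8080 {
        # least_conn routes each request to the upstream with the fewest
        # active requests; round_robin is the other policy mentioned above
        lb_policy least_conn

        # mark a node unhealthy if its health endpoint stops responding
        health_uri /health
        health_interval 10s
    }
}
```

Feeding discovered nodes into that upstream list dynamically (for example by regenerating the config, or via Caddy's admin API) would be the piece that connects this to the discovery question above; a static list is enough to validate request-level load balancing.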

Status

Needs design work before implementation. Logging this here for future consideration.
