Eval bug: Segfault at slot initialization with CUDA on SM75 (Turing) — ngl > 0

### Name and Version

./llama-server --version
version: 9867 (e72710983)
built with GNU 14.2.0 for Linux x86_64

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

llama-server

### Command line

```shell
./llama-server --model model.gguf --port 8081 -ngl 32 -c 8192 --flash-attn on -ctk turbo4 -ctv turbo3 -t 6
```

### Problem description & steps to reproduce

- CPU: Intel i7
- GPU: NVIDIA Quadro T1000 Max-Q (SM75 / Turing, 4 GB)
- CUDA: 12.8

The server segfaults immediately after initializing slots whenever any layers are offloaded to CUDA (`-ngl` > 0) on SM75 (Turing) hardware. A CPU-only run (`-ngl 0`) loads and serves correctly.

Observations:

- Reproducible on both MTP and non-MTP models.
- Occurs with and without `--spec-type mtp`.
- Occurs with and without `--no-warmup`.
- The warning `fused Gated Delta Net (chunked) not supported, set to disabled` appears consistently before the crash.

Model: Qwopus3.5-4B-v3-MTP Q5_K_M GGUF
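
To make it easier to compare configurations (for example `-ngl 0` against `-ngl 32`, or with and without `--no-warmup`), a small wrapper can record how each run ends. This is only a sketch, assuming a POSIX shell; `describe_exit` and `run_case` are hypothetical helper names and are not part of llama.cpp.

```shell
#!/bin/sh
# Hypothetical helpers for a repro matrix; not part of llama.cpp.

# Translate a shell exit status into a short description.
# Common shells (bash, dash) report a process killed by signal N
# as exit status 128 + N.
describe_exit() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "ok"
    elif [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
    else
        echo "exit code $code"
    fi
}

# Run one command line (its output is discarded) and print how it ended.
run_case() {
    "$@" >/dev/null 2>&1
    describe_exit "$?"
}
```

A possible use is `run_case timeout 60 ./llama-server --model model.gguf --port 8081 -ngl 32 -c 8192`. SIGSEGV is signal 11 on Linux, so a crashing run should print `killed by signal 11`. A healthy server keeps running until GNU `timeout` stops it, which is reported as `exit code 124`. For a backtrace of the crashing run, something like `gdb -batch -ex run -ex bt --args ./llama-server --model model.gguf --port 8081 -ngl 32 -c 8192` may help.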

### First Bad Commit

_No response_

### Relevant log output

<details>
<summary>Logs</summary>


Last log lines before the segfault:

```console
W sched_reserve: layer 0 is assigned to device CPU but the fused Gated Delta Net tensor is assigned to device CUDA0
W sched_reserve: fused Gated Delta Net (chunked) not supported, set to disabled
I srv load_model: initializing slots, n_slots = 4
[segfault]
```
</details>
