[Bug] Segfault for sd-server on musl #1454

@atomic-carpenter

Description

Git commit

c97702e

Operating System & Version

Alpine Linux 3.23

GGML backends

Vulkan

Command-line arguments used

sd-server -m juggernautXL_version2.safetensors

Steps to reproduce

  • Run server
  • Generate any image

What you expected to happen

Image generation

What actually happened

Segfault

Logs / error messages / stack trace

The stack trace has 1887 lines of:

No symbol table info available.
#1878 0x0000555555bc58cb in ggml_visit_parents_graph ()

The default thread stack size on musl is 128 KB, which is unlikely to fit this many recursive calls. Did the stack run out?
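To put numbers on that suspicion, here is a minimal sketch that reads the stack size reported by a default-initialized `pthread_attr_t` and divides it by a recursion depth. The helper names `default_thread_stack_size` and `bytes_per_frame` are made up for illustration and are not part of sd-server or ggml. It also assumes the value reported by a freshly initialized attribute is the one actually used for new threads, which is how musl and glibc behave in practice.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical helper: returns the stack size reported by a
// default-initialized thread attribute object, or 0 on failure.
std::size_t default_thread_stack_size() {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) {
        return 0;
    }
    std::size_t size = 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0) {
        size = 0;
    }
    pthread_attr_destroy(&attr);
    return size;
}

// Hypothetical helper: average number of bytes each frame may use if
// `frames` nested calls have to fit into `stack_bytes`.
std::size_t bytes_per_frame(std::size_t stack_bytes, std::size_t frames) {
    return frames == 0 ? stack_bytes : stack_bytes / frames;
}
```

With the figures from this report, `bytes_per_frame(128 * 1024, 1887)` is 69, so each recursive `ggml_visit_parents_graph` call would need to stay under roughly 69 bytes of stack for the whole recursion to fit. That is before counting the frames below it in the trace.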

#1888 0x00005555557f4f77 in std::_Function_handler<ggml_cgraph* (), UNetModelRunner::compute(int, sd::Tensor<float> const&, sd::Tensor<float> const&, sd::Tensor<float> const&, sd::Tensor<float> const&, sd::Tensor<float> const&, int, std::vector<sd::Tensor<float>, std::allocator<sd::Tensor<float> > > const&, float)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
No symbol table info available.
#1889 0x000055555575d680 in std::optional<sd::Tensor<float> > GGMLRunner::compute<float>(std::function<ggml_cgraph* ()>, int, bool, bool) ()
No symbol table info available.
#1890 0x000055555575e52d in UNetModel::compute(int, DiffusionParams const&) ()
No symbol table info available.
#1891 0x0000555555747940 in StableDiffusionGGML::sample(std::shared_ptr<DiffusionModel> const&, bool, sd::Tensor<float> const&, sd::Tensor<float>, SDCondition const&, SDCondition const&, SDCondition const&, SDCondition const&, sd::Tensor<float> const&, float, sd_guidance_params_t const&, float, int, sample_method_t, bool, std::vector<float, std::allocator<float> > const&, int, std::vector<sd::Tensor<float>, std::allocator<sd::Tensor<float> > > const&, bool, sd::Tensor<float> const&, sd::Tensor<float> const&, float, sd_cache_params_t const*)::{lambda(sd::Tensor<float> const&, float, int)#1}::operator()(sd::Tensor<float> const&, float, int) const::{lambda(SDCondition const&, sd::Tensor<float> const*, std::vector<int, std::allocator<int> > const*)#1}::operator()(SDCondition const&, sd::Tensor<float> const*, std::vector<int, std::allocator<int> > const*) const ()
No symbol table info available.
#1892 0x00005555557a8755 in StableDiffusionGGML::sample(std::shared_ptr<DiffusionModel> const&, bool, sd::Tensor<float> const&, sd::Tensor<float>, SDCondition const&, SDCondition const&, SDCondition const&, SDCondition const&, sd::Tensor<float> const&, float, sd_guidance_params_t const&, float, int, sample_method_t, bool, std::vector<float, std::allocator<float> > const&, int, std::vector<sd::Tensor<float>, std::allocator<sd::Tensor<float> > > const&, bool, sd::Tensor<float> const&, sd::Tensor<float> const&, float, sd_cache_params_t const*)::{lambda(sd::Tensor<float> const&, float, int)#1}::operator()(sd::Tensor<float> const&, float, int) const ()
No symbol table info available.
#1893 0x00005555557aa4f8 in std::_Function_handler<sd::Tensor<float> (sd::Tensor<float> const&, float, int), StableDiffusionGGML::sample(std::shared_ptr<DiffusionModel> const&, bool, sd::Tensor<float> const&, sd::Tensor<float>, SDCondition const&, SDCondition const&, SDCondition const&, SDCondition const&, sd::Tensor<float> const&, float, sd_guidance_params_t const&, float, int, sample_method_t, bool, std::vector<float, std::allocator<float> > const&, int, std::vector<sd::Tensor<float>, std::allocator<sd::Tensor<float> > > const&, bool, sd::Tensor<float> const&, sd::Tensor<float> const&, float, sd_cache_params_t const*)::{lambda(sd::Tensor<float> const&, float, int)#1}>::_M_invoke(std::_Any_data const&, sd::Tensor<float> const&, float&&, int&&) ()
No symbol table info available.
#1894 0x0000555555795a37 in sample_euler_ancestral(std::function<sd::Tensor<float> (sd::Tensor<float> const&, float, int)>, sd::Tensor<float>, std::vector<float, std::allocator<float> > const&, std::shared_ptr<RNG>, float) ()
No symbol table info available.
#1895 0x00005555557a633b in StableDiffusionGGML::sample(std::shared_ptr<DiffusionModel> const&, bool, sd::Tensor<float> const&, sd::Tensor<float>, SDCondition const&, SDCondition const&, SDCondition const&, SDCondition const&, sd::Tensor<float> const&, float, sd_guidance_params_t const&, float, int, sample_method_t, bool, std::vector<float, std::allocator<float> > const&, int, std::vector<sd::Tensor<float>, std::allocator<sd::Tensor<float> > > const&, bool, sd::Tensor<float> const&, sd::Tensor<float> const&, float, sd_cache_params_t const*) ()
No symbol table info available.
#1896 0x0000555555733708 in generate_image ()
No symbol table info available.
#1897 0x00005555556c758d in execute_img_gen_job(ServerRuntime&, AsyncGenerationJob&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ()
No symbol table info available.
#1898 0x00005555556c9b77 in async_job_worker(ServerRuntime&) ()
No symbol table info available.
#1899 0x00007ffff7afa680 in ?? () from /lib/libstdc++.so.6
No symbol table info available.
#1900 0x00007ffff7fbef6d in start (p=<optimized out>) at src/thread/pthread_create.c:207
        args = <optimized out>
        state = <optimized out>
#1901 0x00007ffff7fc08c0 in __clone () at src/thread/x86_64/clone.s:22

The stack size for the worker thread should be much larger. This can be set with pthread_attr_setstacksize.
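A minimal sketch of that suggestion follows. It assumes the worker is started through the pthread API directly: frame #1899 suggests it is currently launched via libstdc++, probably `std::thread`, which has no standard way to request a stack size. `run_on_big_stack` and `demo_worker` are hypothetical names, and the 8 MiB used in the usage note is only an example value, not a measured requirement.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical helper: runs fn(arg) on a new thread whose stack is
// stack_bytes large, then joins it. Returns 0 on success, otherwise the
// error code of the first pthread call that failed.
int run_on_big_stack(void* (*fn)(void*), void* arg, std::size_t stack_bytes) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }

    // Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN.
    rc = pthread_attr_setstacksize(&attr, stack_bytes);

    pthread_t tid;
    if (rc == 0) {
        rc = pthread_create(&tid, &attr, fn, arg);
    }
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        return rc;
    }
    return pthread_join(tid, nullptr);
}

// Hypothetical stand-in for the real worker body (async_job_worker in the
// trace); it only records that it ran.
void* demo_worker(void* arg) {
    *static_cast<bool*>(arg) = true;
    return nullptr;
}
```

For example, `bool ran = false; run_on_big_stack(demo_worker, &ran, 8u * 1024 * 1024);` starts the stand-in worker with an 8 MiB stack. A real fix would pass the server's worker function and its state instead, and would likely detach or keep the thread handle rather than join immediately.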

Additional context / environment details

No response

Labels: bug