48 changes: 48 additions & 0 deletions gallery/index.yaml
@@ -1,4 +1,52 @@
---
- name: "step-3.7-flash"
  url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
  urls:
    - https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF
  description: |
    **[ModelPage]**: https://static.stepfun.com/blog/step-3.7-flash/

    ## 1. Introduction

    Step 3.7 Flash is a 198B-parameter sparse Mixture-of-Experts (MoE) vision-language model that combines a 196B-parameter language backbone with a 1.8B-parameter vision encoder for native image understanding. Engineered for high-frequency production workloads, it activates approximately 11B parameters per token and delivers a throughput of up to 400 tokens per second. Step 3.7 Flash supports a 256k context window and offers three selectable reasoning levels (low, medium, and high) so developers can easily balance speed, cost, and cognitive depth.

    We built Step 3.7 Flash for developers who need to scale agentic workflows that combine perception, search, and reasoning. It is designed to handle intensive tasks such as parsing massive financial reports in one pass, running multi-step search loops with cross-source verification, or operating concurrent coding agents in high-throughput pipelines.

    ## 2. Capabilities & Performance

    ### Multimodal Perception and Verification

    ...
  license: "apache-2.0"
  tags:
    - llm
    - gguf
    - vision
    - multimodal
    - reasoning
  icon: https://example.com/photo.jpg
  overrides:
    backend: llama-cpp
    function:
      automatic_tool_parsing_fallback: true
      grammar:
        disable: true
    known_usecases:
      - chat
    mmproj: llama-cpp/mmproj/Step-3.7-Flash-GGUF/mmproj-step3.7-flash-f16.gguf
    options:
      - use_jinja:true
    parameters:
      model: llama-cpp/models/Step-3.7-Flash-GGUF/Step-3.7.imatrix.gguf
    template:
      use_tokenizer_template: true
  files:
    - filename: llama-cpp/models/Step-3.7-Flash-GGUF/Step-3.7.imatrix.gguf
      sha256: 7f94ca213e4560d30b492b332128527c6808041ec3526df6c2816884eb107203
      uri: https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF/resolve/main/Step-3.7.imatrix.gguf
    - filename: llama-cpp/mmproj/Step-3.7-Flash-GGUF/mmproj-step3.7-flash-f16.gguf
      sha256: 5f25d11f92235c69682ca820af5f4cb125ae1142c8c33c018d0b3c9000a2ec1c
      uri: https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF/resolve/main/mmproj-step3.7-flash-f16.gguf
- name: "lfm2.5-8b-a1b"
  url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
  urls:
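
Reviewer note: each `files` entry in the `step-3.7-flash` block pairs a `filename` with a `sha256`, so a downloaded copy can be checked by hand. The following is a minimal sketch of such a check, not LocalAI's own verification code. It assumes the files have already been downloaded and sit under some local models directory at the relative `filename` paths from the entry; `models_dir`, `sha256_of`, and `verify_files` are hypothetical names introduced only for this illustration.

```python
import hashlib
from pathlib import Path

# Checksums copied verbatim from the `files` list of the step-3.7-flash entry.
EXPECTED_SHA256 = {
    "llama-cpp/models/Step-3.7-Flash-GGUF/Step-3.7.imatrix.gguf":
        "7f94ca213e4560d30b492b332128527c6808041ec3526df6c2816884eb107203",
    "llama-cpp/mmproj/Step-3.7-Flash-GGUF/mmproj-step3.7-flash-f16.gguf":
        "5f25d11f92235c69682ca820af5f4cb125ae1142c8c33c018d0b3c9000a2ec1c",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks.

    Chunked reads keep memory use flat, which matters for multi-gigabyte
    GGUF files.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(models_dir, expected=EXPECTED_SHA256):
    """Map each relative filename to True if it exists and its hash matches.

    `models_dir` is an assumption: the directory under which the relative
    `filename` paths from the gallery entry were stored.
    """
    results = {}
    for relative_name, wanted in expected.items():
        candidate = Path(models_dir) / relative_name
        results[relative_name] = (
            candidate.is_file() and sha256_of(candidate) == wanted
        )
    return results
```

Calling `verify_files("/path/to/models")` returns a dictionary keyed by filename; a `False` value means the file is missing or its digest differs from the one recorded in the gallery entry.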