1 change: 1 addition & 0 deletions .gitignore
@@ -13,3 +13,4 @@
 
 .claude
 CLAUDE.md
+.omx/
4 changes: 1 addition & 3 deletions docs/en/dify/install.mdx
@@ -161,9 +161,7 @@ ingress:
 - <dify.example.com>
 ```
 
-<a id="storage-s3-and-pvc"></a>
-
-### Storage (S3 and PVC)
+### Storage (S3 and PVC) \{#storage-s3-and-pvc}
 
 **PVC (default):** API and plugin daemon each use a PVC when enabled. Override storage class and size as needed.
 
12 changes: 3 additions & 9 deletions docs/en/installation/ai-cluster.mdx
@@ -16,9 +16,7 @@ If your use case requires `Knative` functionality, which enables advanced featur
 [Recommended deployment option](https://kserve.github.io/website/docs/admin-guide/overview#generative-inference): For generative inference workloads, the **Standard** approach (previously known as RawKubernetes Deployment) is recommended as it provides the most control over resource allocation and scaling.
 :::
 
-<a id="downloading"></a>
-
-## Downloading
+## Downloading \{#downloading}
 
 **Operator Components**:
 
@@ -38,9 +36,7 @@ If your use case requires `Knative` functionality, which enables advanced featur
 You can download the app named 'Alauda AI' and 'Knative Operator' from the Marketplace on the Customer Portal website.
 :::
 
-<a id="uploading"></a>
-
-## Uploading
+## Uploading \{#uploading}
 
 We need to upload both `Alauda AI` and `Knative Operator` to the cluster where Alauda AI is to be used.
 
@@ -163,9 +159,7 @@ Confirm that the **Alauda AI** tile shows one of the following states:
 
 For detailed installation steps, see [Install KServe](../kserve/install.mdx) in Alauda Build of KServe.
 
-<a id="enabling-knative-functionality"></a>
-
-## Enabling Knative Functionality
+## Enabling Knative Functionality \{#enabling-knative-functionality}
 
 Knative functionality is an optional capability that requires an additional operator and instance to be deployed.
20 changes: 5 additions & 15 deletions docs/en/kserve/install.mdx
@@ -26,9 +26,7 @@ Before installing **Alauda Build of KServe**, you need to ensure the following d
 1. **Required Dependencies**: All required dependencies must be installed before installing Alauda Build of KServe.
 2. **GIE Integration**: GIE is bundled and enabled by default. If your environment already has GIE installed separately, set `gie.builtIn` to `false` in the operator configuration to disable the built-in installation.
 
-<a id="upload-operator"></a>
-
-## Upload Operator
+## Upload Operator \{#upload-operator}
 
 Download the Alauda Build of KServe Operator installation file (e.g., `kserve-operator.ALL.xxxx.tgz`).
 
@@ -137,9 +135,7 @@ kubectl get kserve default-kserve -n kserve-operator
 
 The instance is ready when the status shows `DEPLOYED: True`.
 
-<a id="envoy-gateway-configuration"></a>
-
-### Envoy Gateway Configuration
+### Envoy Gateway Configuration \{#envoy-gateway-configuration}
 
 | Field | Description | Default |
 |-------|-------------|---------|
@@ -148,18 +144,14 @@ The instance is ready when the status shows `DEPLOYED: True`.
 | `preset.envoy_gateway.create_instance` | Create an Envoy Gateway instance to manage inference traffic with bundled extensions. | `true` |
 | `preset.envoy_gateway.instance_name` | Name of the Envoy Gateway instance to create. | `aieg` |
 
-<a id="envoy-ai-gateway-configuration"></a>
-
-### Envoy AI Gateway Configuration
+### Envoy AI Gateway Configuration \{#envoy-ai-gateway-configuration}
 
 | Field | Description | Default |
 |-------|-------------|---------|
 | `preset.envoy_ai_gateway.service` | Kubernetes service name for Envoy AI Gateway. | `ai-gateway-controller` |
 | `preset.envoy_ai_gateway.port` | Port number used by Envoy AI Gateway. | `1063` |
 
-<a id="kserve-gateway-configuration"></a>
-
-### KServe Gateway Configuration
+### KServe Gateway Configuration \{#kserve-gateway-configuration}
 
 | Field | Description | Default |
 |-------|-------------|---------|
@@ -169,9 +161,7 @@ The instance is ready when the status shows `DEPLOYED: True`.
 | `preset.kserve_gateway.gateway_class` | Optional custom GatewayClass name. If empty, derived as `{namespace}-{name}`. | `""` |
 | `preset.kserve_gateway.port` | Port number used by the KServe Gateway. | `80` |
 
-<a id="gie-gateway-api-inference-extension-configuration"></a>
-
-### GIE (gateway-api-inference-extension) Configuration
+### GIE (gateway-api-inference-extension) Configuration \{#gie-gateway-api-inference-extension-configuration}
 
 | Field | Description | Default |
 |-------|-------------|---------|
4 changes: 1 addition & 3 deletions docs/en/label_studio/install.mdx
@@ -185,9 +185,7 @@ redirectURIs:
 
 ### 4. Configure User Management
 
-<a id="41-disable-user-registration"></a>
-
-#### 4.1 Disable User Registration
+#### 4.1 Disable User Registration \{#41-disable-user-registration}
 
 User registration can be disabled by setting the following fields:
 
4 changes: 1 addition & 3 deletions docs/en/llama_stack/quickstart.mdx
@@ -30,9 +30,7 @@ The notebook demonstrates:
 
 ## FAQ
 
-<a id="how-to-prepare-python-312-in-notebook"></a>
-
-### How to prepare Python 3.12 in Notebook
+### How to prepare Python 3.12 in Notebook \{#how-to-prepare-python-312-in-notebook}
 
 1. Download the pre-compiled Python installation package:
 
@@ -57,9 +57,7 @@ The core definition of the inference service feature is to deploy trained machin
 - Automatically generates Swagger documentation to facilitate user integration and invocation of inference services.
 - Provides real-time monitoring and alarm features to ensure stable service operation.
 
-<a id="create-inference-service"></a>
-
-## Create inference service
+## Create inference service \{#create-inference-service}
 
 <Steps>
 
@@ -213,9 +213,7 @@ Multi-node EP deployments require additional distributed runtime and networking
 This page focuses on the single-node configuration pattern. If you need multi-node EP, refer to the official vLLM guide and adapt the deployment model to your cluster topology and runtime environment.
 :::
 
-<a id="references"></a>
-
-## References
+## References \{#references}
 
 - [Expert Parallel Deployment - vLLM](https://docs.vllm.ai/en/latest/serving/expert_parallel_deployment/)
 - [Data Parallel Deployment - vLLM](https://docs.vllm.ai/en/latest/serving/data_parallel_deployment/)
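Every hunk in this PR applies the same mechanical rewrite: a standalone `<a id="…"></a>` anchor and the blank line after it are folded into the following Markdown heading as an MDX explicit heading ID (`\{#…}`, with the brace escaped because `{` opens an expression in MDX). A minimal sketch of a script that could perform such a migration; the `inline_heading_ids` helper and its regex are illustrative, not part of the PR:

```python
import re

# Matches the pattern removed by the diff:
#   <a id="some-id"></a>
#   (blank line)
#   ## Heading text
ANCHOR_HEADING = re.compile(
    r'<a id="(?P<id>[^"]+)"></a>\n\n(?P<hashes>#{1,6}) (?P<title>[^\n]+)'
)

def inline_heading_ids(text: str) -> str:
    """Rewrite standalone HTML anchors into MDX explicit heading IDs."""
    return ANCHOR_HEADING.sub(
        # Emit e.g. "## Heading text \{#some-id}"; a lambda avoids
        # backslash-escaping pitfalls in a string replacement template.
        lambda m: f"{m.group('hashes')} {m.group('title')} \\{{#{m.group('id')}}}",
        text,
    )
```

Applied to the "References" hunk above, `inline_heading_ids('<a id="references"></a>\n\n## References\n')` yields `'## References \{#references}\n'`, matching the committed change.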