Support Virtual-GenAI monitoring #13745
Open
peachisai wants to merge 23 commits into apache:master from peachisai:Support-GenAI-monitoring
+3,033 −10
23 commits
- e29c3a9 Support Virtual-GenAI monitoring (peachisai)
- 3642ce3 fix changes (peachisai)
- 37708a2 Merge branch 'master' into Support-GenAI-monitoring (wu-sheng)
- 0de57e7 Merge remote-tracking branch 'origin/prd' into Support-GenAI-monitoring (peachisai)
- 45d255f Merge branch 'master' into Support-GenAI-monitoring (wu-sheng)
- f15e579 Merge branch 'master' into Support-GenAI-monitoring (wu-sheng)
- d2c2165 fix some issues (peachisai)
- f417ea5 Merge branches 'Support-GenAI-monitoring' and 'Support-GenAI-monitori… (peachisai)
- 4f1ea70 Merge branch 'master' into Support-GenAI-monitoring (wu-sheng)
- ca9704e fix (peachisai)
- fab132c Merge branch 'Support-GenAI-monitoring' of github.com:peachisai/skywa… (peachisai)
- 38af222 Merge branch 'master' into Support-GenAI-monitoring (peachisai)
- 8e915bd fix some suggestions (peachisai)
- 30da5b4 fix some suggestions (peachisai)
- 3a0af32 Merge branch 'master' into Support-GenAI-monitoring (wu-sheng)
- 25394b0 fix some suggestions and add some default model pricing (peachisai)
- 68476df Merge branch 'Support-GenAI-monitoring' of github.com:peachisai/skywa… (peachisai)
- 549e8e4 Merge branch 'master' into Support-GenAI-monitoring (wu-sheng)
- ea7e330 fix (peachisai)
- 1f5d5ac fix (peachisai)
- 227fef4 fix (peachisai)
- a620889 fix (peachisai)
- 4679f01 Merge pull request #80 from peachisai/uat (peachisai)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| # Virtual GenAI | ||
|
|
||
| Virtual GenAI represents the Generative AI service nodes detected by [server agents' plugins](server-agents.md). The performance | ||
| metrics of the GenAI operations are from the GenAI client-side perspective. | ||
|
|
||
| For example, a Spring AI plugin in the Java agent could detect the latency of a chat completion request. | ||
| As a result, SkyWalking would show traffic, latency, success rate, token usage (input/output), and estimated cost in the GenAI dashboard. | ||
|
|
||
| ## Span Contract | ||
|
|
||
| The GenAI operation span should have the following properties: | ||
| - It is an **Exit** span | ||
| - **Span's layer == GENAI** | ||
| - Tag key = `gen_ai.provider.name`, value = The Generative AI provider, e.g. openai, anthropic, ollama | ||
| - Tag key = `gen_ai.response.model`, value = The name of the GenAI model, e.g. gpt-4o, claude-3-5-sonnet | ||
| - Tag key = `gen_ai.usage.input_tokens`, value = The number of tokens used in the GenAI input (prompt) | ||
| - Tag key = `gen_ai.usage.output_tokens`, value = The number of tokens used in the GenAI response (completion) | ||
| - Tag key = `gen_ai.server.time_to_first_token`, value = The duration in milliseconds until the first token is received (streaming requests only) | ||
| - If the GenAI service is a remote API (e.g. OpenAI), the span's peer should be the network address (IP or domain) of the GenAI server. | ||
|
|
||
| ## Provider Configuration | ||
|
|
||
| SkyWalking uses `gen-ai-config.yml` to map model names to providers and configure cost estimation. | ||
|
|
||
| When the `gen_ai.provider.name` tag is present in the span, it is used directly. Otherwise, SkyWalking matches the model name | ||
| against `prefix-match` rules to identify the provider. For example, a model name starting with `gpt` is mapped to `openai`. | ||
|
|
||
| To configure cost estimation, add `models` with pricing under the provider: | ||
|
|
||
|
|
||
| ```yaml | ||
| providers: | ||
| - provider: openai | ||
| prefix-match: | ||
| - gpt | ||
| models: | ||
| - name: gpt-4o | ||
| input-estimated-cost-per-m: 2.5 # estimated cost per 1,000,000 input tokens | ||
| output-estimated-cost-per-m: 10 # estimated cost per 1,000,000 output tokens | ||
| ``` | ||
|
|
||
| ## Metrics | ||
|
|
||
| The following metrics are available at the **provider** (service) level: | ||
| - `gen_ai_provider_cpm` - Calls per minute | ||
| - `gen_ai_provider_sla` - Success rate | ||
| - `gen_ai_provider_resp_time` - Average response time | ||
| - `gen_ai_provider_latency_percentile` - Latency percentiles | ||
| - `gen_ai_provider_input_tokens_sum / avg` - Input token usage | ||
| - `gen_ai_provider_output_tokens_sum / avg` - Output token usage | ||
| - `gen_ai_provider_total_estimated_cost / avg_estimated_cost` - Estimated cost | ||
|
|
||
| The following metrics are available at the **model** (service instance) level: | ||
| - `gen_ai_model_call_cpm` - Calls per minute | ||
| - `gen_ai_model_sla` - Success rate | ||
| - `gen_ai_model_latency_avg / percentile` - Latency | ||
| - `gen_ai_model_ttft_avg / percentile` - Time to first token (streaming only) | ||
| - `gen_ai_model_input_tokens_sum / avg` - Input token usage | ||
| - `gen_ai_model_output_tokens_sum / avg` - Output token usage | ||
| - `gen_ai_model_total_estimated_cost / avg_estimated_cost` - Estimated cost | ||
|
|
||
| ## Requirement | ||
| `SkyWalking Java Agent` version >= 9.7 | ||
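The provider lookup and cost estimation described in the document above can be sketched in plain Java. This is a minimal illustration, not the code from this PR: the class `GenAiCostSketch`, its methods, and the hard-coded rule table are hypothetical. The cost formula (tokens divided by 1,000,000, times the per-million price, summed over input and output) is an assumption inferred from the comments on `input-estimated-cost-per-m` and `output-estimated-cost-per-m`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GenAiCostSketch {
    // Hypothetical stand-in for the prefix-match rules in gen-ai-config.yml.
    private static final Map<String, String> PREFIX_TO_PROVIDER = new LinkedHashMap<>();

    static {
        PREFIX_TO_PROVIDER.put("gpt", "openai");
    }

    /** Uses the provider tag when present; otherwise falls back to prefix matching on the model name. */
    public static String resolveProvider(String providerTag, String modelName) {
        if (providerTag != null && !providerTag.isEmpty()) {
            return providerTag;
        }
        if (modelName != null) {
            for (Map.Entry<String, String> rule : PREFIX_TO_PROVIDER.entrySet()) {
                if (modelName.startsWith(rule.getKey())) {
                    return rule.getValue();
                }
            }
        }
        return null; // no rule matched
    }

    /** Assumed formula: per-million prices scaled by the token counts. */
    public static double estimateCost(long inputTokens, long outputTokens,
                                      double inputCostPerM, double outputCostPerM) {
        return inputTokens / 1_000_000.0 * inputCostPerM
            + outputTokens / 1_000_000.0 * outputCostPerM;
    }

    public static void main(String[] args) {
        System.out.println(resolveProvider(null, "gpt-4o"));            // prints "openai"
        System.out.println(estimateCost(1_000_000, 500_000, 2.5, 10));  // prints "7.5"
    }
}
```

With the `gpt-4o` pricing from the YAML example, one million input tokens and half a million output tokens would be estimated at 2.5 + 5.0 = 7.5 under this assumed formula.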
115 changes: 115 additions & 0 deletions
...ng/oap/server/analyzer/provider/trace/parser/listener/vservice/VirtualGenAIProcessor.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,115 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one or more | ||
| * contributor license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright ownership. | ||
| * The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.skywalking.oap.server.analyzer.provider.trace.parser.listener.vservice; | ||
|
|
||
| import lombok.RequiredArgsConstructor; | ||
| import org.apache.skywalking.apm.network.language.agent.v3.SegmentObject; | ||
| import org.apache.skywalking.apm.network.language.agent.v3.SpanLayer; | ||
| import org.apache.skywalking.apm.network.language.agent.v3.SpanObject; | ||
| import org.apache.skywalking.oap.analyzer.genai.service.IGenAIMeterAnalyzerService; | ||
| import org.apache.skywalking.oap.server.core.analysis.Layer; | ||
| import org.apache.skywalking.oap.server.core.config.NamingControl; | ||
| import org.apache.skywalking.oap.server.core.source.GenAIMetrics; | ||
| import org.apache.skywalking.oap.server.core.source.GenAIModelAccess; | ||
| import org.apache.skywalking.oap.server.core.source.GenAIProviderAccess; | ||
| import org.apache.skywalking.oap.server.core.source.ServiceInstance; | ||
| import org.apache.skywalking.oap.server.core.source.ServiceMeta; | ||
| import org.apache.skywalking.oap.server.core.source.Source; | ||
|
|
||
| import java.util.ArrayList; | ||
| import java.util.List; | ||
| import java.util.function.Consumer; | ||
|
|
||
| @RequiredArgsConstructor | ||
| public class VirtualGenAIProcessor implements VirtualServiceProcessor { | ||
|
|
||
| private final NamingControl namingControl; | ||
|
|
||
| private final IGenAIMeterAnalyzerService meterAnalyzerService; | ||
|
|
||
| private final List<Source> recordList = new ArrayList<>(); | ||
|
|
||
| @Override | ||
| public void prepareVSIfNecessary(SpanObject span, SegmentObject segmentObject) { | ||
| if (span.getSpanLayer() != SpanLayer.GenAI) { | ||
| return; | ||
| } | ||
|
|
||
| GenAIMetrics metrics = meterAnalyzerService.extractMetricsFromSWSpan(span, segmentObject); | ||
| if (metrics == null) { | ||
| return; | ||
| } | ||
|
|
||
| recordList.add(toServiceMeta(metrics)); | ||
|
Member:
You should do naming control here; you missed toService and toInstance due to recent code changes.
|
||
| recordList.add(toInstance(metrics)); | ||
| recordList.add(toProviderAccess(metrics)); | ||
| recordList.add(toModelAccess(metrics)); | ||
| } | ||
|
|
||
| private ServiceMeta toServiceMeta(GenAIMetrics metrics) { | ||
| ServiceMeta service = new ServiceMeta(); | ||
| service.setName(namingControl.formatServiceName(metrics.getProviderName())); | ||
| service.setLayer(Layer.VIRTUAL_GENAI); | ||
| service.setTimeBucket(metrics.getTimeBucket()); | ||
| return service; | ||
| } | ||
|
|
||
| private Source toInstance(GenAIMetrics metrics) { | ||
| ServiceInstance instance = new ServiceInstance(); | ||
| instance.setTimeBucket(metrics.getTimeBucket()); | ||
| instance.setName(namingControl.formatInstanceName(metrics.getModelName())); | ||
| instance.setServiceLayer(Layer.VIRTUAL_GENAI); | ||
| instance.setServiceName(metrics.getProviderName()); | ||
| return instance; | ||
| } | ||
|
|
||
| private GenAIProviderAccess toProviderAccess(GenAIMetrics metrics) { | ||
| GenAIProviderAccess source = new GenAIProviderAccess(); | ||
| source.setName(namingControl.formatServiceName(metrics.getProviderName())); | ||
| source.setInputTokens(metrics.getInputTokens()); | ||
| source.setOutputTokens(metrics.getOutputTokens()); | ||
| source.setTotalEstimatedCost(metrics.getTotalEstimatedCost()); | ||
| source.setLatency(metrics.getLatency()); | ||
| source.setStatus(metrics.isStatus()); | ||
| source.setTimeBucket(metrics.getTimeBucket()); | ||
| return source; | ||
| } | ||
|
|
||
| private GenAIModelAccess toModelAccess(GenAIMetrics metrics) { | ||
| GenAIModelAccess source = new GenAIModelAccess(); | ||
| source.setServiceName(namingControl.formatServiceName(metrics.getProviderName())); | ||
| source.setModelName(namingControl.formatInstanceName(metrics.getModelName())); | ||
| source.setInputTokens(metrics.getInputTokens()); | ||
| source.setOutputTokens(metrics.getOutputTokens()); | ||
| source.setTotalEstimatedCost(metrics.getTotalEstimatedCost()); | ||
| source.setTimeToFirstToken(metrics.getTimeToFirstToken()); | ||
| source.setLatency(metrics.getLatency()); | ||
| source.setStatus(metrics.isStatus()); | ||
| source.setTimeBucket(metrics.getTimeBucket()); | ||
| return source; | ||
| } | ||
|
|
||
| @Override | ||
| public void emitTo(Consumer<Source> consumer) { | ||
| for (Source source : recordList) { | ||
| if (source != null) { | ||
| consumer.accept(source); | ||
| } | ||
| } | ||
| } | ||
| } | ||
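The review comment in this file asks for naming control to be applied consistently. In the diff as shown, `toInstance` passes `metrics.getProviderName()` to `setServiceName` without formatting, while `toServiceMeta` and `toModelAccess` format the same value. The sketch below illustrates why that can matter, assuming naming control may change a name (here, by truncating it to a maximum length). `SimpleNamingControl` and its length limit are hypothetical stand-ins, not SkyWalking's `NamingControl`.

```java
public class NamingControlSketch {
    /** Hypothetical stand-in for naming control: truncates names longer than maxLength. */
    static final class SimpleNamingControl {
        private final int maxLength;

        SimpleNamingControl(int maxLength) {
            this.maxLength = maxLength;
        }

        String formatServiceName(String name) {
            return name.length() > maxLength ? name.substring(0, maxLength) : name;
        }
    }

    /** Returns true when the formatted and unformatted service names refer to the same service. */
    public static boolean namesAgree(String rawProviderName, int maxLength) {
        SimpleNamingControl namingControl = new SimpleNamingControl(maxLength);
        String fromServiceMeta = namingControl.formatServiceName(rawProviderName);
        String fromInstance = rawProviderName; // left unformatted, mirroring toInstance in the diff
        return fromServiceMeta.equals(fromInstance);
    }

    public static void main(String[] args) {
        System.out.println(namesAgree("openai", 70));                    // prints "true"
        System.out.println(namesAgree("a-very-long-provider-name", 10)); // prints "false"
    }
}
```

Under this assumption, short names are unaffected, but a name that naming control rewrites would leave the instance attached to a service name that no other source emits.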
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| <?xml version="1.0" encoding="UTF-8"?> | ||
| <!-- | ||
| ~ Licensed to the Apache Software Foundation (ASF) under one or more | ||
| ~ contributor license agreements. See the NOTICE file distributed with | ||
| ~ this work for additional information regarding copyright ownership. | ||
| ~ The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| ~ (the "License"); you may not use this file except in compliance with | ||
| ~ the License. You may obtain a copy of the License at | ||
| ~ | ||
| ~ http://www.apache.org/licenses/LICENSE-2.0 | ||
| ~ | ||
| ~ Unless required by applicable law or agreed to in writing, software | ||
| ~ distributed under the License is distributed on an "AS IS" BASIS, | ||
| ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| ~ See the License for the specific language governing permissions and | ||
| ~ limitations under the License. | ||
| ~ | ||
| --> | ||
|
|
||
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
| <parent> | ||
| <artifactId>analyzer</artifactId> | ||
| <groupId>org.apache.skywalking</groupId> | ||
| <version>${revision}</version> | ||
| </parent> | ||
| <modelVersion>4.0.0</modelVersion> | ||
|
|
||
| <artifactId>gen-ai-analyzer</artifactId> | ||
|
|
||
| <dependencies> | ||
| <dependency> | ||
| <groupId>org.apache.skywalking</groupId> | ||
| <artifactId>server-core</artifactId> | ||
| <version>${project.version}</version> | ||
| </dependency> | ||
| </dependencies> | ||
| </project> |
You need to update the demo to point to here. I think from Marketplace/General Service?
Not just this. menu.yml is not updated in the /docs/en