fix(backends/vastai): route /instances/ to /api/v1/ (v0 deprecated, returns HTTP 410) #3969

Open
evolv3ai wants to merge 1 commit into dstackai:master from evolv3ai:fix/vastai-v1-instances

@evolv3ai

Problem

VastAIAPIClient._url() in src/dstack/_internal/core/backends/vastai/api_client.py hardcodes https://console.vast.ai/api/v0 for all Vast API calls. Vast.ai has deprecated only the /api/v0/instances/ path family — the endpoint now responds with HTTP 410 deprecated_endpoint.

Reproduction (any current dstack release, including master and PyPI 0.20.24):

curl -sS -o /dev/null -w '%{http_code}\n' "https://console.vast.ai/api/v0/instances/?api_key=$VASTAI_API_KEY"
# -> 410
curl -sS -o /dev/null -w '%{http_code}\n' "https://console.vast.ai/api/v1/instances/?api_key=$VASTAI_API_KEY"
# -> 200

This breaks vastai backend registration: auth_test() calls get_instances(), the 410 is raised as an HTTPError, and dstack reports it to the operator as "Invalid credentials". The API key is fine; the URL is dead.
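
The misreport comes from treating any HTTP error during the credential check as an authentication failure. As a rough illustration only, the sketch below separates the cases by status code. The helper `classify_auth_probe` and its return labels are hypothetical and are not part of dstack or of this patch.

```python
from http import HTTPStatus


def classify_auth_probe(status_code: int) -> str:
    """Map the status of a credential-check request to a coarse outcome.

    Hypothetical helper for illustration; not part of dstack.
    """
    if 200 <= status_code < 300:
        return "ok"
    # Only 401/403 say anything about the API key itself.
    if status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        return "invalid_credentials"
    # 410 Gone means the endpoint was retired; the key may be perfectly valid.
    if status_code == HTTPStatus.GONE:
        return "endpoint_removed"
    return "unexpected_error"
```

With this classification, the 410 from /api/v0/instances/ would surface as a retired endpoint rather than as a bad key.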

/bundles/ and /asks/ still work on v0 (v1 is not yet published for them), so offer queries and instance creation are unaffected — only the /instances/ family needs the v1 prefix.

Fix

Route paths under /instances/ (covers get_instances, destroy_instance, request_logs) to /api/v1/. Leave /bundles/ and /asks/ on v0. The v1 instances response preserves the existing schema (success flag, instances list); it adds pagination fields the existing code ignores.
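
A standalone sketch of the routing described above. The prefix check mirrors the two lines added by this PR; the function name `build_url`, the constant `V0_BASE`, and the way the API key is appended are assumptions made for illustration and may differ from the real `_url()` helper.

```python
V0_BASE = "https://console.vast.ai/api/v0"


def build_url(path: str, api_key: str) -> str:
    """Return the full Vast.ai URL for `path`, choosing the API version by prefix.

    Illustrative stand-in for VastAIAPIClient._url(); the query-string
    layout here is an assumption.
    """
    base = V0_BASE
    # Only the /instances/ family moves to v1; /bundles/ and /asks/ stay on v0.
    if path.lstrip("/").startswith("instances/"):
        base = base.replace("/api/v0", "/api/v1")
    return f"{base}/{path.lstrip('/')}?api_key={api_key}"
```

Because the check is a prefix match, `/instances/`, `/instances/<id>/`, and `/instances/request_logs/<id>` all resolve to v1, while every other path keeps the v0 base.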

Verification

Applied to a self-hosted dstack 0.20.23 broker. Before: backend registration fails with "Invalid credentials". After patch + restart:

  • POST /api/project/<p>/backends/create_yaml → 200
  • Project backends list includes vastai
  • dstack offer --backend vastai --gpu A6000 returns 7+ live offers
  • dstack apply of a task config successfully provisions a Vast.ai instance through the patched code path (instance transitions provisioning → running cleanly)

Notes

  • Targeted patch — only the _url() helper changes. No new dependencies, no behavioral change for bundles/asks callers.
  • Bug exists on master and on every tagged release I checked (0.20.20 through 0.20.24).
  • If Vast publishes /api/v1/bundles/ and /api/v1/asks/ later, the same conditional shape will accept additional path prefixes cleanly.

Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com

Vast.ai deprecated /api/v0/instances/, returning HTTP 410
"deprecated_endpoint". dstack interprets that as "Invalid credentials"
during auth_test() and refuses to register the backend.

Only the /instances/ path family is deprecated; /bundles/ and /asks/
remain on v0 (v1 not yet published for them). The v1 instances response
preserves the existing schema (success flag, instances list); it adds
pagination fields the existing code ignores.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5dcaac677b

Comment on lines +142 to +143
if path.lstrip("/").startswith("instances/"):
    base = base.replace("/api/v0", "/api/v1")

P1: Keep non-list instance calls on v0

This prefix check sends every /instances/... URL to v1, including destroy_instance() and request_logs(). The current Vast docs only move the list endpoint to GET /api/v1/instances/; they still document destruction as DELETE /api/v0/instances/{id}/ and logs as PUT /api/v0/instances/request_logs/{id}. In VastAI runs, terminate_instance() relies on destroy_instance() and ignores a false return, so routing deletes to an unpublished v1 endpoint can leave paid instances running while also breaking log retrieval.
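
One way to express the narrower routing this comment asks for is to key on both the HTTP method and the exact path, so that only the list call moves. This is a sketch under that assumption: `base_for` and the two constants are hypothetical names, and the real `_url()` helper may not receive the HTTP method at all.

```python
V0_BASE = "https://console.vast.ai/api/v0"
V1_BASE = "https://console.vast.ai/api/v1"


def base_for(method: str, path: str) -> str:
    """Pick the API base, sending only the instance list call to v1.

    Hypothetical helper; DELETE and request_logs paths stay on v0.
    """
    normalized = path.strip("/")
    # GET /instances/ is the only call routed to v1 in this sketch.
    if method.upper() == "GET" and normalized == "instances":
        return V1_BASE
    return V0_BASE
```

Under this rule, `DELETE /instances/{id}/` and `PUT /instances/request_logs/{id}` continue to hit v0.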

Comment on lines +142 to +143
if path.lstrip("/").startswith("instances/"):
    base = base.replace("/api/v0", "/api/v1")

P2: Page through v1 instance results

Routing get_instances() to v1 changes the response from the old unpaginated list to Vast's paginated endpoint: the docs state that limit defaults to 25 and is capped at 25, and that next_token is provided for fetching additional pages. Since the existing client caches only data["instances"], accounts with more than 25 matching instances can have a dstack instance omitted, causing get_instance() to return None and update_provisioning_data() to stop discovering SSH/status data for that run.
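
A minimal sketch of following `next_token` until the listing is exhausted. The transport is abstracted behind a caller-supplied `fetch_page` callable because the exact request parameter used to pass the token back is not stated here; `collect_instances` and `fetch_page` are hypothetical names, and the response keys `instances` and `next_token` are taken from the description above.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


def collect_instances(
    fetch_page: Callable[[Optional[str]], Page],
) -> List[Dict[str, Any]]:
    """Gather instances from every page by following `next_token`.

    `fetch_page(token)` is expected to return one decoded response page;
    it is called with None for the first page.
    """
    instances: List[Dict[str, Any]] = []
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        instances.extend(page.get("instances", []))
        token = page.get("next_token")
        # A missing or empty token marks the last page.
        if not token:
            return instances
```

Caching the combined list rather than a single page would keep get_instance() able to find instances beyond the first 25.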
