Graceful failure when extensions fail to start: suppress verbose errors and surface update hints #6832

Draft
Copilot wants to merge 3 commits into main from copilot/improve-extension-error-messages

Contributor

Copilot AI commented Feb 20, 2026

When an extension fails to start, azd was dumping the extension's full stdout/stderr to the console—including internal gRPC logs, timestamps, and EOF traces—producing noise irrelevant to the user. There was also no guidance that an outdated extension might be the root cause.

Changes

  • cmd/middleware/extensions.go
    • Suppress verbose invocation errors from console: Extension process stdout/stderr on failure is now routed to log.Printf only, keeping it available for debug inspection (AZD_DEBUG) without polluting normal output.
    • Differentiate timeout vs real startup errors: Introduced an extensionStartFailure struct that pairs each extension with a timedOut bool (set via errors.Is(err, context.DeadlineExceeded)). The failure message now says "failed to initialize" generically; the AZD_EXT_TIMEOUT hint is shown only when at least one failure was actually a context.DeadlineExceeded error.
    • Add update check on startup failure: After extensions reach a terminal state, checkUpdatesForExtensions consults the already-cached registry data for failed extensions. Any extension with an available upgrade emits a ux.WarningMessage with actionable upgrade instructions before the generic startup-failure block. No network requests are made in the failure path—if the cache for a source is expired or missing, that extension is simply skipped.

Before (on extension startup failure):

extension 'spboyer.azd.doctor' run failed: exit code: 1, stdout: Error: failed to register predeploy handler: EOF
, stderr: 2026/02/09 16:41:56 [spboyer.azd.doctor] Registered handler for MessageType=*azdext.InvokeProjectHandler
2026/02/09 16:41:56 [spboyer.azd.doctor] [RequestId=] ERROR: Failed to send fire-and-forget message, ...

WARNING: Extension startup failures detected
The following extensions failed to initialize within the timeout period:
  - azd doctor (spboyer.azd.doctor)

Some features may be unavailable. Increase timeout with AZD_EXT_TIMEOUT=<seconds> if needed.

After — timeout failure with a newer version available:

WARNING: A new version of extension 'azd doctor' is available: 0.1.0 -> 0.2.0
  • To upgrade: azd extension upgrade spboyer.azd.doctor
  • To upgrade all: azd extension upgrade --all

WARNING: Extension startup failures detected
The following extensions failed to initialize:
  - azd doctor (spboyer.azd.doctor)

Some features may be unavailable. Increase timeout with AZD_EXT_TIMEOUT=<seconds> if needed.

After — real startup error (e.g. invalid binary, permission denied):

WARNING: Extension startup failures detected
The following extensions failed to initialize:
  - azd doctor (spboyer.azd.doctor)

Some features may be unavailable.
Original prompt

This section details the original issue you should resolve

<issue_title>Graceful failure when extensions are not updated</issue_title>
<issue_description>Opportunity to improve extension installation error messages.

Describe the bug
When extensions (including test extensions) aren't updated, the tool currently fails with verbose errors that aren't helpful or relevant, particularly when the extension isn't related to the current task. The messaging is noisy and doesn't guide the user to resolution.

It feels like a tidy task that's "in the way" unless it's needed for what I'm trying to accomplish (which in this case with azd up the updates aren't needed).

Image

To Reproduce

  • Attempt to use Azure Developer CLI features when certain extensions are out-of-date (especially test extension).
  • Observe the verbose failure output.

Expected behavior
Instead of verbose errors, the tool should fail gracefully with clear, plain-language guidance. Error messaging should briefly explain that an extension needs updating and provide instructions/link on how to update.

Additional context
Current error output is distracting and not relevant to the user flow. The experience could be improved by minimizing noise and offering actionable steps.
</issue_description>

<agent_instructions>1. Add a check for extensions to see if the user installed version is the latest version. If not, print a warning message like what core azd cli does to ask the user to update the extensions.
2. Check the error call stack as shown in the screenshot. Do not expose a long call stack to users in error message. </agent_instructions>

Comments on the Issue (you are @copilot in this section)



…tensions

Co-authored-by: hemarina <104857065+hemarina@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Improve error messaging for outdated extensions" to "Graceful failure when extensions fail to start: suppress verbose errors and surface update hints" on Feb 20, 2026
Copilot AI requested a review from hemarina February 20, 2026 22:59
Member

@spboyer spboyer left a comment

Two concerns:

  1. Non-timeout startup errors are masked as timeout failures (High)
    The fallback warning always says extensions failed to initialize within the timeout period even for real startup errors (invalid binary, permission denied, bad args). WaitUntilReady returns the original failure error, not just timeout, but the message hardcodes timeout wording for all failures. Consider conditionally branching the messaging on context.DeadlineExceeded vs other error types.

  2. Synchronous network work in failure path (Medium)
    On extension startup failure, the middleware now performs update checks (including source refresh via FindExtensions which can issue HTTP requests) before continuing to next(ctx). This adds blocking I/O in a failure path that should be fast. Consider making update checks async with a short timeout, or skip remote refresh entirely during startup failure handling.

@spboyer
Member

spboyer commented Feb 22, 2026

@copilot address feedback

…rk I/O from failure path

Co-authored-by: spboyer <7681382+spboyer@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 22, 2026

Two concerns:

  1. Non-timeout startup errors are masked as timeout failures (High)...

Both issues addressed in 19b8566:

  1. Timeout vs real startup errors: Added an extensionStartFailure struct that pairs each extension with a timedOut bool (set via errors.Is(err, context.DeadlineExceeded)). The intro line now says "failed to initialize" generically, and the AZD_EXT_TIMEOUT hint is shown only when at least one failure was actually a context.DeadlineExceeded error.

  2. Network I/O in failure path: checkUpdatesForExtensions no longer refreshes the registry cache — it only consults already-cached data. Extensions whose source cache is expired or missing are skipped entirely, so no HTTP requests are made in the failure path.

Copilot AI requested a review from spboyer February 22, 2026 15:24

Development

Successfully merging this pull request may close these issues.

Graceful failure when extensions are not updated

3 participants