Skip to content

{Misc.} Fix help % token handling for Python 3.14 argparse compatibility#300

Open
YangAn-microsoft wants to merge 1 commit into
microsoft:devfrom
YangAn-microsoft:fix/help-token-expansion-v2
Open

{Misc.} Fix help % token handling for Python 3.14 argparse compatibility#300
YangAn-microsoft wants to merge 1 commit into
microsoft:devfrom
YangAn-microsoft:fix/help-token-expansion-v2

Conversation

@YangAn-microsoft
Copy link
Copy Markdown
Collaborator

@YangAn-microsoft YangAn-microsoft commented Apr 29, 2026

Summary

This PR fixes Knack help rendering for %-based help strings to keep behavior stable with Python 3.14 argparse validation. Replaces #299 and optionally link any key discussion from #299.

Why this is needed (Python 3.14)

Python 3.14 argparse performs stricter help-format validation.
Help text containing unescaped literal % can raise formatting errors during help processing.
Since Knack renders help through custom paths, we must handle these cases safely while preserving argparse placeholder semantics.

What changed

  1. Added safe % sanitization for argparse-facing help text.
  2. Expanded argparse placeholders in Knack-rendered help output.
  3. Added safe fallback behavior so malformed/literal % does not crash help rendering.
  4. Added/updated tests for:
    • %(default)s
    • %(prog)s
    • %%
    • single % fallback behavior

Resulting behavior

  1. %(default)s expands to the argument default value.
  2. %(prog)s expands to the parser program string.
  3. %% is rendered as %.
  4. single % does not crash and is shown as %.

Validation

  1. Targeted token tests passed.
  2. Full Knack test suite passed on Python 3.14 (245 passed).
  3. Azure CLI help-path checks with local Knack changes completed without parser/help crashes.

Scope

This is a focused compatibility and robustness fix for help rendering behavior under Python 3.14.

@YangAn-microsoft YangAn-microsoft marked this pull request as ready for review April 29, 2026 03:21
@YangAn-microsoft YangAn-microsoft changed the title Fix help % token handling for Python 3.14 argparse compatibility {Misc.} Fix help % token handling for Python 3.14 argparse compatibility Apr 29, 2026
@YangAn-microsoft
Copy link
Copy Markdown
Collaborator Author

YangAn-microsoft commented May 6, 2026

/azp run

@azure-pipelines
Copy link
Copy Markdown

Commenter does not have sufficient privileges for PR 300 in repo microsoft/knack

@azure-pipelines
Copy link
Copy Markdown

Command 'ru' is not supported by Azure Pipelines.

Supported commands
  • help:
    • Get descriptions, examples and documentation about supported commands
    • Example: help "command_name"
  • list:
    • List all pipelines for this repository using a comment.
    • Example: "list"
  • run:
    • Run all pipelines or specific pipelines for this repository using a comment. Use this command by itself to trigger all related pipelines, or specify specific pipelines to run.
    • Example: "run" or "run pipeline_name, pipeline_name, pipeline_name"
  • where:
    • Report back the Azure DevOps orgs that are related to this repository and org
    • Example: "where"

See additional documentation.

@azure-pipelines
Copy link
Copy Markdown

Commenter does not have sufficient privileges for PR 300 in repo microsoft/knack

@bebound
Copy link
Copy Markdown
Collaborator

bebound commented May 6, 2026

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Comment thread knack/parser.py
class CLICommandParser(argparse.ArgumentParser):

@staticmethod
def _sanitize_help_for_argparse(help_text):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is a little bit too complex. A regular expression can do the job. AI response:

Yes — the whole _sanitize_help_for_argparse loop can be replaced by a single re.sub with a negative lookahead. The idea: match any % that is not followed by another % (already-escaped) and not followed by a %(name)x mapping placeholder, then replace it with %%.

import re

# Matches a single '%' that is NOT:
#   - followed by another '%'  (already escaped: %%)
#   - the start of a mapping placeholder like %(default)s, %(prog)r, etc.
_LITERAL_PERCENT_RE = re.compile(r'%(?!%|\([^()]+\)[A-Za-z])')

@staticmethod
def _sanitize_help_for_argparse(help_text):
    if not isinstance(help_text, str) or '%' not in help_text:
        return help_text
    return _LITERAL_PERCENT_RE.sub('%%', help_text)

How the lookahead reads:

  • %% → the % is followed by %, lookahead fails the "not" condition → left alone (the second % is then consumed on its own iteration; since it's followed by whatever came after %%, it usually matches and… wait — see caveat below).
  • %(default)s → the % is followed by (default)s, matches \([^()]+\)[A-Za-z] → left alone.
  • %Y, %, % , 100% → no match in lookahead → replaced with %%.

Important caveat about %%

A naive single-pass re.sub walks left-to-right and doesn't re-scan its own replacement, but it does advance past the matched % only. With the lookahead (?!%), the first % in %% is skipped, then the regex engine moves to the second %. That second % is followed by whatever comes next (e.g. end of string, a space, Y, etc.), so it would get doubled into %%, turning %% into %%%. ❌

To handle this correctly with one regex, match the %% pair explicitly and keep it, e.g. using an alternation + a callback:

_PERCENT_RE = re.compile(r'%%|%\([^()]+\)[A-Za-z]|%')

def _sanitize_help_for_argparse(help_text):
    if not isinstance(help_text, str) or '%' not in help_text:
        return help_text
    # Keep %% and %(name)x as-is; escape any other lone %.
    return _PERCENT_RE.sub(lambda m: m.group(0) if len(m.group(0)) > 1 else '%%', help_text)

This is equivalent in behavior to the hand-written loop in the PR but ~6 lines instead of ~40, and arguably easier to audit. The alternation order matters: %% and %(...)x are tried before the bare % fallback, so they "win" and are returned unchanged; only the lone % branch gets escaped.

If you'd like, I can open a follow-up PR replacing the loop with this regex version (plus keeping the existing tests, which already cover %(default)s, 100%%, and %Y-%m-%d).

Test result:

import re

_PERCENT_RE = re.compile(r'%%|%\([^()]+\)[A-Za-z]|%')

def _sanitize_help_for_argparse(help_text):
    if not isinstance(help_text, str) or '%' not in help_text:
        return help_text
    # Keep %% and %(name)x as-is; escape any other lone %.
    return _PERCENT_RE.sub(lambda m: m.group(0) if len(m.group(0)) > 1 else '%%', help_text)

print(_sanitize_help_for_argparse("A string with single %, double %%, and placeholders %(prog)s. Test test."))

Output:

A string with single %%, double %%, and placeholders %(prog)s. Test test.

Copy link
Copy Markdown
Member

@jiasli jiasli May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, even this level of complexity is not necessary, as pointed out by #299 (comment), the sanitization can be achieved as below:

  1. If there is %( or %% in the help string, leave the string as is, because this indicates the help string author is aware of the escaping rule.
  2. If not, escape % to %%, like AWS CLI did (Add support for Python 3.14 aws/aws-cli#9790).

It is not necessary to handle "a string with both % and %%". Just let it fail!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI comments on the above approach :

Yes, the comment is valid — and arguably the best approach for this PR. It trades a bit of edge-case coverage for a large reduction in complexity, with a clear and defensible rule.

The proposed rule

  1. If the help string contains %( or %%, the author already knows about argparse's escaping → leave it alone.
  2. Otherwise, blindly escape every %%%.

Why it's valid

It's a "signal of intent" heuristic. The presence of %( or %% is a strong indicator that the author wrote argparse-aware help. If they didn't use either, they almost certainly meant % literally (%Y-%m-%d, 50% off, ratio 100%, etc.).

Implementation collapses to ~5 lines, no regex needed:

@staticmethod
def _sanitize_help_for_argparse(help_text):
    if not isinstance(help_text, str) or '%' not in help_text:
        return help_text
    if '%(' in help_text or '%%' in help_text:
        return help_text  # author handled escaping themselves
    return help_text.replace('%', '%%')

Covers all the realistic cases in the PR's tests:

Input Has %(/%%? Output argparse result
default=%(default)s prog=%(prog)s yes unchanged ✅ expands
ratio 100%% yes unchanged ✅ renders 100%
bad % token no bad %% token ✅ renders bad % token
Expected format: %Y-%m-%d no Expected format: %%Y-%%m-%%d ✅ renders literally

The conscious trade-off

Strings that mix both — e.g. "Use %Y-%m-%d, default %(default)s" — fall through rule 1 and crash argparse. The comment explicitly says "just let it fail!" That's reasonable because:

  • Such mixing is extremely rare in practice.
  • The failure is loud at parser-construction time, not silent corruption.
  • The author can fix it by escaping (%%Y-%%m-%%d).

How it compares to the regex approaches we discussed

Approach LOC Handles %Y + %(default)s mixed Cognitive load
Hand-written loop (current PR) ~40 yes medium
Alternation regex + lambda ~5 yes medium
Lookbehind/lookahead regex ~5 partial (%%% edge) high
%(/%% heuristic (this comment) ~5 no (by design) low

So the comment is valid both technically and as a design judgment. I'd recommend adopting it, and updating the n8 test ('bad % token') to assert the escaped-rendering path, plus dropping the n7/%% test's expectation since %% strings are now passed through untouched (which still renders as % via argparse — same observable result).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants