@gpshead gpshead commented Dec 29, 2025

Motivation: python/cpython#143262. More generally, this area wasn't covered by any benchmark, and @serhiy-storchaka is also doing work here that will become relevant, such as python/cpython#143216.

@gpshead gpshead marked this pull request as ready for review December 29, 2025 05:59
@serhiy-storchaka serhiy-storchaka left a comment

What are the results and the total time?

My suggestions:

  • Test an ASCII string input for decoding.
  • Balance encoding and decoding.


# Generate test data with fixed seed for reproducibility
random.seed(12345)
DATA_TINY = bytes(random.randrange(256) for _ in range(20))

random.randbytes() is usually much faster and more memory-efficient. Alternatively, use os.urandom().
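
A minimal sketch of that suggestion, assuming Python 3.9 or later (where `randbytes()` is available) and keeping the fixed seed from the snippet above. The dedicated `rng` instance is an illustrative choice, not something taken from the PR. Note that `os.urandom()` ignores the seed, so it would give up reproducibility.

```python
import random

# A dedicated, seeded generator keeps the test data reproducible
# without touching the module-level random state (illustrative choice).
rng = random.Random(12345)

# randbytes() builds the bytes object in one call instead of going
# through a Python-level generator of individual integers.
DATA_TINY = rng.randbytes(20)

# os.urandom(20) would also return 20 bytes, but it is not affected by
# the seed, so the data would differ from run to run.
```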

Comment on lines +70 to +72
base64.b64encode(DATA_TINY)
base64.b64decode(B64_TINY)
base64.b64decode(B64_TINY, validate=True)

Decoding has twice the weight of encoding here. validate=True should not affect performance for such input in a reasonable implementation.

On the other hand, I would test decoding from an ASCII string.

If you keep several decoding calls, balance them with an equal number of encoding calls.
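
One possible shape for a balanced loop body, as a sketch only. `DATA_TINY` and `B64_TINY` follow the names in the diff; `B64_TINY_STR` and `bench_b64_tiny` are hypothetical names introduced here, and the function assumes the usual `pyperf` convention that a time function receives `loops` and returns the elapsed time.

```python
import base64
import random
import time

random.seed(12345)
DATA_TINY = random.randbytes(20)
B64_TINY = base64.b64encode(DATA_TINY)    # bytes input for decoding
B64_TINY_STR = B64_TINY.decode("ascii")   # hypothetical: ASCII str input


def bench_b64_tiny(loops):
    """Hypothetical balanced body: two encodes and two decodes per loop."""
    t0 = time.perf_counter()
    for _ in range(loops):
        base64.b64encode(DATA_TINY)
        base64.b64decode(B64_TINY)        # decode from bytes
        base64.b64encode(DATA_TINY)
        base64.b64decode(B64_TINY_STR)    # decode from an ASCII string
    return time.perf_counter() - t0
```

The second decode replaces the `validate=True` call, so encoding and decoding carry equal weight while the `str` input path is also exercised.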


# --- URL-safe Base64 (small only) ---

def bench_urlsafe_b64_small(loops):

They may be rarer, but there are more than two standard Base64 variants. For example, the variant with altchars='+,' is used for IMAP mailbox names (RFC 3501). It may not be worth adding benchmarks for them, but there would be a difference: urlsafe_*() are more optimized than altchars=.
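
To make the difference between the variants concrete, here is a small sketch comparing the three alphabets on input chosen to hit alphabet positions 62 and 63. The variable names are illustrative, and `altchars` is passed as `bytes` because that is the form `b64encode()` is documented to accept.

```python
import base64

# These three bytes split into the 6-bit groups 62, 63, 63, 62, so the
# output consists only of the two characters that differ between variants.
data = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(data)                    # b'+//+'
urlsafe = base64.urlsafe_b64encode(data)             # b'-__-'
imap_like = base64.b64encode(data, altchars=b"+,")   # b'+,,+'

# Decoding needs the same altchars to map the characters back.
roundtrip = base64.b64decode(imap_like, altchars=b"+,")
```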


# --- Base32 ---

def bench_b32_small(loops):

There are also several Base32 variants. Some of them are implemented with preprocessing or postprocessing, which can affect performance.
