Skip to content

Fix vulnerability: uninitalized tensor heap memory leak via integer overflow in BalanceTrajectory#1097

Open
mhucka wants to merge 3 commits into
tensorflow:masterfrom
mhucka:fix-balancetrajectory
Open

Fix vulnerability: uninitalized tensor heap memory leak via integer overflow in BalanceTrajectory#1097
mhucka wants to merge 3 commits into
tensorflow:masterfrom
mhucka:fix-balancetrajectory

Conversation

@mhucka

@mhucka mhucka commented Jun 22, 2026

Copy link
Copy Markdown
Member

Internal security scans reported a security vulnerability (CWE-200/CWE-908) in tensorflow_quantum/core/ops/noise/tfq_noisy_samples.cc. In TfqNoisySamplesOp, a massive num_samples value near INT_MAX will cause BalanceTrajectory to overflow its signed integer calculation for chunk sizes. This results in negative or zero chunk sizes, causing worker threads to bypass execution. Because TensorFlow's allocate_output provides uninitialized heap blocks, the kernel returns a tensor filled with residual process memory.

Changes in this PR:

  • In util_balance_trajectory.cc, updated chunk calculations to use int64_t arithmetic, preventing overflows when num_samples (or rep_limits) is close to INT_MAX.

  • Modified tfq_noisy_samples.cc, tfq_noisy_expectation.cc, and tfq_noisy_sampled_expectation.cc to compute thread allocations (p_reps) using int64_t division and cast the Philox random sample reservation requests to int64_t variables.

  • Add checks to enforce a maximum capacity limit of 10,000,000 in input tensor parsing functions GetNumSamples and GetIndividualSample within parse_context.cc. This should eliminate memory depletion/DoS attempts with massive sizes.

  • Added tests.

@mhucka

mhucka commented Jun 22, 2026

Copy link
Copy Markdown
Member Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements integer overflow prevention and input validation across TensorFlow Quantum's noisy simulation operations. It restricts the maximum number of samples to 10,000,000, updates several loop counters and qubit sizes to standard integers, and uses 64-bit integer casting for intermediate calculations. Unit tests are also added to verify these limits and overflow protections. The reviewer feedback highlights an issue in tfq_noisy_expectation.cc where threads in ComputeSmall calculate but do not use n_rand, causing each thread to redundantly reserve the entire undivided amount of random samples.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +309 to +310
auto local_gen = random_gen.ReserveSamples128(
static_cast<int64_t>(ncircuits.size()) * max_n_shots + 1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In ComputeSmall, each thread running DoWork should only reserve its portion of the random samples (n_rand) from the shared random_gen. Currently, n_rand is calculated but unused, and instead each thread reserves the entire undivided amount (static_cast<int64_t>(ncircuits.size()) * max_n_shots + 1). This results in reserving num_threads times more samples than actually needed, wasting the random number generator's state space and causing unnecessary overhead.

      auto local_gen = random_gen.ReserveSamples128(n_rand);

@mhucka mhucka marked this pull request as ready for review June 22, 2026 05:08

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces input validation and overflow prevention for the number of samples across several TensorFlow Quantum noisy operators. Specifically, it limits the maximum number of samples to 10,000,000, adds explicit casts to int64_t during arithmetic operations to prevent integer overflow, and updates several loop index types from 64-bit integers to standard int. Corresponding unit tests are also added to verify these constraints. The review feedback identifies an inefficiency in tfq_noisy_expectation.cc where each thread in ComputeSmall reserves the total combined random samples instead of the per-thread required samples (n_rand), which causes unnecessary overhead.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +307 to +310
int64_t n_rand = static_cast<int64_t>(ncircuits.size()) * max_n_shots + 1;
n_rand = (n_rand + num_threads) / num_threads;
auto local_gen =
random_gen.ReserveSamples128(ncircuits.size() * max_n_shots + 1);
auto local_gen = random_gen.ReserveSamples128(
static_cast<int64_t>(ncircuits.size()) * max_n_shots + 1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In ComputeSmall, each thread in the parallel loop reserves random samples from the global generator. However, instead of reserving the per-thread required samples (n_rand), each thread is reserving the total combined samples for all threads (static_cast<int64_t>(ncircuits.size()) * max_n_shots + 1). This leads to excessive skipping of the Philox random number generator state and unnecessary overhead. Using n_rand directly (as is correctly done in tfq_noisy_sampled_expectation.cc) resolves this inefficiency.

      int64_t n_rand = static_cast<int64_t>(ncircuits.size()) * max_n_shots + 1;
      n_rand = (n_rand + num_threads) / num_threads;
      auto local_gen = random_gen.ReserveSamples128(n_rand);

@mhucka mhucka requested a review from MichaelBroughton June 24, 2026 23:14

@MichaelBroughton MichaelBroughton left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably shorten this up a fair bit and remove some unrelated changes that didn't get new tests. Fine to merge once this is done.


uint64_t max_num_qubits = 0;
for (const uint64_t num : num_qubits) {
int max_num_qubits = 0;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid downgrading these ?

DCHECK_EQ(4, context->num_inputs());
const int num_inputs = context->num_inputs();
OP_REQUIRES(context, num_inputs == 4,
tensorflow::errors::InvalidArgument(absl::StrCat(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to fixing the overflow problem, can we move this to a different PR ?

int num_samples = 0;
OP_REQUIRES_OK(context, GetIndividualSample(context, &num_samples));
OP_REQUIRES(context, num_samples >= 0 && num_samples <= 10000000,
tensorflow::errors::InvalidArgument(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like the main piece of code that actually solves our problems, can we scope this PR down to just putting in these checks and testing them across all ops that require a num_samples and call into BalanceTrajectory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants