Skip to content

Conversation

@oetr
Copy link
Contributor

@oetr oetr commented Oct 10, 2025

This allows the users provide values directly to types by annotating them with @ValuePool.
All types, other than those that need cached-based mutators, are supported.

  public static Stream<?> myPool() {
    return Stream.of(
        "0123456789abcdef".repeat(50),
        "sitting duck suprime".repeat(53),
        123);
  }

  @FuzzTest
  @ValuePool("myPool")
  public static void fuzzerTestOneInput(
      @NotNull @WithUtf8Length(max = 10000) String data,
      @NotNull @WithUtf8Length(max = 10000) String data2) {
    if (data.equals("0123456789abcdef".repeat(50))
        && data2.equals("sitting duck suprime".repeat(53))) {
      throw new FuzzerSecurityIssueLow("Found the long string!");
    }
  }

@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 15 times, most recently from 958d54a to d164e3c Compare October 16, 2025 23:50
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 3 times, most recently from 5eac7ec to 5069fb3 Compare October 28, 2025 19:20
@oetr oetr changed the title feat: dictionary provider for selected types feat: dictionary provider for Strings and Integrals Oct 28, 2025
@oetr oetr marked this pull request as ready for review October 28, 2025 19:25
Copilot AI review requested due to automatic review settings October 28, 2025 19:25
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds dictionary provider support to the mutation framework, enabling users to provide custom dictionary values for fuzzing through the @DictionaryProvider annotation. The implementation:

  • Introduces MutatorRuntime class to provide runtime information (including the fuzz test method) to mutators
  • Adds @DictionaryProvider annotation that references static methods returning Stream<?> of dictionary values
  • Updates all MutatorFactory implementations to accept MutatorRuntime parameter
  • Implements dictionary support for String and integral mutators using weighted sampling
  • Adds SamplingUtils with Vose's alias method for efficient O(1) weighted sampling
  • Includes comprehensive tests for the new functionality

Reviewed Changes

Copilot reviewed 85 out of 85 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
DictionaryProvider.java New annotation for specifying dictionary provider methods with probability control
MutatorRuntime.java New runtime info class providing fuzz test method to mutators
DictionaryProviderSupport.java Helper methods to extract dictionary values from provider methods
IgnoreRecursiveConflicts.java Meta-annotation to allow duplicate annotations during type hierarchy propagation
SamplingUtils.java Weighted sampling utilities using Vose's alias method
StringMutatorFactory.java Implements dictionary support for String mutators
IntegralMutatorFactory.java Implements dictionary support for integral type mutators
MutatorFactory.java Updated interface to accept MutatorRuntime parameter
ArgumentsMutator.java Forwards method-level @DictionaryProvider annotations to parameters
TestSupport.java Adds helper methods for creating dummy MutatorRuntime in tests
All other factory files Updated to pass through MutatorRuntime parameter

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 5 times, most recently from 0044fb9 to 84707f2 Compare October 30, 2025 12:42
Copy link
Contributor

@simonresch simonresch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool feature!

I'm not entirely sure that the ability to have different dictionaries per field warrants the added complexity & state in the MutatorRuntime. Is this an essential part of the custom dictionary in your opinion? Having a single @DictionaryProvider per fuzz test could simplify the mutator instantiation and maybe even make it simpler to use for users.
Otherwise my main concern would be the duplication with DictionaryEntry and similar annotations (see other comment).

import java.lang.annotation.Target;

/**
* Provides dictionary values to user-selected mutator types. Currently supported mutators are:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that users might be confused on when to use the DictionaryProvider and when to use other dictionary annotations like DictionaryEntry, DictionaryFile.
"user-selected mutator types" might also not be descriptive enough and end with users annotating a myFuzzTest(byte[] data) fuzz test with @DictionaryProvider without realizing that the values are not used.

Would it make sense to add the @DictionaryProvider values in addition to the libFuzzer dictionary file? Otherwise I could foresee situations where one wants to specify the same dictionary entries with @DictionaryProvider and @DictionaryEntry.

Copy link
Contributor Author

@oetr oetr Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Renamed to @ValuePool
  • Values are not really dictionary entries that libfuzzer limits to 64 bytes

I think, instead of pouring these values into a dictionary, we should always use the mutation framework even when the user only has fuzzTestMethod(byte[] data), so that ValuePool will always work in a consistent manner. WDYT? We can fix that in another PR after discussing it

@oetr oetr force-pushed the CIF-1785-dictionary-provider branch from 84707f2 to c64c841 Compare November 19, 2025 12:17
@oetr oetr marked this pull request as draft November 19, 2025 12:17
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 5 times, most recently from 1ca7569 to b29031b Compare November 21, 2025 01:34
In addition to primitive arrays, these types are now also supported:
- List<Integer> []
- List<Integer> [][][]
@oetr oetr changed the title feat: dictionary provider for Strings and Integrals feat: add @ValuePool to propagate values to types Nov 21, 2025
Allow annotations to be inherited at multiple levels of the type hierarchy,
enabling both broad and specific configuration of mutators.

Use case: Configure mutators that share common types. For example, annotate
a fuzz test method to apply default settings to all String mutators, while
still allowing individual String parameters to override those settings with
different values.

Without this feature, an annotation could only appear once in the inheritance
chain, preventing this layered configuration approach.
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch from b29031b to ba90fe6 Compare November 21, 2025 11:10
This adds a new user-facing annotation @valuepool for wiring values
directly to type mutators.

ValuePoolMutatorFactory is prepended before (almost) every mutator.
@oetr oetr force-pushed the CIF-1785-dictionary-provider branch 2 times, most recently from 7fc2b50 to 4cbb978 Compare November 21, 2025 13:17
@oetr oetr marked this pull request as ready for review November 21, 2025 13:53
Copy link
Contributor

@simonresch simonresch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! I really like the ValuePool name and the more centralized implementation. Really cool that it works for (almost) all mutators. Only have some minor comments regarding doc strings. Nice work!

if (type instanceof Class<?>) {
return Optional.of((Class<?>) type);
} else if (type instanceof ParameterizedType) {
return Optional.of((Class<?>) ((ParameterizedType) type).getRawType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raw type of a ParameterizedType does not have to be a Class<?>, no? Maybe something line return extractRawClass((ParameterizedType) type).getRawType()); would make sense?

* }</pre>
*
* In this example, the mutator for the String parameter {@code input} of the fuzz test method
* {@code fuzzerTestOneInput} will be using the values returned by {@code provide} method during
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be {@code valluesVisibleByAllArgumentMutators}?

* {@code fuzzerTestOneInput} will be using the values returned by {@code provide} method during
* mutation, while the mutator for String {@code anotherInput} will use values from both methods:
* from the method-level {@code ValuePool} annotation that uses {@code provide} and the
* parameter-level {@code ValuePool} annotation that uses {@code provideSomethingElse}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here valuesVisibleOnlyByAnotherInput.

String[] value();

/**
* This {@code ValuePool} will be used with probability {@code 1/p} by the mutator responsible for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be {@code p} instead of 1/p?

mutator.read(new DataInputStream(new ByteArrayInputStream(originalSerialized)));
mutator.write(roundTrip, new DataOutputStream(byteStream));
byte[] roundTripSerialized = byteStream.toByteArray();
return Arrays.equals(originalSerialized, roundTripSerialized);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what mutator types do we need the roundtrip check? Wouldn't it be enough to check if the mutator.write(...) call does not throw, to filter out e.g. CachedConstructorMutators or values where the type does not match?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this will also make sure that the value pool values respect other annotations of the particular mutator, correct? Clever!
Minor: Could be worth a comment that lists what this serialization stable entails.

public double extractFirstProbability(AnnotatedType type) {
// If there are several @ValuePool annotations on the type, this will take the most
// immediate one, because @ValuePool is not repeatable.
ValuePool[] dictObj = type.getAnnotationsByType(ValuePool.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: dictObj should probably be something like valuePoolAnnotations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants