Skip to content

Reified generics#21317

Open
php-generics wants to merge 3 commits intophp:PHP-8.5.4from
php-generics:feature/generics
Open

Reified generics#21317
php-generics wants to merge 3 commits intophp:PHP-8.5.4from
php-generics:feature/generics

Conversation

@php-generics
Copy link

@php-generics php-generics commented Feb 28, 2026

Add reified generics to PHP

Summary

This PR adds reified generics to the Zend Engine — generic type parameters that are preserved at runtime and enforced through the type system. Unlike type erasure (Java/TypeScript), generic type arguments are bound per-instance and checked at every type boundary.

Syntax

Generic classes

class Box<T> {
    public T $value;
    public function __construct(T $value) { $this->value = $value; }
    public function get(): T { return $this->value; }
}

$box = new Box<int>(42);       // explicit type arg
$box = new Box(42);            // inferred from constructor — T = int
$box->value = "hello";         // TypeError: cannot assign string to property Box<int>::$value of type int

Multiple type params, constraints, defaults

class Map<K: int|string, V = mixed> {
    // K is constrained to int|string, V defaults to mixed
}

class NumberBox<T: int|float> {
    public function sum(T $a, T $b): T { return $a + $b; }
}

Variance annotations

class ReadOnlyList<out T> { /* covariant — can return T, not accept T */ }
class Consumer<in T>      { /* contravariant — can accept T, not return T */ }

Wildcard types

function printAll(Collection<? extends Printable> $items): void { ... }
function addDogs(Collection<? super Dog> $kennel): void { ... }
function count(Collection<?> $any): int { ... }

Generic traits

trait Cacheable<T> {
    private ?T $cached = null;
    public function cache(T $value): void { $this->cached = $value; }
}

class UserCache {
    use Cacheable<User>;
}

Generic functions and closures

function identity<T>(T $x): T { return $x; }

$map = function<T>(array $items, Closure $fn): array<T> { ... };

Nested generics

$nested = new Box<Box<int>>(new Box<int>(42));  // >> handled by lexer splitting

Static method calls with generics

$result = Factory<int>::create();

instanceof with generics

if ($obj instanceof Collection<int>) { ... }

Inheritance

class IntBox extends Box<int> {}           // bound generic args
class PairBox<A, B> extends Box<A> {}      // forwarded params

// Method signatures verified against parent's resolved types

Runtime enforcement

All type boundaries are checked at runtime:

  • Constructor argsnew Box<int>("x") throws TypeError
  • Method params$box->set("x") throws TypeError when T = int
  • Return typesreturn "x" from a method declared (): T with T = int throws TypeError
  • Property writes$box->value = "x" throws TypeError
  • Error messages include resolved types: Cannot assign string to property Box<int>::$value of type int

Ecosystem integration

  • ReflectionReflectionClass::isGeneric(), ::getGenericParameters(), ReflectionObject::getGenericArguments(), ReflectionGenericParameter (name, constraint, default, variance)
  • Serializationserialize(new Box<int>(42)) produces O:8:"Box<int>":1:{...}, unserialize() restores generic args
  • Debug displayvar_dump shows object(Box<int>)#1, stack traces show Box<int>->method()
  • Opcache — SHM persistence and file cache serialization
  • JIT — inline monomorphization with pre-computed bitmask fast path

Edge cases covered

  • Anonymous classes (new class extends Box<int> {})
  • Clone preserves generic args
  • Autoloading (Collection<int> triggers autoload for Collection)
  • class_alias inherits generic params
  • WeakReference/WeakMap with generic objects
  • Fibers across suspend/resume
  • Type argument forwarding (new Box<T>() inside generic methods/factories resolves T from context)
  • Compile-time rejection of void/never as type args

Performance

Benchmarked at 1M, 10M, and 100M iterations on arm64, both master (PHP 8.5.3 NTS) and generics branch (PHP 8.5.4-dev NTS) built as release binaries.

Generic args use refcounted sharing — new Box<int>() adds a refcount instead of deep-copying, eliminating 4 allocator round-trips per object lifecycle. Pre-computed scalar bitmasks are stored inline (no separate allocation).

Generic vs non-generic overhead (same binary)

Results stabilize at higher iteration counts as warmup and noise are amortized.

Operation 1M interp / JIT 10M interp / JIT 100M interp / JIT
Object creation -8% / -10% -7% / -11% -7% / -11%
Method calls (set+get) -16% / -9% -14% / -10% -14% / -10%
Property assignment -3% / +1% -3% / +1% -2% / +0%
Memory per object +0 bytes +0 bytes +0 bytes

Absolute throughput (generics branch, JIT, ops/sec)

Operation 1M 10M 100M
new GenericBox<int>(42) 37.3M 37.1M 37.4M
GenericBox<int>->set+get 60.3M 57.7M 60.2M
GenericBox<int>->value = N 94.9M 93.1M 94.2M
new GenericBox<GenericBox<int>> 22.0M 22.0M 21.9M

Non-generic regression check

Non-generic classes (PlainBox, UntypedBox) were benchmarked on both master and the generics branch under identical conditions (release NTS, same image). Results at 100M iterations (most stable) confirm zero performance regression on existing code paths:

Operation Class master (JIT) generics (JIT) delta
Object creation PlainBox 41.3M ops/s 42.0M ops/s +2%
Object creation UntypedBox 49.3M ops/s 45.6M ops/s -8%
Method calls PlainBox 66.6M ops/s 66.6M ops/s +0%
Method calls UntypedBox 77.0M ops/s 77.4M ops/s +0%
Property assign PlainBox 96.1M ops/s 94.1M ops/s -2%
Property assign UntypedBox 116.0M ops/s 113.0M ops/s -3%
Memory PlainBox 94 bytes 102 bytes +8 bytes
Memory UntypedBox 82 bytes 90 bytes +8 bytes

The +8 bytes per object is the generic_args pointer field added to zend_object (NULL for non-generic objects). Throughput deltas are within cross-build variance (+-5%); no systematic code path slowdown was observed.

Memory overhead

Object type bytes/obj
PlainBox (typed int) 102
GenericBox<int> (explicit) 102
GenericBox (inferred) 122
GenericBox<GenericBox<int>> 90

Generic objects with explicit type args have zero memory overhead vs non-generic typed objects — refcounted args are shared with the compiled literal. Inferred args (+20 bytes) allocate a new args struct. Nested generics benefit most from sharing (90 bytes vs 210 bytes before optimization).

Implement reified generics with type parameter declarations on classes,
interfaces, traits, and functions. Includes type constraint enforcement,
variance checking, type inference, reflection API support, and
comprehensive test suite.
Refcount zend_generic_args to eliminate per-object alloc/dealloc — new
Box<int>() now adds a refcount instead of deep-copying, removing 4
allocator round-trips per object lifecycle. Inline resolved_masks into
the args struct (single contiguous allocation). Fix crash when creating
generic objects inside generic methods (new Box<T>() inside Factory<int>
::create()) by resolving type param refs from the enclosing context.
Copy link
Member

@DanielEScherzer DanielEScherzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have this target the master branch of PHP, patch-specific branches like 8.5.4 are for release management, and older version branches like 8.5 are for bugfixes or security fixes, new feature are added to the master branch

@iluuu1994
Copy link
Member

Hi @php-generics 👋 May I ask two things:

  1. How much was AI involved in the creation of this PR?
  2. Is there a reason you're not disclosing your identity?

@rennokki
Copy link

Serialization — serialize(new Box(42)) produces O:8:"Box":1:{...}, unserialize() restores generic args

Won't this break existing serialized strings in case of an upgrade? I think of having existing serialized strings in a database and upgrading it would break it.

Also, usually isn't there an RFC for these kind of things?

@ramsey
Copy link
Member

ramsey commented Feb 28, 2026

This requires an RFC and discussion on the internals mailing list.

My initial impression is this was coded 100% by an AI agent, and perhaps the @php-generics account itself was created by an AI agent. All code in this PR should be highly scrutinized to ensure it's not introducing vulnerabilities.

@bwoebi
Copy link
Member

bwoebi commented Feb 28, 2026

The only tests for generic functions which I see are simple identity functions. Can we also have tests which show proper behaviour for generics in return types (e.g. function x<T>(T $a): T { return 1; } x("a"); should probably fail. I see some EG(static_generic_args), but not seeing how it would be preserved for return types (with nested calls at least, that is).

@cvsouth
Copy link

cvsouth commented Mar 1, 2026

How much was AI involved in the creation of this PR?

If it's any indication the first line of the summary contains an em-dash, there are 12 of them in the relatively short description and they are used throughout the code comments. I'd guess at 100% of it.

@jorgsowa
Copy link
Contributor

jorgsowa commented Mar 1, 2026

If it's any indication the first line of the summary contains an em-dash, there are 12 of them in the relatively short description

Not only dashes. Whole structure of PR description, benchmark results, comments in Code, tests structure. A lot of indicators of AI.

I'm not against AI, but it's sad that author didn't even invest time to learn how such changes should be performed in PHP - through RFC. It's just one more prompt to learn it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants