Neuropsychological benchmark for executive function in LLMs — 6 tasks, 150 items, 33 models evaluated. EPI-5 index (5 tasks, T4 excluded). Kaggle × Google DeepMind AGI Hackathon 2026.
Updated Apr 23, 2026 · Jupyter Notebook
Intelligence isn’t IQ—it’s an OS.