Permutation Feature Importance DOC #553

AngelReyero · 2025-12-08T14:05:52Z

No description provided.


codecov · 2025-12-08T15:16:46Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.37%. Comparing base (eda767d) to head (6e7ef93).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #553   +/-   ##
=======================================
  Coverage   98.37%   98.37%           
=======================================
  Files          23       23           
  Lines        1602     1602           
=======================================
  Hits         1576     1576           
  Misses         26       26


jpaillard · 2025-12-08T17:35:55Z

docs/src/model_agnostic_methods/permutation_feature_importance.rst

+Note that this method was initially introduced as the mean decrease accuracy (MDA) 
+by :footcite:t:`breimanRandomForests2001` for Random Forests. It was initially proposed 
+as an heuristic Variable Importance Measure and not as a formal estimator of a 
+interesting theoretical quantity. Moreover, it was shown in
+:footcite:t:`benard2022SobolMDA` that PFI estimates a quantity that can be decomposed
+as the sum of the Total Sobol Index (TSI) :ref:`total_sobol_index` and two extra terms 
+that are not significant due to correlations. Thus, the theoretical quantity estimated by PFI is 
+not a relevant quantity contrarily to :ref:`leave_one_covariate_out` or 
+:ref:`conditional_feature_importance`.


Here, I would be a bit more direct, saying that it does not estimate any meaningful / previously studied quantity.

Maybe we can make it a Note like for "extrapolation issues" below?

jpaillard · 2025-12-08T17:37:39Z

docs/src/model_agnostic_methods/permutation_feature_importance.rst

+estimating a conditional sampler as in :ref:`conditional_feature_importance`. Since
+the distribution from which we are sampling is the marginal distribution of the feature
+breaking the relationship with the others, a simple permutation of the feature values
+across the individuals is sufficient. Also, note that the same estimated model is 


Suggested change

-estimating a conditional sampler as in :ref:`conditional_feature_importance`. Since
-the distribution from which we are sampling is the marginal distribution of the feature
-breaking the relationship with the others, a simple permutation of the feature values
-across the individuals is sufficient. Also, note that the same estimated model is
+estimating a conditional sampler as in :ref:`conditional_feature_importance`. A simple permutation
+of the feature values across the individuals is sufficient since the distribution from which we are sampling is the
+marginal distribution of the feature, thus breaking the relationship with the others. Also, note that the
+same estimated model is
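
For readers of this thread, the permutation step described in the paragraph under review can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the hidimstat implementation or its API: the helper name `permutation_importance_sketch`, the squared-error loss, the simulated data, and the least-squares model are all hypothetical choices, and the importance is computed on the training data for brevity.

```python
import numpy as np


def permutation_importance_sketch(predict, X, y, n_permutations=20, seed=0):
    """Mean increase in squared-error loss when each column is permuted.

    Hypothetical helper for illustration; not part of hidimstat.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean((y - predict(X)) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        losses = []
        for _ in range(n_permutations):
            X_perm = X.copy()
            # Shuffling column j across individuals samples from its marginal
            # distribution and breaks its relationship with the other columns.
            X_perm[:, j] = rng.permutation(X[:, j])
            # The already fitted model is reused; nothing is refit.
            losses.append(np.mean((y - predict(X_perm)) ** 2))
        importances[j] = np.mean(losses) - baseline
    return importances


# Simulated data (assumption): the third feature has no effect on y.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Ordinary least squares stands in for any fitted model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
importances = permutation_importance_sketch(lambda Z: Z @ coef, X, y)
```

With independent features, as simulated here, the ranking of `importances` follows the size of the coefficients; the thread's concerns apply when features are correlated.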


jpaillard · 2025-12-08T17:40:21Z

docs/tools/references.bib

+  url       = {https://doi.org/10.1007/s11222-021-10057-z},
+  abstract  = {This paper reviews and advocates against the use of permute-and-predict (PaP) methods for interpreting black box functions. Methods such as the variable importance measures proposed for random forests, partial dependence plots, and individual conditional expectation plots remain popular because they are both model-agnostic and depend only on the pre-trained model output, making them computationally efficient and widely available in software. However, numerous studies have found that these tools can produce diagnostics that are highly misleading, particularly when there is strong dependence among features. The purpose of this work is to review the growing body of literature, demonstrate these drawbacks, explain why they occur, and advocate for alternative measures involving additional modeling. In particular, breaking dependencies between features forces extrapolation into sparse regions of the feature space, over-emphasizing correlated features in both variable importance measures and partial dependence plots.}


Suggested change

-  url       = {https://doi.org/10.1007/s11222-021-10057-z},
-  abstract  = {This paper reviews and advocates against the use of permute-and-predict (PaP) methods for interpreting black box functions. Methods such as the variable importance measures proposed for random forests, partial dependence plots, and individual conditional expectation plots remain popular because they are both model-agnostic and depend only on the pre-trained model output, making them computationally efficient and widely available in software. However, numerous studies have found that these tools can produce diagnostics that are highly misleading, particularly when there is strong dependence among features. The purpose of this work is to review the growing body of literature, demonstrate these drawbacks, explain why they occur, and advocate for alternative measures involving additional modeling. In particular, breaking dependencies between features forces extrapolation into sparse regions of the feature space, over-emphasizing correlated features in both variable importance measures and partial dependence plots.}
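
The extrapolation issue mentioned in the quoted abstract can be illustrated with a small simulation. This is a sketch on made-up data, assuming two strongly correlated Gaussian features; the variable names and the "further from the diagonal than any observed point" criterion are illustrative choices, not taken from the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
# x2 is almost a copy of x1, so observed points lie close to the diagonal.
x2 = x1 + rng.normal(scale=0.1, size=n)

# Permuting x1 keeps its marginal distribution but breaks the dependence on x2.
x1_perm = rng.permutation(x1)

corr_before = np.corrcoef(x1, x2)[0, 1]
corr_after = np.corrcoef(x1_perm, x2)[0, 1]

# Fraction of permuted pairs lying further from the diagonal than any
# observed pair did, i.e. in a region where the model saw no training data.
max_gap = np.max(np.abs(x1 - x2))
frac_outside = np.mean(np.abs(x1_perm - x2) > max_gap)
```

In this setup most permuted pairs fall outside the region covered by the observed data, so a model evaluated on them is being asked to extrapolate.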


bthirion · 2025-12-10T18:03:49Z

hidimpy/bin/Activate.ps1

Oops, what are all these files?

jpaillard · 2025-12-11T13:52:19Z

It looks almost good to me. I still have comments regarding the .bib file, from which I suggest removing links and abstracts.
I also suggest adding the following at the end of the PFI page, which will add links to the examples using PFI:

Examples
--------

.. minigallery:: hidimstat.PFI

I see that the tests are not passing, but merging #558 should improve things.

Permutation Feature Importance DOC

384d0b7

AngelReyero linked an issue Dec 8, 2025 that may be closed by this pull request

[DOC] Permutation Feature Importance #550

Open

AngelReyero requested a review from jpaillard December 8, 2025 14:06

AngelReyero added 2 commits December 8, 2025 15:22

Correct cite

ba31eb6

Merge branch 'main' of https://github.com/mind-inria/hidimstat into 550-doc-permutation-feature-importance

8d417c2

jpaillard reviewed Dec 8, 2025


jpaillard added 2 commits December 8, 2025 16:05

Update docs/tools/references.bib

d53b42e

Merge branch 'main' into 550-doc-permutation-feature-importance

7a6af8d

AngelReyero and others added 3 commits December 8, 2025 17:44

Total Sobol Index doc and restructure the concept section.

c3191f2

heading of concepts

5508804

Merge branch 'main' into 550-doc-permutation-feature-importance

6e7ef93

jpaillard reviewed Dec 8, 2025


GLM coefficients

4a155c2

jpaillard mentioned this pull request Dec 10, 2025

[SPRINT] December 8 2025, Montpellier #545

Closed

9 tasks

AngelReyero added 2 commits December 10, 2025 16:12

GLM coefficients and PFI corrections

8a77d84

Merge branch '550-doc-permutation-feature-importance' of https://github.com/mind-inria/hidimstat into 550-doc-permutation-feature-importance

18c80bc

bthirion reviewed Dec 10, 2025


Mean decrease in accuracy

5076f49

References without links

3027559


4 participants
