Skip to content

Conversation

@lionelkusch
Copy link
Collaborator

I created an Abstract class for the management of the group of features.

I update the name of groups by features_groups.

@lionelkusch lionelkusch linked an issue Aug 27, 2025 that may be closed by this pull request
@lionelkusch lionelkusch linked an issue Aug 27, 2025 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Aug 27, 2025

Codecov Report

❌ Patch coverage is 99.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 97.95%. Comparing base (5292877) to head (0463a83).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/hidimstat/base_variable_importance.py 98.07% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #357      +/-   ##
==========================================
- Coverage   98.00%   97.95%   -0.05%     
==========================================
  Files          22       22              
  Lines        1200     1223      +23     
==========================================
+ Hits         1176     1198      +22     
- Misses         24       25       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@bthirion bthirion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is mostly OK, thx.

Copy link
Collaborator

@jpaillard jpaillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a high-level comment, related to #377
Groups could also be dealt with in a pipeline fashion.

It would give a pipeline GroupFeatures --> FeatureImportance, where GroupFeatures could either be pre-defined or computed in the fit with clustering. That would also solve #377 since the groups could be defined at the instantiation of any FeatureImportance method.

This approach would have similarities with the ColumnTransformer in sklearn.
Let me know what you think.

@bthirion
Copy link
Collaborator

bthirion commented Sep 9, 2025

Sorry, I'm missing what benefit you expect from it.
My main fear is that we are over-engineering the FeatureGroup object, making it hard to understand. But I may be off. I think that I'd like what you want to achieve in an example: what would it simplify to do it ?

@lionelkusch lionelkusch added the API 2 Refactoring following the second version of API label Sep 9, 2025
@lionelkusch
Copy link
Collaborator Author

I have a high-level comment, related to #377 Groups could also be dealt with in a pipeline fashion.

It would give a pipeline GroupFeatures --> FeatureImportance, where GroupFeatures could either be pre-defined or computed in the fit with clustering. That would also solve #377 since the groups could be defined at the instantiation of any FeatureImportance method.

This approach would have similarities with the ColumnTransformer in sklearn. Let me know what you think.

I don't see what you mean.
The question is if the group is defined at the instantiation of the class or when the method fit is called.
This has a different impact on the implementation.
However, I am afraid but it won't simplify the complexities.

@lionelkusch
Copy link
Collaborator Author

The addition of groups is equivalent to add a new dimension of the method, this is normal that the complexity will increase due to this.
Nevertheless, the groups are important for Perturbation methods. I just decouple the BasePerturbation and GroupVariableImportance because I think that there will be in the future a desire to have all the possible method using groups.
However, I don't know what is the best moment for including this functionality in the library.

@bthirion
Copy link
Collaborator

bthirion commented Sep 9, 2025

We should not add a feature that is not motivated by a use case. We want to add a class to handle groups. Would it make any example significantly simpler ? What is the actual benefit ?

@lionelkusch
Copy link
Collaborator Author

We should not add a feature that is not motivated by a use case. We want to add a class to handle groups. Would it make any example significantly simpler ? What is the actual benefit ?

The group is used for LOCI (see PR #358) which is not inherited from BasePerturbation because BasePerturbation is the logic of marginal importance and feature importance is different.
The functionality of the group is the same for LOCI, PFI, LOCO and CFI. I don't see arguments for having the functionality of groups only for the method of BasePerturbation. From what I see in the domain of explainable AI, grouping features together is very common. I think that it's quite an easy bet that users will want this functionality as a future requirement for the library. In this case, it's better to adapt the architecture now than the latter.

@bthirion
Copy link
Collaborator

I think that there is a full agreement regarding the need to support feature groups. The question is whether adding a GroupFeatureImportance class is the right way to proceed. Two alternatives come to my mind:

  • First, natively support features groups in the BaseVariableImportance methods. feature_groups would be an optional parameter, if None, it means that each feature is its own group.
  • Maybe use a VariablesGroup mixin (not sure it would work).
    Can someone explain what is the advantage of the proposed implementation ?

@lionelkusch
Copy link
Collaborator Author

I think that there is a full agreement regarding the need to support feature groups. The question is whether adding a GroupFeatureImportance class is the right way to proceed. Two alternatives come to my mind:

* First, natively support features groups in the BaseVariableImportance methods. feature_groups would be an optional parameter, if None, it means that each feature is its own group.

* Maybe use a VariablesGroup mixin (not sure it would work).
  Can someone explain what is the advantage of the proposed implementation ?

What do you mean by VariablesGroup mixing?

If I am correct, this was the spirit when I was coding this class.

For native support, I am a bit against because this means that most of the methods should support feature groups, which is not the case, actually.

@bthirion
Copy link
Collaborator

Mixin, not Mixing.
https://stackoverflow.com/questions/533631/what-is-a-mixin-and-why-is-it-useful

@lionelkusch
Copy link
Collaborator Author

@jpaillard @bthirion Last review?

@bthirion
Copy link
Collaborator

bthirion commented Oct 2, 2025

Good with me.

Copy link
Collaborator

@jpaillard jpaillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few minor comments, but otherwise it looks good.

lionelkusch and others added 11 commits October 3, 2025 11:35
Co-authored-by: Joseph Paillard <joseph.paillard@inria.fr>
Co-authored-by: Joseph Paillard <joseph.paillard@inria.fr>
Co-authored-by: Joseph Paillard <joseph.paillard@inria.fr>
Co-authored-by: Joseph Paillard <joseph.paillard@inria.fr>
Co-authored-by: Joseph Paillard <joseph.paillard@inria.fr>
Co-authored-by: Joseph Paillard <joseph.paillard@inria.fr>
@lionelkusch lionelkusch requested a review from jpaillard October 3, 2025 12:47
Copy link
Collaborator

@bthirion bthirion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx.

jpaillard and others added 5 commits October 5, 2025 14:18
* first commit

* add section structure

* [doc quick] [skip tests] skip

* missing .rst? [doc quick] [skip tests]

* fix link [doc quick] [skip tests]

* add CFI fiirst draft and TSI [doc quick] [skip tests]

* missing space? [doc quick] [skip tests]

* replace space

* [doc quick] [skip tests]

* add ref and note section [doc quick] [skip tests]

* add code snippets

* typo cfi [doc quick] [skip tests]

* add total sobol index ref [doc quick] [skip tests]

* add copy button [doc quick] [skip tests]

* missing sphinx requirements [quick doc] [skip tests]

* add copybutton config

* [doc quick] [skip tests]

* solve example test

* clarify "sub-model" for classif and regression

* trry add figure + update note

* add intro

* try fix image path

* trigger CI

* try not to scale

* definition

* try image

* [skip tests] trigger CI

* [tests skip] another one

* trigger CI

* back to figure

* add inference section

* add reff

* skip tests

* [skip tests]

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: lionel kusch <lionel.kusch@grenoble-inp.org>

* [skip tests] format bullet

* rephrase not

* [skip tests]

* [skip tests]

* [skip tests] linkcheck generated ignore images

* [skip tests] linkcheck generated ignore

* review

* trigger CI

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* genetic example

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/total_sobol_index.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: bthirion <bertrand.thirion@inria.fr>

* Update docs/src/model_agnostic_methods/conditional_feature_importance.rst

Co-authored-by: bthirion <bertrand.thirion@inria.fr>

* Update docs/src/model_agnostic_methods/total_sobol_index.rst

Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>

---------

Co-authored-by: jpaillard <joseph.paillard@inria.fr>
Co-authored-by: lionel kusch <lionel.kusch@grenoble-inp.org>
Co-authored-by: Ángel Reyero Lobo <angelreyerolobo@gmail.com>
Co-authored-by: bthirion <bertrand.thirion@inria.fr>
@lionelkusch lionelkusch merged commit ae16933 into mind-inria:main Oct 9, 2025
22 checks passed
@lionelkusch lionelkusch deleted the PR_create_group_base branch October 9, 2025 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API 2 Refactoring following the second version of API

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Find a name for "group" of feature

4 participants