
No n_features_to_select parameter #92

Open
bgalvao opened this issue Jan 26, 2021 · 1 comment
@bgalvao

bgalvao commented Jan 26, 2021

Although I understand that Boruta is, by design, an all-relevant feature selection method, it would be nice to have the option to select a specified number of features.

As of right now, BorutaPy presents ranking 1 through 3 (relevant, tentative, rejected).

I am thinking of looking through the statistical tests and returning the ranking by p-value. If you like this idea and have a clear sense of how to implement it, let me know.

I am trying to work on it on my fork.
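Not speaking for BorutaPy's internals, but the idea could be sketched roughly like this. This is a minimal, hypothetical example: it assumes per-feature p-values (e.g. from Boruta's binomial tests) are already available as a list, and `select_n_features` plus the sample values are made up for illustration:

```python
# Hypothetical sketch: rank features by p-value and keep the n smallest.
# In a real integration the p-values would come from Boruta's per-feature
# statistical tests; here they are just example numbers.
def select_n_features(pvalues, n):
    """Return the indices of the n features with the smallest p-values."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    return sorted(order[:n])

pvalues = [0.90, 0.01, 0.30, 0.04, 0.75]
print(select_n_features(pvalues, 2))  # -> [1, 3]
```

The point is just that a full ordering by p-value makes an `n_features_to_select` parameter a simple truncation, instead of the coarse 1/2/3 ranking.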

@DreHar

DreHar commented Jan 26, 2021

I know this doesn't directly answer your question. When I want to minimize the number of features, I often run a feature-reduction step after the all-relevant feature selection. Use forward or backward stepwise feature elimination depending on whether you want to keep only a few features or drop only a few, respectively. I have also found that simulated annealing helps a lot in practice.

This might help in practice because highly correlated features will all have high p-values, so you might throw out features that are less statistically relevant but carry more orthogonal value.

Sorry for the tangent, but I thought it might help.
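To make the stepwise idea concrete, here is a toy sketch of greedy backward elimination, not any particular library's implementation. The `score` function is a stand-in for a cross-validated model score, and the feature names are invented for the example:

```python
# Toy sketch of greedy backward stepwise elimination.
# "score" stands in for a cross-validated model score on a feature subset.
def backward_eliminate(features, score, n_keep):
    """Greedily drop the feature whose removal hurts the score least."""
    selected = list(features)
    while len(selected) > n_keep:
        # Try removing each feature in turn; keep the best-scoring subset.
        candidates = [[f for f in selected if f != drop] for drop in selected]
        selected = max(candidates, key=score)
    return selected

# Example: the "score" here is simply how many useful features remain.
useful = {"age", "income"}
score = lambda subset: len(useful & set(subset))
print(backward_eliminate(["age", "income", "noise1", "noise2"], score, 2))
# -> ['age', 'income']
```

Forward selection is the mirror image: start from an empty set and greedily add the feature that improves the score most, which tends to be cheaper when you only want to keep a few features.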

@bgalvao bgalvao changed the title No n_features parameter No n_features_to_select parameter Jan 26, 2021