Feature importance is the most common tool for explaining a machine learning model. It is so popular that many data scientists end up believing that feature importance equals feature goodness.
It is not so.
When a feature is important, it merely means that the model found it useful in the training set. However, this says nothing about the ability of that feature to generalize to new data!
To account for that, we need to distinguish between two concepts:
- Prediction Contribution: the weight that a variable has in the predictions made by the model. This is determined by the patterns the model learned on the training set, and it is equivalent to feature importance.
- Error Contribution: the weight that a variable has in the errors made by the model on a holdout dataset. This is a better proxy for the feature's performance on new data. (A code sketch of both quantities follows below.)
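To make the distinction concrete, here is a minimal sketch in Python of how both quantities could be computed from SHAP values. The model, the synthetic dataset, and the use of individual log-loss as the error metric are all illustrative assumptions; the precise calculation is worked out later in the article.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative setup: a synthetic binary classification problem.
X, y = make_classification(n_samples=1_000, n_features=5, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(5)])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)

# SHAP values on the holdout set (in log-odds space for this model).
explainer = shap.TreeExplainer(model)
shap_values = pd.DataFrame(explainer.shap_values(X_test), columns=X.columns)
base_value = float(np.ravel(explainer.expected_value)[0])

# Prediction Contribution: mean absolute SHAP value per feature.
# This is the usual SHAP-based feature importance.
prediction_contribution = shap_values.abs().mean()

def individual_log_loss(y_true, proba, eps=1e-15):
    """Log-loss of each individual prediction (not averaged)."""
    proba = np.clip(proba, eps, 1 - eps)
    return -y_true * np.log(proba) - (1 - y_true) * np.log(1 - proba)

# Holdout loss of the full model, reconstructed from the SHAP decomposition.
log_odds = base_value + shap_values.sum(axis=1).values
loss_full = individual_log_loss(y_test, 1 / (1 + np.exp(-log_odds)))

# Error Contribution (one possible formulation): how much each feature's
# SHAP values change the individual holdout log-loss, compared with
# removing that feature's contribution from the log-odds.
error_contribution = {}
for col in X.columns:
    log_odds_wo = log_odds - shap_values[col].values
    loss_wo = individual_log_loss(y_test, 1 / (1 + np.exp(-log_odds_wo)))
    # Positive value: on average, the feature increases the holdout error.
    error_contribution[col] = (loss_full - loss_wo).mean()
error_contribution = pd.Series(error_contribution)
```

Subtracting a feature's SHAP values from the log-odds is meant to mimic a model that never learned that pattern, which is why the comparison happens in log-odds space rather than directly on probabilities.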
In this article, I will explain the logic behind the calculation of these two quantities for a classification model. I will also show an example in which using Error Contribution for feature selection leads to a far better result than using Prediction Contribution.
If you are more interested in regression than classification, you can read my previous article, “Your Features Are Important? It Doesn’t Mean They Are Good”.
- Starting from a toy example
- Which “error” should we use for classification models?
- How should we manage SHAP values in classification models?
- Computing “Prediction Contribution”
- Computing “Error Contribution”
- A real dataset example
- Proving that it works: Recursive Feature Elimination with “Error Contribution”
- Conclusions