
Mastering Logistic Regression: Unpacking WoE and IV Metrics for Variable Selection

   Apr 24, 2023     9 min read

Learn how to interpret Weight of Evidence (WoE) and Information Value (IV) for variable selection, predictive analysis, and logistic regression modeling.

Building a good classification model is not just about choosing algorithms. It’s also essential to understand how variables behave and how they relate to the event of interest.

That’s where Weight of Evidence (WoE) and Information Value (IV) come in. These two metrics are widely used in credit scoring, variable selection, and predictive modeling.

They help measure a variable’s discriminatory power and provide insights into the direction and strength of the relationship between predictors and the target variable.

In this article, you will learn:

  • how to interpret WoE and IV in practice;
  • how these metrics support the development of logistic regression models;
  • how to use WoE and IV for category grouping (binning);
  • how to transform variables into more interpretable and predictive features.

To make the concepts more intuitive, we’ll use practical examples based on the dataset from the Kaggle Titanic competition.

Metric Overview

In the previous post on how to calculate WoE and IV in Python, we explored how to build the functions responsible for computing these metrics. Now, we’ll focus on how to interpret the results and extract insights for logistic regression models.

These metrics help evaluate how each explanatory variable relates to the response variable.

The main metrics are:


Class Proportions (0 and 1)

Represents the distribution of the target variable within each segment of the analyzed feature. This metric helps you understand how events are distributed across categories.


Weight of Evidence (WoE)

Measures the discriminatory power of each category. The farther the WoE value is from zero, the stronger the segment’s ability to distinguish between classes.

In general:

WoE        Interpretation
WoE > 0    Higher relative concentration of non-events (target₀)
WoE ≈ 0    Similar distribution between classes
WoE < 0    Higher relative concentration of events (target₁)
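
The sign convention above corresponds to WoE = ln(share of non-events / share of events), where each share is the fraction of that class falling in the category. A minimal Python sketch of this, assuming a hypothetical helper name (`woe`) and made-up shares chosen only to show the three cases:

```python
import math

def woe(share_non_events: float, share_events: float) -> float:
    """WoE of one category: ln(share of target=0 / share of target=1).

    Each share is the fraction of that class falling in the category.
    Both shares must be positive, otherwise the logarithm is undefined.
    """
    if share_non_events <= 0 or share_events <= 0:
        raise ValueError("both shares must be positive")
    return math.log(share_non_events / share_events)

# Illustrative (made-up) shares, not taken from any dataset.
print(woe(0.60, 0.30))  # positive: non-events are over-represented
print(woe(0.25, 0.25))  # zero: both classes are equally represented
print(woe(0.10, 0.40))  # negative: events are over-represented
```

Some references define WoE with the ratio inverted, which flips every sign, so it is worth checking which convention a given tool uses before interpreting its output.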

Information Value (IV)

Measures the overall predictive power of the variable. The total IV is obtained by summing the contributions of each segment.

The table below presents a commonly used classification for interpreting the predictive power of a variable:

IV                  Interpretation
IV ≤ 0.02           Variable with no predictive power
0.02 < IV ≤ 0.10    Weak predictive power
0.10 < IV ≤ 0.30    Medium predictive power
0.30 < IV ≤ 0.50    Strong predictive power
IV > 0.50           Very strong predictive power (possible data leakage)

Extremely high IV values may indicate information leakage (data leakage), especially when the variable has a direct relationship with the target event.
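
To make the summation concrete: each segment contributes (share of non-events − share of events) × WoE, and the total IV is the sum of those contributions. The sketch below is only an illustration; the function names and the three-segment example are hypothetical, and the labels simply mirror the rule-of-thumb bands listed above.

```python
import math

def information_value(shares):
    """Sum (p0 - p1) * ln(p0 / p1) over (p0, p1) pairs, one pair per segment."""
    return sum((p0 - p1) * math.log(p0 / p1) for p0, p1 in shares)

def iv_band(iv: float) -> str:
    """Map a total IV to the rule-of-thumb bands used in this article."""
    if iv <= 0.02:
        return "no predictive power"
    if iv <= 0.10:
        return "weak"
    if iv <= 0.30:
        return "medium"
    if iv <= 0.50:
        return "strong"
    return "very strong (check for leakage)"

# Made-up variable with three segments: (share of non-events, share of events).
segments = [(0.50, 0.30), (0.30, 0.30), (0.20, 0.40)]
total_iv = information_value(segments)
print(round(total_iv, 4), iv_band(total_iv))
```

The thresholds are conventions rather than statistical tests, so the band should be read as a first screening signal, not a final verdict on the variable.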


It is important to note that rare categories may exhibit extreme WoE values but still contribute little to the total IV due to their low population representativeness.
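
A small numeric illustration of this point, using made-up shares: the rare category below has a WoE much farther from zero than the common one, yet its IV contribution is smaller because so few observations fall into it.

```python
import math

def iv_contribution(p0, p1):
    """Return (WoE, IV contribution) for one segment from its class shares."""
    w = math.log(p0 / p1)
    return w, (p0 - p1) * w

# Made-up shares (share of non-events, share of events).
rare_woe, rare_iv = iv_contribution(0.002, 0.010)    # rare category
common_woe, common_iv = iv_contribution(0.40, 0.25)  # common category

print(f"rare:   WoE={rare_woe:.3f}  IV={rare_iv:.4f}")
print(f"common: WoE={common_woe:.3f}  IV={common_iv:.4f}")
```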

These metrics are extremely useful across several modeling stages, especially in exploratory analysis, feature engineering, category grouping (binning), and variable selection. Ultimately, by correctly interpreting WoE and IV, it becomes easier to enhance the interpretability of logistic regression models and to build models that are more robust and better aligned with the underlying data behavior.


Creating Variables Through Category Grouping (Binning)

Category grouping (binning) is a widely used strategy for creating variables in predictive models, especially in logistic regression and credit scoring.

The idea is to combine categories or intervals that exhibit similar behavior with respect to the target variable, using metrics such as Weight of Evidence (WoE) and Information Value (IV) to support the analysis.

Done well, this approach tends to make the model more robust. Grouping simplifies variables and reduces the number of levels the model has to estimate, which lowers the risk of overfitting and makes the estimates more stable. In addition, binning improves model interpretability and, particularly when bins are encoded by their WoE values, helps establish a more linear relationship between the explanatory variables and the logistic regression logit.

However, some precautions are important during the grouping process:

  • grouped categories should exhibit similar behavior;
  • segments with very different WoE values should not be combined;
  • excessive grouping may significantly reduce the variable’s predictive power;
  • very rare categories may produce unstable WoE values.

This manipulation introduces a well‑known trade‑off: when categories are grouped, a natural reduction in Information Value (IV) occurs, since part of the variable’s original discriminatory power is smoothed out. Therefore, the main challenge in binning is to find the balance between model simplification, statistical stability, and preservation of relevant information. When done correctly, category grouping can significantly improve the robustness and generalization ability of the predictive model.
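
The trade-off can be sketched with a toy example. The shares below are made up, and `iv` and `merge` are hypothetical helpers: categories A and B behave similarly (both have positive WoE), while C behaves very differently. Merging A with B costs almost nothing, whereas merging B with C removes most of the variable's IV.

```python
import math

def iv(bins):
    """Total IV for a list of (share of non-events, share of events) pairs."""
    return sum((p0 - p1) * math.log(p0 / p1) for p0, p1 in bins)

def merge(bins, i, j):
    """Return a new list with bins i and j replaced by their combined shares."""
    merged = (bins[i][0] + bins[j][0], bins[i][1] + bins[j][1])
    rest = [b for k, b in enumerate(bins) if k not in (i, j)]
    return rest + [merged]

# Made-up shares for three categories A, B, C.
# A and B have similar WoE (both positive); C has a clearly negative WoE.
bins = [(0.40, 0.20), (0.35, 0.20), (0.25, 0.60)]

iv_full = iv(bins)
iv_ab = iv(merge(bins, 0, 1))  # similar behavior grouped together
iv_bc = iv(merge(bins, 1, 2))  # dissimilar behavior grouped together

print(f"original: {iv_full:.4f}")
print(f"A+B:      {iv_ab:.4f}")
print(f"B+C:      {iv_bc:.4f}")
```

Comparing the IV before and after each candidate merge is a simple way to check whether a grouping respects the precautions listed above.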

Practical Application

Now let’s put into practice the concepts of Weight of Evidence (WoE) and Information Value (IV) using the dataset from the Titanic - Machine Learning from Disaster competition.

Our goal will be to interpret the metrics, identify predictive variables, and understand how to use WoE and IV in the creation of new variables for logistic regression models.


Example with a Discrete Variable

When we apply the Woe_IV_Discrete function to the Sex variable, we obtain the following table:

Sex      target₀    target₁    Distr      WoE         IV         IV_total
female   0.147541   0.681287   0.216562   -1.529877   0.816565   1.341681
male     0.852459   0.318713   2.674688    0.983833   0.525116   1.341681

The interpretation of these results is quite intuitive:

  • the negative WoE value for female indicates a stronger association with survival (target₁);
  • the positive WoE value for male indicates a stronger association with non‑survival (target₀);
  • the IV_total = 1.34 indicates an extremely high discriminatory power for the Sex variable.
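
For readers who want to trace the numbers, the sketch below recomputes the table from raw counts. It is not the Woe_IV_Discrete function itself. The counts per sex and outcome are an assumption: they are not listed in this article and are taken from the Kaggle training set because they reproduce the proportions shown above.

```python
import math

# Counts per (Sex, Survived) in the Kaggle Titanic training set.
# Assumption: these counts are not listed in the article; they are used
# here because they reproduce the proportions shown in the table.
counts = {
    "female": {"non_event": 81, "event": 233},
    "male": {"non_event": 468, "event": 109},
}

total_non_events = sum(c["non_event"] for c in counts.values())
total_events = sum(c["event"] for c in counts.values())

iv_total = 0.0
for category, c in counts.items():
    p0 = c["non_event"] / total_non_events  # share of target=0 in this category
    p1 = c["event"] / total_events          # share of target=1 in this category
    woe = math.log(p0 / p1)                 # Distr = p0 / p1, WoE = ln(Distr)
    iv = (p0 - p1) * woe                    # contribution of this category
    iv_total += iv
    print(f"{category:<7} p0={p0:.6f} p1={p1:.6f} WoE={woe:.6f} IV={iv:.6f}")

print(f"IV_total={iv_total:.6f}")
```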

In credit scoring problems, variables with very high IV values typically require additional attention, as they may indicate excessive class separation or potential information leakage.


Example with a Continuous Variable

In addition to categorical variables, WoE and IV can also be applied to continuous variables after a discretization (binning) process. For the Fare variable, split into intervals, we obtain the following table:

Variable   Interval             target₀    target₁    Distr      WoE         IV
Fare       <= 7.55              0.143898   0.038012   3.785624    1.331211   0.14
Fare       7.55 – 7.8542        0.111111   0.076023   1.461538    0.379490   0.01
Fare       7.8542 – 8.05        0.158470   0.055556   2.852459    1.048181   0.11
Fare       8.05 – 10.5          0.109290   0.052632   2.076503    0.730685   0.04
Fare       10.5 – 14.4542       0.087432   0.105263   0.830601   -0.185606   0.00
Fare       14.4542 – 21.6792    0.092896   0.108187   0.858662   -0.152380   0.00
Fare       21.6792 – 27         0.078324   0.134503   0.582324   -0.540729   0.03
Fare       27 – 39.6875         0.103825   0.099415   1.044359    0.043403   0.00
Fare       39.6875 – 77.9583    0.076503   0.137427   0.556679   -0.585766   0.04
Total                           1.000000   1.000000   1.000000    0.000000   0.37

The Fare variable also shows strong predictive power (IV = 0.37).

In addition, the WoE analysis allows us to identify ranges with similar behavior, making it possible to group categories (binning).

Observe that:

  • ranges with WoE > 0 tend to be more associated with non‑survival;
  • ranges with WoE < 0 tend to be more associated with survival;
  • WoE values close to zero indicate neutral behavior.

Based on these results, we can create a binary variable indicating, for example, whether the fare is less than or equal to 10.5.
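
Before settling on that cutoff, it helps to see what the grouping costs. The sketch below takes the four intervals up to 10.5 from the table above, merges them into a single bin, and compares the merged IV contribution with the sum of the four separate ones. The helper name `contribution` is hypothetical.

```python
import math

# (share of non-events, share of events) for the four Fare intervals
# up to 10.5, copied from the table above.
low_fare_bins = [
    (0.143898, 0.038012),  # <= 7.55
    (0.111111, 0.076023),  # 7.55 - 7.8542
    (0.158470, 0.055556),  # 7.8542 - 8.05
    (0.109290, 0.052632),  # 8.05 - 10.5
]

def contribution(p0, p1):
    """IV contribution of one segment: (p0 - p1) * ln(p0 / p1)."""
    return (p0 - p1) * math.log(p0 / p1)

separate = sum(contribution(p0, p1) for p0, p1 in low_fare_bins)

# Merging the intervals adds up their shares before taking the logarithm.
merged_p0 = sum(p0 for p0, _ in low_fare_bins)
merged_p1 = sum(p1 for _, p1 in low_fare_bins)
merged_woe = math.log(merged_p0 / merged_p1)
merged = contribution(merged_p0, merged_p1)

print(f"four separate intervals: IV contribution = {separate:.4f}")
print(f"single bin (<= 10.5):    WoE = {merged_woe:.4f}, IV contribution = {merged:.4f}")
```

The merged bin keeps a positive WoE and retains most of the contribution of the four intervals, roughly 0.26 against 0.30, which is the kind of modest loss that usually makes a grouping acceptable.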


Creating New Variables

Based on the WoE and IV analysis, we can construct derived variables that make the model simpler and more interpretable.

Example:

PassengerId   Survived   Sex      Fare      FLG_female   FLG_Fare_leq_10.5
1             0          male     7.2500    0            1
2             1          female   71.2833   1            0
3             1          female   7.9250    1            1
891           0          male     7.7500    0            1

In this scenario, we create binary variables (flags) to simplify the information:

  • FLG_female identifies female passengers;
  • FLG_Fare_leq_10.5 identifies fares less than or equal to 10.5.
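
As a rough sketch, the two flags can be derived as follows. The rows are the four shown in the example table; the plain-dictionary representation and the `add_flags` helper are assumptions made to keep the example dependency-free, and in a real project the same comparisons would typically be applied to a whole DataFrame column at once.

```python
# The four rows shown in the example table above.
passengers = [
    {"PassengerId": 1, "Survived": 0, "Sex": "male", "Fare": 7.25},
    {"PassengerId": 2, "Survived": 1, "Sex": "female", "Fare": 71.2833},
    {"PassengerId": 3, "Survived": 1, "Sex": "female", "Fare": 7.925},
    {"PassengerId": 891, "Survived": 0, "Sex": "male", "Fare": 7.75},
]

FARE_CUTOFF = 10.5  # upper bound of the low-fare group chosen from the WoE table

def add_flags(row: dict) -> dict:
    """Return a copy of the row with the two binary flags added."""
    flagged = dict(row)
    flagged["FLG_female"] = int(row["Sex"] == "female")
    flagged["FLG_Fare_leq_10.5"] = int(row["Fare"] <= FARE_CUTOFF)
    return flagged

flagged_rows = [add_flags(r) for r in passengers]
for r in flagged_rows:
    print(r["PassengerId"], r["FLG_female"], r["FLG_Fare_leq_10.5"])
```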

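This type of transformation makes the model easier to interpret and tends to give more stable estimates, since each flag is backed by a larger group of observations. By smoothing out noise in the original values, it can also improve generalization. Because collapsing a variable into a single flag discards part of its predictive power, the effect on performance should be checked on validation data.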


Conclusion

The Weight of Evidence (WoE) and Information Value (IV) metrics are extremely useful tools for variable selection, exploratory analysis, category grouping (binning), creation of derived variables, and interpretation of logistic regression models.

In addition to contributing to more interpretable and robust models, WoE and IV make it possible to understand how variables behave across different population segments, supporting both attribute selection and feature engineering for binary classification problems.


Additional Resources

References:

  • Anderson, Raymond. The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management and Decision Automation. Oxford University Press, 2007.

  • Siddiqi, Naeem. Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. Wiley, 2006.

  • Sudarson Mothilal Thoppay (2015). woe: Computes Weight of Evidence and Information Values. R package version 0.2. https://CRAN.R-project.org/package=woe

  • Thilo Eichenberg (2018). woeBinning: Supervised Weight of Evidence Binning of Numeric Variables and Factors. R package version 0.1.6. https://CRAN.R-project.org/package=woeBinning