More General Weighted Binning

You might be wondering what motivates me spending countless weekend hours on the MOB package. The answer is plain and simple. It is users that are driving the development work.

After I published the wts_bin() function last week showing the impact of two-value weights on the monotonic binning outcome (https://statcompute.wordpress.com/2019/04/21/binning-with-weights), a question was asked if I can write a more general weighted binning function with weights being any positive value. The function wqtl_bin() is my answer (https://github.com/statcompute/MonotonicBinning/blob/master/code/wqtl_bin.R).

Below is an example demonstrating how to use the wqtl_bin() function. First of all, let’s apply the function to the case with two-value weights that was illustrated last week. As expected, statistics from both approaches are identical. In the second use case, let’s assume that weights can be any value under the Uniform distribution between 0 and 10. With positive random weights, all statistics have changed.

It is worth mentioning that, while binning rules can be the same with or without weights in some cases, it is not necessarily true in all situations, depending on the distribution of weights across the data sample. As shown in binning outcomes for “ltv” below, there are 7 bins without weights but only 5 with weights.