Practical Notes on GBM Modelling in Radar

27 October 2025
Neptune Jin article extended background


I’ve been working with GBM modelling in Radar for a while now, and thought it’d be useful to put together a walkthrough of the process.

Radar is a tool commonly used for insurance pricing, but I’ve noticed there isn’t much practical discussion about it. I’m not sure why — maybe it’s considered too niche or too routine to write about, or maybe most of the learning just happens quietly within teams. It could also be that some see it as proprietary and best kept internal.

Either way, I’ve found it helpful to write down the scattered bits I’ve figured out — the kind of things that surface gradually, and are easy to miss later if not captured. Partly for myself, and partly in case it’s useful to someone else trying to work out what Radar is actually doing under the hood.

I’ve been careful to keep this clear of anything sensitive — no data, no internal IP. Some points apply to GBM modelling more broadly, others are specific to how Radar behaves in practice.

GBM in Radar – Workflow

The typical flow of building a GBM model in Radar:

1. Set up the GBM Fitter by selecting your response, weight, and link structure (e.g., log-Poisson, logistic-binomial, log-Gamma/Tweedie).

2. Configure hyperparameters, fit, test, refit — repeat until you’re happy with the structure. (I have a separate note on Foundational Tips for Hyper-parameter Tuning.)

3. Review evaluation metrics to identify the best number of rounds.

4. Once satisfied, click “Apply GBM”. This generates one or more GBM model components depending on whether cross-validation is enabled.

5. If CV is on, take the average of the ensemble models and pass that on to downstream steps.

Important: The predictions you should use are from the applied GBM models — not from the fitter. More on that below.

Selecting and Controlling Variables via the Factor Tab

The GBM Fitter interface includes several key tabs: Variable, Factor, Factor Importance, and (if using Layered GBM) Factor Importance (Sunburst).

Here’s the catch: Just because a variable shows up in the Variable tab doesn’t mean it will be used.

You need to tick the box beside it in the Factor tab for it to be included in model training. This two-step setup is easy to forget — and when you do, you’ll wonder why your enriched or engineered variable isn’t picked up.

It’s easy to overlook because, for most components in Radar, appearing in the Variable panel automatically means a variable is included and used by that component. This one is different.

That said, the design is sensible. It gives you control over which variables should be active in the fitting. In cases of collinearity or overlapping predictive power, you might deliberately untick one.

So while it might seem annoying, it’s useful.

How and Why to Use Monotonic Constraints

When you tick a variable in the Factor tab, you can also define how it’s treated: categorical, ordered, monotonically increasing, or monotonically decreasing.

Radar is usually good at picking up categorical and ordered treatments if your bands are clean. But for continuous variables, it’s worth pausing to inspect the monotonic options.

If you already know the expected direction of the trend (e.g. NCD, age), apply a monotonic constraint. This stops GBM from picking up noise and generating statistically valid but illogical splits.

GBM is inherently less interpretable than GLM, so building in this kind of control helps you understand what your model is doing later.
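Radar enforces the constraint during fitting, so there is nothing to code there, but it can be worth sanity-checking a fitted curve afterwards. A minimal stdlib Python sketch of such a check (the prediction values below are made up for illustration):

```python
def is_monotone(values, direction="increasing"):
    """Check that a sequence of predictions moves in one direction.

    Useful as a sanity check on a fitted curve, e.g. predicted
    claim frequency across ordered NCD bands.
    """
    pairs = list(zip(values, values[1:]))
    if direction == "increasing":
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)

# Illustrative predicted frequencies across increasing NCD years:
ncd_curve = [0.120, 0.105, 0.098, 0.097, 0.094]
print(is_monotone(ncd_curve, "decreasing"))  # True: risk falls as NCD rises
```

If a curve that should be monotone is not, that is usually a sign the constraint was never ticked in the Factor tab.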

Understanding GBM Starting Values: Constant vs. GLM Response

There’s a subtle but important difference between setting the GBM’s start value to:

• A constant equal to the average of the GLM predictions

• The full GLM predictions (row by row)

The first one helps the GBM start in a sensible place, but it’s still learning from scratch. A GBM built this way is still considered a ground-up GBM.

The second one benefits from the explanatory power of the GLM in each variable: whatever the GBM learns is the residual information the GLM left out. This is a boosted GBM on top of a GLM. It’s a different structure, and when interpreting the results, especially for proof-of-concept tests or residual diagnostics, that difference matters.

It is not uncommon to see a GBM wrap up after only 2-3 rounds of tree fitting when it is built on top of a GLM: most of the information has already been picked up by the GLM, so the GBM ends early because it has little left to add.
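The difference is easy to see on the log-link relationship Predicted Value = Start Value × exp(Score). A toy single-row Python sketch (all numbers are illustrative, not from any real model):

```python
import math

# For a log-link model: Predicted Value = Start Value * exp(GBM Score)
target = 0.20            # the frequency the model should ultimately reach

# Ground-up GBM: constant start at the portfolio average
constant_start = 0.10
ground_up_score = math.log(target / constant_start)

# Boosted GBM: start from the GLM's row-level prediction
glm_prediction = 0.18    # the GLM already explains most of the signal
boosted_score = math.log(target / glm_prediction)

print(round(ground_up_score, 3))  # 0.693 -- a lot left to learn
print(round(boosted_score, 3))    # 0.105 -- only the residual remains
```

The boosted score is small precisely because the GLM start value has already done most of the work, which is why the boosted fit converges in so few rounds.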

A general warning: averages are dangerous.

They often mask structure. Whenever you find yourself using an average — especially one passed into a model — pause and check what assumptions it hides.

Why Cross-Validation Should Always Be On

By default, GBM (especially under a log-Poisson link) evaluates performance using negative log-likelihood (NLL).

NLL is a measure of how well your predicted distribution fits the observed data — lower is better.
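To make the metric concrete, here is a stdlib Python sketch of a Poisson NLL, dropping the log(y!) term that does not depend on the model. Radar’s exact formula may differ (weights, constants), so treat this as the shape of the idea, not its implementation:

```python
import math

def poisson_nll(observed, predicted):
    """Average Poisson negative log-likelihood per row, ignoring the
    model-independent log(y!) term. Lower is better."""
    n = len(observed)
    return sum(mu - y * math.log(mu) for y, mu in zip(observed, predicted)) / n

claims = [0, 1, 0, 2, 0]                  # illustrative claim counts
good_fit = [0.1, 0.9, 0.2, 1.8, 0.1]      # predictions close to the data
poor_fit = [0.5, 0.5, 0.5, 0.5, 0.5]      # a flat, uninformative prediction
print(poisson_nll(claims, good_fit) < poisson_nll(claims, poor_fit))  # True
```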

Here’s the problem: in-sample NLL always decreases as more rounds are added, even if the model is just fitting noise. You could go to 500 rounds and get a beautiful in-sample loss curve, but it tells you nothing about generalisation.

That’s why cross-validation must be on: it gives you access to the out-of-sample NLL, which reflects real predictive performance.

Once this OOS NLL curve:

• Flattens

• Or starts increasing

you know the model has passed its optimal complexity. That round is your true sweet spot.

To use CV, you need a randomiser variable. Set it up with the number of folds you want (I usually use five), band it, and link it to the GBM Fitter.
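Outside Radar, the same banded-randomiser idea looks like this (a stdlib Python sketch; the fold count and seed are arbitrary choices, not Radar defaults):

```python
import random

def assign_folds(n_rows, n_folds=5, seed=42):
    """Mimic a banded randomiser: draw a uniform number per row and
    band it into equal-width buckets, giving a fold label 1..n_folds."""
    rng = random.Random(seed)
    return [int(rng.random() * n_folds) + 1 for _ in range(n_rows)]

folds = assign_folds(10_000, n_folds=5)
print(sorted(set(folds)))  # [1, 2, 3, 4, 5]
```

Fixing the seed matters: if the randomiser changes between fits, fold-level results stop being comparable across runs.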

Truncating to the Optimal Number of Rounds

Radar does not stop training when it hits the best number of rounds. If you set 200, it runs all 200, even if OOS performance peaked at round 153.

Well, to be fair, it could not know the 153rd round is the best unless it finished all 200.

This means you must truncate manually:

1. Check where OOS NLL is lowest (e.g. round 153)

2. Go back to the fitter and change the maximum rounds to 153

3. Refit and re-apply

Now your applied models will be trained only up to that round, avoiding unnecessary complexity and overfitting.

This step is easy to skip — but critical if you’re trying to generalise well.
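Step 1 is just an argmin over the OOS loss curve. A small Python sketch with an illustrative curve (the numbers are made up):

```python
def best_round(oos_nll):
    """Return the 1-based round with the lowest out-of-sample NLL."""
    return min(range(len(oos_nll)), key=oos_nll.__getitem__) + 1

# Illustrative OOS NLL by round: improves, flattens, then overfits
curve = [0.95, 0.80, 0.72, 0.70, 0.69, 0.69, 0.71, 0.74]
print(best_round(curve))  # 5
```

In Radar the curve comes from the evaluation metrics view; the point is simply that the truncation target is the minimum of the out-of-sample curve, not the in-sample one.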

Averaging Applied GBMs to Get the Final Prediction

If CV is on and you apply the GBM, Radar gives you one applied model per fold. That means you get five (or however many folds you set).

You need to:

• Take the Predicted Value from each model

• Compute the row-wise average

Easiest way is to use a Formula component. Add them up, divide by 5, and use that value as your final model output. This becomes your ensemble prediction — generalised, balanced, and properly validated.

(There are some more advanced components that can do the average and other aggregations for you, but like I said, taking average could be dangerous. If possible, it’s always preferred to define the calculation yourself.)
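Spelled out, the Formula-component calculation above is a plain row-wise mean. A stdlib Python sketch with illustrative fold outputs:

```python
def ensemble_average(fold_predictions):
    """Row-wise mean across per-fold prediction columns -- the same
    calculation a Formula component would spell out explicitly."""
    n_folds = len(fold_predictions)
    return [sum(row) / n_folds for row in zip(*fold_predictions)]

# One list of predicted values per applied fold model (made-up numbers):
folds = [
    [0.10, 0.40],  # fold 1
    [0.12, 0.38],  # fold 2
    [0.11, 0.42],  # fold 3
    [0.09, 0.41],  # fold 4
    [0.13, 0.39],  # fold 5
]
print([round(v, 4) for v in ensemble_average(folds)])  # [0.11, 0.4]
```

Writing the sum-and-divide yourself, as with the Formula component, keeps the calculation fully visible instead of hiding it behind an aggregation step.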

Why You Should Never Use the Fitter’s Prediction

This is a common pitfall. The GBM Fitter does offer predicted outputs (Predicted Value.Output), but these should only be used for diagnostics.

When cross-validation is on, the fitter doesn’t represent any one real model. Its predictions might come from a model trained on the full dataset (used internally for reporting), or just one fold, or something else entirely.

I have tried to check which data segment it uses to generate those predictions, but I could not pin it down. Suffice it to say it won’t match any applied model or the ensemble.

So treat it only as a reference: use it to see where performance peaks and to check that the predictions are not too far from expectation.

But use applied GBM predictions as the output piped into next steps.

The official Radar documentation says the same: don’t use the fitter’s predicted values for scoring or pricing. Always take output from the applied models.

Understanding Radar Outputs: Score vs. Prediction

An applied GBM model gives you more than one output. For example:

• Score.Output: the internal prediction on the log scale

• Predicted Value.Output: the final value on the original scale

For log-Poisson models: Predicted Value = Start Value × exp(Score)

That Start Value might be a constant or a prior model (like a GLM). The log-scale prediction (Score) is the pure GBM result, while Predicted Value transforms it for interpretation.
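A quick numerical check of that relationship, with illustrative numbers (as noted further down, the exact formula can vary with the software version and error structure):

```python
import math

# Log-Poisson case: Predicted Value = Start Value * exp(Score)
start_value = 0.10   # constant start, e.g. a portfolio-average frequency
score = 0.25         # GBM output on the log scale

predicted_value = start_value * math.exp(score)
print(round(predicted_value, 4))  # 0.1284

# A positive score pushes the prediction above the start value,
# a negative score pulls it below; a score of 0 leaves it unchanged.
```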

Use the right one:

• Use Predicted Value when scoring or pricing

• Use Score when debugging or checking what the GBM has learned

The meaning and formula could vary case by case, depending on the version of the software and the error structure you chose for the model, among other things.

When unsure, go to the Formula tab — it shows exactly how predictions are calculated and how the features relate to one another.

This tab – Formula – is basic but invaluable. Sometimes Radar’s naming conventions can be misleading. Don’t assume; assumptions can go very wrong. Always check the tab to confirm what the outputs mean.

Tree Inspection, Formula Tab, and Naming Confusion

You can inspect tree structures in the Applied GBM.

It’s unlikely that you would go through the full 200 or 500 trees, but a scan over the first 10–20 is usually enough to understand how your model splits.

Sometimes this can surface useful insights — like where age, credit score, or certain code/type variables break into segments. The split point could be unexpectedly useful.

There’s a tree panel in the GBM Fitter as well, but the diagrams are not available in the fitter when CV is on. Since CV should always be on, they are effectively never there; another reason to work from the applied models.

Open Questions for Future

• What’s the best GBM setup to assess the added value of new variables? What is the best sequence for setting hyper-parameters (from “forgiving” ones to “strict” ones)?

• When is boosted GBM better than ground-up GBM, for proof-of-concept? For actual modelling?

• Layered GBM for interaction explorations

• A deeper dive on performance metrics: log-likelihood vs. MAE vs. RMSE

• Build and interpret PDP graphs

• Model refresh: any clever tricks for EDA? An SOP for targeted improvement?

• Why averages can be misleading in model logic

• Cross-validation vs. train/test split in pricing contexts

By Neptune Jin, Insurance Pricing specialist at Aioi Nissay Dowa Europe


#GBM #MachineLearning #Modelling
