Churn

Churn prediction is one of the core components within Onesurance's Top Defend. This module, built from multiple machine learning models working together, aims to predict which customers are at increased risk of canceling their insurance policies over the next 12 months. By identifying these customers in a timely manner, it becomes possible for insurance organizations to take targeted actions to promote customer retention and improve portfolio quality.

Please note that terminology differs by country: 'churn' is the common term in the Netherlands, while in Belgium both 'termination' and 'churn' are used. This documentation uses churn as the umbrella term, meaning the partial or complete termination of a customer relationship within the insurance portfolio.

Application of the model

The churn score produced by this model is used within Top Defend, where it is an integral part of prioritizing customer actions. In addition, the churn score is combined with Customer Lifetime Value (CLV) to divide customers into meaningful segments, such as loyal growth opportunities, critical customers with a high termination risk, or structurally loss-making relationships.
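As a rough illustration of how a churn score and CLV could be combined into segments, the sketch below uses two cut-off values. The thresholds, the function name `segment` and the exact segment rules are assumptions made for this example; they are not the rules used by Top Defend.

```python
# Hypothetical cut-offs, chosen only for illustration.
CHURN_THRESHOLD = 0.5  # churn score at or above this counts as high risk
CLV_THRESHOLD = 0.0    # CLV at or below this counts as loss-making


def segment(churn_score: float, clv: float) -> str:
    """Assign an illustrative segment from a churn score (0 to 1) and a CLV."""
    if clv <= CLV_THRESHOLD:
        # The relationship costs more than it yields, regardless of risk.
        return "structurally loss-making relationship"
    if churn_score >= CHURN_THRESHOLD:
        # Valuable customer who is likely to leave: highest priority.
        return "critical customer with high termination risk"
    # Valuable customer who is likely to stay.
    return "loyal growth opportunity"


if __name__ == "__main__":
    for score, clv in [(0.8, 1200.0), (0.1, 900.0), (0.3, -150.0)]:
        print(score, clv, "->", segment(score, clv))
```

In practice the boundaries would be tuned per organization; the point is only that the two scores together, rather than either one alone, determine the segment.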

Definition of churn

Good churn prediction starts with a clear and relevant definition of churn. In the context of non-life insurance, several forms of termination can be considered churn. Some possible definitions are:

Canceling one policy
Canceling all policies within one main line of business
Canceling all non-life policies
Canceling all essential policies (default setting)
Canceling all policies within the customer relationship

The standard approach within Onesurance focuses on canceling all essential policies, as this typically indicates termination of the entire customer relationship. What is considered essential is adjustable in consultation with the Customer Success Manager.

Lines of business considered essential by default:

Motor insurance
Legal expenses insurance
Liability insurance
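The default definition can be expressed as a simple labeling rule. The sketch below is a minimal illustration, assuming a hypothetical policy record with a `line` and an `active` field; the field names, the line identifiers and the treatment of customers who never held an essential policy are assumptions, not part of the Onesurance data model.

```python
# Essential lines of business, following the default list above.
# The identifiers themselves are hypothetical.
ESSENTIAL_LINES = {"motor", "legal_expenses", "liability"}


def has_churned(policies: list[dict]) -> bool:
    """Return True if every essential policy of a customer has been canceled.

    Each policy is a dict with a 'line' (str) and an 'active' (bool) key.
    """
    essential = [p for p in policies if p["line"] in ESSENTIAL_LINES]
    if not essential:
        # Assumption: a customer without essential policies is not
        # labeled as churned under this definition.
        return False
    return not any(p["active"] for p in essential)


if __name__ == "__main__":
    customer = [
        {"line": "motor", "active": False},
        {"line": "liability", "active": False},
        {"line": "travel", "active": True},  # non-essential, still active
    ]
    print(has_churned(customer))
```

Under this rule the example customer counts as churned even though a non-essential travel policy is still active, which matches the idea that losing all essential policies signals the end of the relationship.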

Data usage and variables

For training the churn models, dozens of variables are created based on the available data. These features are divided into the following categories:

Policy information (type, term, premium structure)
Customer data (age, place of residence, customer tenure)
Claims data (frequency, size, claims behavior)
Contact moments (number of interactions, customer inquiries, policy changes)
External data, such as demographic and geographic data

The exact number of variables used varies by organization and depends on both data completeness and the predictive power of the characteristics within the historical dataset.

Model training and validation

The churn models are trained on data from previous calendar years, except for the most recent year, which is held out for model validation. For example, for a dataset from 2020 to 2025, the model is trained on data from 2020 to 2023 and validated on the year 2024.

During validation, the model makes "blind" predictions for the validation year, without access to the outcomes. Performance is then measured by comparing the predictions with the terminations that actually occurred.
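The year-based split from the example can be sketched as follows. This is a minimal illustration, assuming each record carries a hypothetical `year` field; the function name `split_by_year` is likewise an assumption and says nothing about how the split is implemented internally.

```python
def split_by_year(rows: list[dict], validation_year: int):
    """Split records into a training set and a validation set by calendar year.

    Records from years before the validation year are used for training;
    records from the validation year itself are held out. Later years are
    used for neither.
    """
    train = [r for r in rows if r["year"] < validation_year]
    validation = [r for r in rows if r["year"] == validation_year]
    return train, validation


if __name__ == "__main__":
    # One dummy record per year for a dataset covering 2020 to 2025.
    rows = [{"year": y, "customer_id": i} for i, y in enumerate(range(2020, 2026))]
    train, validation = split_by_year(rows, validation_year=2024)
    print(sorted({r["year"] for r in train}))       # training years
    print(sorted({r["year"] for r in validation}))  # validation year
```

Splitting by time rather than at random keeps the validation honest: the model is only evaluated on a period it has never seen, which mirrors how it is used in production.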

Multiple models are trained and optimized for each main line of business. These are then combined to produce an accurate, personalized churn score per customer relationship.