Cash Insurance: AI‑Driven Pricing & Actuarial Risk

3.1. Project Context

Cash Insurance is a health plan operator (a FNAT Business Case) facing a severe actuarial imbalance and cash flow crisis. The standard pricing model was locked into static tables that did not reflect the policyholder's real risk.

This model created a scenario of financial cross‑subsidization. In practice, younger, healthier clients (the profitability engine of the portfolio) were paying disproportionately high premiums to cover the deficit generated by high‑risk groups, whose medical costs exceeded collected revenue. Before this dashboard, the board was operating blindly using disconnected spreadsheets, watching margins melt without being able to isolate the financial “drains.”

3.2. Project Objectives

The project aimed to build an analytical intelligence pipeline to stop margin loss, focusing on:

Main Objective: Eliminate the accounting imbalance by recalibrating premiums, with the non‑negotiable goal of guaranteeing exactly 75% Target Loss Ratio across all segments.
Predictive Centralization: Use Artificial Intelligence to mathematically group the risk profiles of the active base.
Executive Visibility: Translate statistical algorithms into an automated managerial Income Statement (P&L) via Power BI.

3.3. Data and Modeling

Data was extracted from the billing ERP and claims cost reports (medical expenses). The process was conducted at contract/beneficiary granularity, covering a 12‑month usage period.

We used Python to run the unsupervised K‑Means algorithm, clustering the portfolio based on Age, BMI, and Medical Cost. After applying the Elbow Method, we defined 4 optimal clusters. The enriched base was stored in a PostgreSQL relational database.

Python

from sklearn.cluster import KMeans
# Mapping actuarial risk
# Defining 4 clusters based on Elbow Method analysis
kmeans = KMeans(n_clusters=4, random_state=42)
df['Cluster_Risco'] = kmeans.fit_predict(df[['Idade', 'IMC', 'Custo_Assistencial']])

# Financial logic for accounting rebalancing
# The New Premium must guarantee exactly the profitability required by the board
target_loss_ratio = 0.75
df['Novo_Premio_Calculado'] = df['Custo_Assistencial'] / target_loss_ratio

In Power BI, a Star Schema model was built. Advanced DAX measures were developed to process the dynamic P&L recalculation (calculating Projected Revenue = SUMX(Table, [Novo_Premio_Calculado]) and Loss Ratio = DIVIDE([Total Costs], [Projected Revenue])).

3.4. Dashboard Overview

The dashboard was designed with a C‑level decision‑making focus, divided into two fundamental views:

Risk & Dispersion View: A scatter plot crossing Age and Medical Cost, visually illustrating the 4 clusters created by K‑Means. Next to it, a column chart compares Current vs. Projected Loss Ratio.
P&L and Financial Matrices: A dedicated Managerial P&L page. Two matrices compare Revenue, Cost, and Operating Margin side by side (Old Scenario vs. AI‑Driven Scenario).

3.5. Business Insights

The algorithmic analysis revealed that Cluster 1 (composed mostly of older policyholders with high BMI) operated with a loss ratio above 120%, bleeding the company’s cash. Clusters 2 and 3 (healthy young individuals) had a loss ratio of 40%, indicating the company was overcharging to retain this group and risking losing them to competitors. The identified risk was portfolio collapse if the healthy base opted for churn (cancellation).

3.6. Accounting & Financial Perspective

In advanced data projects, the statistical model only proves its effectiveness if it generates cash results. The development of the parametrized Managerial Income Statement (P&L) in DAX was crucial.

The Projected P&L validated the thesis: by applying the new premiums, the operating loss in the risk cluster was eliminated, and the Operating Loss Ratio hit exactly 75.00% across 100% of policyholder profiles. The model helped managerial accounting move from a “firefighting” logic to precise revenue and provision predictability.

3.7. Legal, Governance & Compliance Perspective

Health pricing attracts strong regulatory scrutiny from Brazil’s ANS (National Health Agency) and the Elderly Statute. The use of Artificial Intelligence for adjustments requires ethical governance to avoid abusive algorithmic discrimination.

Note: The analyses and segmentations proposed in this case are analytical interpretations of business intelligence and corporate management support. They DO NOT constitute formal legal advice.

The project operated in compliance with LGPD (Brazil’s data protection law), ensuring that all K‑Means data processing was strictly anonymized, shielding the company against risks of fines for leaking sensitive health data (PHI).

3.8. Data Quality and Limitations

Although the database offers perfect granularity by anonymized CPF (Brazilian tax ID), the main analytical limitation of the K‑Means model is its unsupervised nature focused on the present (a static snapshot). It perfectly segments costs that have already occurred, but requires retraining routines and new loads via PostgreSQL to accommodate medical inflation (VCMH) or the natural aging of the active base over the coming years.

3.9. Results and Executive Impact

Margin Leveling: Complete eradication of the operating deficit in high‑risk segments, ensuring net profit predictability (target 75%).
Retention Optimization: The smart recalculation reduced the premium burden on the young portfolio, preventing churn and shielding the base from competitors.
Managerial Automation: Consolidation via Views in PostgreSQL and ODBC integration with Power BI drastically reduced the manual hours spent by the controllership closing spreadsheets at month‑end.

3.10. Project Tech Stack

The technical architecture of this project combined engineering robustness and visualization:

Power BI: Star Schema modeling, Data Storytelling, executive conditional formatting.
DAX Language: Financial intelligence calculations and parameterized P&L.
Python (Scikit‑Learn, Pandas): Scripting for data cleaning and predictive clustering via the K‑Means algorithm.
SQL & PostgreSQL: Creation of mathematical Views for server‑side premium recalculation.
ODBC: Direct data integration between the relational database and the semantic layer.

3.11. Learnings and Next Steps

From a strategic perspective, the main learning was validating that complex mathematical rules only gain board buy‑in when translated into accounting language. On the technical side, the next evolutionary step for Cash Insurance will be to integrate supervised models (e.g., Random Forest) capable of predicting the exact probability that a new contract will become deficit‑prone before the policy is even issued, protecting the operator’s “front door.”

3.12. Conclusion

Cash Insurance exemplifies the true essence of advanced Analytics. The value of a Senior Data Analyst goes beyond simply creating beautiful charts; it demands financial savvy, accounting precision, and perception of corporate legal risks.

By integrating Python, PostgreSQL, and Power BI, we moved from a scenario of margin collapse due to blind cross‑subsidies to a sustainable, fair, and predictable ecosystem. This case crowns the perfect intersection between data science and business intelligence.

Cash Insurance: AI Applied to Pricing and Actuarial Risk

How I developed a predictive AI engine to resegment a health insurance portfolio, eliminating cross-subsidization and hitting a strict 75% target loss ratio via Power BI.

Executive Dashboard: Financial Impact by Risk Cluster

Tech Stack Used in this Project