1. Business Context
Insight Academy, a digital education platform (EdTech), faced a dangerous paradox. Although the company recorded high sales volumes and boasted a consolidated base of 45,815 students, the board noticed chronic low engagement and struggled to extract Lifetime Value (LTV).
The core issue lay in the generic way the user base was analyzed. The managerial belief in the "average student" masked true user behavior: high initial revenue created a false sense of Customer Success. There was a glaring lack of data intelligence to cross-reference financial friction levels (PIX—Brazil's instant payment system—vs. Credit Card) with platform adoption rates.
2. Project Objectives
- Discover Mathematical Personas: Use unsupervised algorithms to cluster customers based on actual behavior, not marketing guesswork.
- Mitigate Churn Risk: Identify and financially quantify inactive customers, classifying their urgency level.
- Direct CS (Customer Success) Actions: Create a daily tactical tool so the team knows exactly who to contact and when, in an automated and bilingual format.
3. Data & Machine Learning Modeling
The analytical pipeline began with an audit of 27 transactional and behavioral variables. Using Python and the Scikit-Learn library, I performed Feature Engineering, narrowing it down to 5 critical indicators.
The foundational step was applying the StandardScaler. Since spatial distance algorithms are highly sensitive to variables of varying scales, it was imperative to level the weight of each metric (so that 400 "Inactive Days" wouldn't crush the weight of 12 "Installments"). Next, the K-Means model was processed. The optimal number of profiles (K=4) was mathematically proven by the Elbow Method and validated by the Silhouette Score.
With the clusters generated, we migrated to Power BI using a Star Schema: a Fact table (f_Clientes) connected to Dimension tables (d_Clusters, d_Pagamento), allowing for analytical flexibility and high performance.
4. Dashboard Overview
The dashboard was designed with an executive focus (Dark Mode with a glassmorphism effect) and is structured into 6 bilingual pages (Portuguese/English), navigable via a dynamic menu:
- Methodology & Stack (Pg 3): Documents the scientific rigor, showcasing the mathematical validation chart (Elbow Method) and the data pipeline.
- Executive Overview (Pg 4): C-Level focus. Scorecards showing Total Students (45,815), Average Logins, and 100% stacked charts crossing Cluster vs. Payment Method.
- CS Action Center (Pg 5): The tactical operational panel. Features a scatter plot (Engagement vs. Inactivity) and a Matrix with red conditional formatting to instantly highlight customers with over 90 days of inactivity.
5. Business Insights
The algorithm revealed 4 exact personas: Impulse Buyers (46.7%), Ghost Subscribers (40.5%), Installment PIX Users (8.5%), and Ambassadors (4.2%).
The analysis highlighted that Installment PIX Users show slightly higher engagement compared to Credit Card users. The absence of a tied-up credit limit creates a psychologically lighter purchasing journey, subtly reflected in access retention.
6. Accounting & Financial Perspective
From a financial standpoint, we focused on distinguishing between present billing and future revenue.
We identified 9,626 recurring subscribers at critical risk (>90 inactive days). This creates a "Hidden Liability": the current P&L is inflated by automatic monthly credit card charges, but inactivity guarantees mass churn at the renewal cycle, requiring a strong adjustment in future cash provisions.
7. Legal, Risk & Compliance Perspective
From a consumer regulatory risk and compliance perspective, a high volume of "Ghost Subscribers" generates a potential liability for chargebacks or legal disputes claiming lack of service provision or improper billing, even with signed contracts.
Legal Analytics Note: While this dashboard is an analytical interpretation and not legal advice, it serves as an early warning system. The operational matrix on Page 5 allows the company to execute proactive cancellation or win-back actions, reducing brand exposure to consumer protection agencies or litigation. All data was handled in strict compliance with LGPD (Brazil's data protection law) and GDPR standards.
8. Bilingual Analytical Engine in DAX
To ensure fluid language switching (PT/EN) without duplicating visuals or forcing performance-heavy calculations, I built Field Parameters in Power BI. Below is an example of how I structured the Risk metrics (Hidden Liability) tied to the language selection:
// Parameterized calculation of Hidden Liability ensuring bilingual adaptation and accounting precision
Titulo_Card_Passivo =
VAR IdiomaSelecionado = SELECTEDVALUE('Idioma_Cluster'[Idioma_Cluster]; "🇧🇷 Português")
RETURN
IF(
IdiomaSelecionado = "🇺🇸 English";
"Hidden Liability (Students)";
"Passivo Oculto (Alunos)"
)
Passivo Oculto (Alunos) =
CALCULATE(
[Total de Alunos];
'f_Clientes'[Recorrente] = 1;
'f_Clientes'[Dias sem Acessar] >= 90;
'd_Clusters'[Cluster_ID] = 1 // Refers specifically to the Ghost Subscribers group
)
9. Results & Next Steps
Structuring this dashboard shifted the company's CS culture. Instead of making random, volume-based calls, the team now focuses their energy on the first 15 days of the "Impulse Buyers" cluster, mathematically preventing their migration into the critical inactivity zone.
Model Deployment: The project's next technological step is productization (Deployment via API). The goal is to couple the K-Means algorithm directly to the CRM, instantly classifying students during onboarding and triggering automated retention workflows.
10. Conclusion
The execution of this case study proves that high-level data projects go far beyond aesthetically pleasing dashboards. By fusing Machine Learning, financial business rules (P&L forecasting), and analytical UI/UX design, we delivered a real-time governance tool that protects the company's cash health and mitigates scaling risks.