Customer Intelligence & Segmentation Platform
A full-stack analytics application turning 8.5M+ transactions into automated marketing intelligence.
The Strategic Problem
The Context
"The business struggled to operationalize segmentation. Static dashboarding wasn't enough; marketers needed the ability to spin up clustering models dynamically without waiting for Data Science."
Marketing teams were dependent on data requests for every campaign. We needed an in-memory analytics engine that could perform live K-Means clustering and Market Basket Analysis without exposing code.
- The Trade-off Map: Operational overhead was the binding constraint, so practical reliability mattered more than pure theoretical accuracy.
- Constraint 01: Non-Technical End Users: The UI had to completely abstract away algorithm tuning, evaluating Silhouette Scores strictly behind the scenes.
- Constraint 02: CPU-Aware Processing: Interactive Streamlit apps choke on big data, so we pushed heavy aggregations down to ClickHouse and kept only the ML feature arrays in memory.
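As a rough illustration of the push-down idea in Constraint 02, the sketch below builds a ClickHouse query that collapses raw receipts into one RFM row per customer, so only the small aggregated result would need to reach the app. The table name `transactions` and the columns `customer_id`, `receipt_id`, `order_date` and `amount` are assumptions made for this example, not the project's actual schema.

```python
def build_rfm_query(table: str = "transactions") -> str:
    """Return a ClickHouse query aggregating receipts to one RFM row per customer.

    Table and column names are hypothetical placeholders.
    """
    # Accept only a plain identifier so the table name cannot inject SQL.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unsafe table name: {table!r}")
    return f"""
        SELECT
            customer_id,
            dateDiff('day', max(order_date), today()) AS recency_days,
            uniqExact(receipt_id)                     AS frequency,
            sum(amount)                               AS monetary
        FROM {table}
        GROUP BY customer_id
    """


rfm_sql = build_rfm_query()
print(rfm_sql)
```

The database does the scan and the grouping; the Python side only ever holds three numeric columns per customer, which is the shape a clustering model expects.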
The Diamond Centerpiece
Technical Rationale
Core Approach
Built a stateful Streamlit application wrapping Scikit-learn and Apriori modeling, tapping directly into ClickHouse to aggregate millions of receipts on the fly.
Outcome
Provided an interactive A/B testing simulator and ABC-XYZ product matrix, shifting the business from reactive reporting to proactive strategy.
Engine
ClickHouse columnar database for instantly aggregating 8.5M+ retail interactions.
Modeling
Scikit-learn K-Means on RFM (Recency, Frequency, Monetary) features, plus Apriori association-rule mining for Market Basket Analysis.
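To make the Market Basket metrics concrete, here is a minimal pure-Python sketch that computes support and lift for item pairs on a toy set of baskets. It is a brute-force pair count rather than the Apriori candidate-pruning algorithm itself, and the function name `pair_rules`, the `min_support` default and the sample baskets are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.2):
    """Return {(a, b): (support, lift)} for item pairs meeting min_support."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate items within a receipt
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of receipts containing both items
        if support < min_support:
            continue
        # lift > 1: the pair co-occurs more often than independence predicts
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        rules[(a, b)] = (support, lift)
    return rules


baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
]
rules = pair_rules(baskets)
for pair, (support, lift) in sorted(rules.items()):
    print(pair, round(support, 2), round(lift, 2))
```

In this toy data, bread and butter appear together in half the receipts with a lift of about 1.33, which is the kind of pair a bundling recommendation would surface.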
Quantitative Validation
Cohort Automation: Replaces multi-day ad-hoc SQL requests with instant, UI-driven segment discovery.
Bundle Optimization: Identifies specific cross-selling pairs (Support & Lift) across 70K+ SKUs dynamically.
Strategic Matrix: Extends customer analysis into product assessment via a real-time ABC-XYZ matrix.
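For readers unfamiliar with the ABC-XYZ matrix, the sketch below shows one common way to compute it: ABC from cumulative revenue share, XYZ from the coefficient of variation of per-period sales. The cut-offs (80% / 95% for ABC, 0.5 / 1.0 for XYZ) are conventional defaults assumed here, not values taken from the platform, and `abc_xyz` is a hypothetical helper name.

```python
from statistics import mean, pstdev


def abc_xyz(sales, abc_cuts=(0.80, 0.95), xyz_cuts=(0.5, 1.0)):
    """Map {sku: [revenue per period]} to labels such as 'AX' or 'CZ'."""
    totals = {sku: sum(series) for sku, series in sales.items()}
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total revenue must be positive")

    labels = {}
    cumulative = 0.0
    # ABC: rank SKUs by revenue, then cut on cumulative revenue share.
    for sku in sorted(totals, key=totals.get, reverse=True):
        cumulative += totals[sku] / grand_total
        if cumulative <= abc_cuts[0]:
            abc = "A"
        elif cumulative <= abc_cuts[1]:
            abc = "B"
        else:
            abc = "C"

        # XYZ: coefficient of variation as a measure of demand stability.
        series = sales[sku]
        avg = mean(series)
        cv = pstdev(series) / avg if avg else float("inf")
        if cv <= xyz_cuts[0]:
            xyz = "X"
        elif cv <= xyz_cuts[1]:
            xyz = "Y"
        else:
            xyz = "Z"

        labels[sku] = abc + xyz
    return labels


sales = {
    "coffee": [100, 100, 100, 100],  # high revenue, perfectly steady
    "tea": [10, 60, 20, 50],         # mid revenue, moderately variable
    "umbrella": [0, 0, 60, 0],       # low revenue, highly erratic
}
labels = abc_xyz(sales)
print(labels)
```

Here coffee lands in AX, tea in BY and umbrella in CZ, so the two axes together separate steady revenue drivers from erratic long-tail items.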
Delivery & Reflections
Automated Model Selection: The system scores K-Means and MiniBatch K-Means configurations by Silhouette Score and serves the winning configuration to the dashboard without user intervention.
In-Memory State Management: Engineered robust Streamlit caching and session keys to persist complex filtering logic across multiple application pages.
Business-Driven Output: Moved beyond 'clustering novelty' by building a dedicated A/B testing simulator that projects revenue impact.
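The automated model selection described above could look roughly like the following sketch, which fits scikit-learn's `KMeans` and `MiniBatchKMeans` over a range of cluster counts and keeps whichever scores the highest silhouette. The search range, the scaling step, the `select_model` name and the synthetic RFM data are assumptions for illustration; the production search space is not documented here.

```python
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def select_model(rfm, k_values=range(2, 7), seed=0):
    """Return (best_score, best_model) across both estimators and all k."""
    # RFM columns sit on very different scales, so standardize first.
    X = StandardScaler().fit_transform(rfm)
    best = None
    for k in k_values:
        for estimator in (KMeans, MiniBatchKMeans):
            model = estimator(n_clusters=k, n_init=10, random_state=seed).fit(X)
            score = silhouette_score(X, model.labels_)
            if best is None or score > best[0]:
                best = (score, model)
    return best


# Synthetic RFM rows: three well-separated customer groups (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[10, 2, 100], [100, 10, 1000], [200, 20, 2000]], dtype=float)
rfm = np.vstack([c + rng.normal(scale=[3, 0.5, 30], size=(50, 3)) for c in centers])

best_score, best_model = select_model(rfm)
print(type(best_model).__name__, best_model.n_clusters, round(best_score, 3))
```

Because the user never sees `k` or the estimator choice, the dashboard only needs the winning model's labels, which fits the non-technical-user constraint.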