adeyemi@adediranadeyemi.com +234 816 273 5399
Product Analytics · Python · SQL

Why Do Only 1 in 5 Trial Companies Convert?

A full-scope Python data science case study analysing trial-to-paid conversion across 966 B2B SaaS organisations. Survival analysis, machine learning, cohort analysis, and production-ready SQL data models expose a counterintuitive truth: product behaviour alone cannot predict who pays.

Tools
Python · scikit-learn · lifelines · SQL
Industry
B2B SaaS · Workforce Scheduling
Type
Product Analytics · Conversion Research
966 B2B SaaS trial organisations analysed
21.3% Overall trial-to-paid conversion rate
51% of trials never got a single worker onto the platform
0.52 AUC — best ML model score (no better than random chance)

Project Overview

A B2B SaaS company offering workforce scheduling software had a 21.3% trial-to-paid conversion rate across a 30-day free trial window. On the surface this sits within the typical opt-in B2B benchmark range of 15 to 25%. But beneath that number, 760 out of 966 trialling organisations were walking away without paying, and nobody could explain why.

The raw data existed: 102,895 product events across 966 organisations, covering 28 distinct activity types from creating shifts to punching in and out. This project analyses that data in full, applying funnel analysis, statistical hypothesis testing, survival analysis, machine learning, and clustering to answer one question: what does a company do during its trial that predicts whether it will pay?

Central finding: Converters and non-converters are statistically indistinguishable in their product behaviour. Three separate machine learning models scored cross-validated AUCs of 0.514 to 0.520, no better than random chance. The conversion decision is being made outside the product entirely — in pricing conversations, sales touchpoints, and budget cycles that the event log cannot see.

The analysis also uncovered a structural product health problem: 51% of trialling organisations never got a single employee onto the platform after the admin set it up. The product was configured for a workforce that never arrived. This does not directly drive conversion, but it is a structural risk to post-conversion retention.

Full code, charts, SQL models, and README available on GitHub.
Python analysis · 12 charts · 4 SQL models · dbt-compatible architecture

View Repository

The Business Problem

The product team had a standard product-led growth hypothesis: companies that engage more deeply with the product, use more features, stay active for more days, and get more employees onto the platform should be more likely to convert. This is the assumption that underlies most SaaS trial optimisation strategies.

The data held answers to questions the team could not answer from gut feel alone:

  • Was the 21.3% conversion rate driven by identifiable behavioural signals, or was it essentially random from a product perspective?
  • Did the two distinct user types in this product (admins and workers) show different conversion patterns?
  • Was there a minimum engagement threshold that reliably predicted conversion?
  • Did the timing of conversion (early vs late in the trial) reveal anything about what drove the decision?
  • Could a machine learning model trained on product behaviour predict, at Day 3 or Day 7, which organisations would convert?

Answering these questions rigorously — and being honest when the answers are null — is the work this project sets out to do.

Data Overview and Cleaning

The dataset covered a 30-day free trial window for a B2B workforce scheduling SaaS platform. Each row represented one product event by one organisation.

Field            Type      Description
organization_id  string    Unique identifier per trialling organisation
activity_name    string    Name of the product activity performed
timestamp        datetime  When the activity occurred
converted        boolean   Whether the organisation converted to paid
converted_at     datetime  Timestamp of conversion (null if not converted)
trial_start      datetime  When the trial started
trial_end        datetime  Trial expiry (trial_start + 30 days)

Cleaning Steps

The raw dataset of 170,526 rows required significant cleaning before analysis. The following steps were applied and documented in the staging SQL model:

  • Deduplication: Exact matching across all 7 columns removed 67,631 duplicate rows (40% of raw data), leaving 102,895 clean events.
  • Datetime parsing: All datetime columns were parsed with errors='coerce' so malformed values became null rather than causing failures downstream.
  • Null handling: The converted_at field is legitimately null for non-converting organisations and was treated as such, not as a data quality issue.
  • Window validation: Events were validated against the trial window. Zero events fell before trial_start or after trial_end.
  • Clipping: Derived time fields (hours to first activity) were clipped at zero to prevent negative values from sub-second timestamp precision.
Metric                    Value
Raw rows                  170,526
After deduplication       102,895
Duplicate rows removed    67,631 (40%)
Unique organisations      966
Unique activity types     28
Overall conversion rate   21.3%
Converted organisations   206
Non-converted             760
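As a rough illustration, the cleaning steps can be re-implemented in pandas. This is a sketch against the schema above, not the production staging SQL; the sample rows are invented.

```python
import pandas as pd

def clean_events(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the staging-layer cleaning steps to a raw event extract.

    Column names follow the dataset schema; this is an illustrative
    re-implementation, not the production staging model.
    """
    # 1. Deduplication: exact match across all columns
    df = raw.drop_duplicates().copy()
    # 2. Datetime parsing: malformed values become NaT instead of raising
    for col in ["timestamp", "converted_at", "trial_start", "trial_end"]:
        df[col] = pd.to_datetime(df[col], errors="coerce")
    # 3. Window validation: keep only events inside the trial window
    #    (rows with an unparseable timestamp drop out here as well)
    in_window = (df["timestamp"] >= df["trial_start"]) & (df["timestamp"] <= df["trial_end"])
    df = df[in_window].copy()
    # 4. Clipping: derived time fields floored at zero to absorb
    #    sub-second timestamp precision noise
    hours = (df["timestamp"] - df["trial_start"]).dt.total_seconds() / 3600
    df["hours_since_trial_start"] = hours.clip(lower=0)
    return df.reset_index(drop=True)

raw = pd.DataFrame({
    "organization_id": ["org_1", "org_1", "org_1"],
    "activity_name": ["shift_created", "shift_created", "punch_in"],
    "timestamp": ["2024-01-02 09:00", "2024-01-02 09:00", "not-a-date"],
    "converted": [True, True, True],
    "converted_at": ["2024-01-20", "2024-01-20", "2024-01-20"],
    "trial_start": ["2024-01-01"] * 3,
    "trial_end": ["2024-01-31"] * 3,
})
clean = clean_events(raw)  # duplicate row and malformed-timestamp row removed
```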

Methodology

The analysis was structured in four progressive layers, each building on the last:

1. Data Cleaning and Feature Engineering

After cleaning (described above), each of the 966 organisations was summarised in an org-level feature matrix: total event count, admin event count, worker event count, unique activity types used, active days (admin-side and worker-side separately), and binary flags for six specific worker activities. A worker engagement depth score (0 to 5) counted the distinct worker activity types each organisation used.
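A minimal pandas sketch of the feature build follows. The activity names are hypothetical stand-ins for the real 28-type taxonomy, and the depth score here simply sums the worker-activity flags.

```python
import pandas as pd

# Hypothetical activity names standing in for the real 28-type taxonomy.
ADMIN_ACTIVITIES = {"shift_created", "timesheet_approved"}
WORKER_ACTIVITY_TYPES = ["punch_clock", "mobile_schedule_viewed",
                         "availability_set", "shift_swap_requested",
                         "absence_requested"]

def build_org_features(events: pd.DataFrame) -> pd.DataFrame:
    """One row per organisation: engagement volumes, breadth, and
    binary worker-activity flags summed into a 0-5 depth score."""
    ev = events.copy()
    ev["timestamp"] = pd.to_datetime(ev["timestamp"])
    ev["is_admin"] = ev["activity_name"].isin(ADMIN_ACTIVITIES)
    g = ev.groupby("organization_id")
    feats = g.agg(
        total_events=("activity_name", "size"),
        admin_events=("is_admin", "sum"),
        unique_activities=("activity_name", "nunique"),
        active_days=("timestamp", lambda s: s.dt.normalize().nunique()),
    )
    feats["worker_events"] = feats["total_events"] - feats["admin_events"]
    # one binary flag per tracked worker activity type
    for act in WORKER_ACTIVITY_TYPES:
        feats[f"did_{act}"] = g["activity_name"].apply(
            lambda s, a=act: int((s == a).any()))
    flag_cols = [f"did_{a}" for a in WORKER_ACTIVITY_TYPES]
    feats["worker_depth_score"] = feats[flag_cols].sum(axis=1)
    return feats

events = pd.DataFrame({
    "organization_id": ["org_1", "org_1", "org_1", "org_2"],
    "activity_name": ["shift_created", "punch_clock",
                      "mobile_schedule_viewed", "shift_created"],
    "timestamp": ["2024-01-01 09:00", "2024-01-01 10:00",
                  "2024-01-02 08:00", "2024-01-01 09:00"],
})
feats = build_org_features(events)
```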

2. Admin vs Worker Activity Segmentation

All 28 activity types were manually classified as either admin actions or worker actions based on product logic. Admin actions are configuration tasks (creating shifts, approving timesheets). Worker actions are operational daily tasks (clocking in, viewing the mobile schedule, setting availability). Each organisation was classified into one of three archetypes based on which sides of the product were used.
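The archetype assignment can be sketched as a simple rule on the two event counts. The archetype names come from the analysis; the exact thresholds used in the original segmentation are assumed here.

```python
def classify_archetype(admin_events: int, worker_events: int) -> str:
    # Archetype names follow the analysis; the decision rule itself is an
    # illustrative assumption, not the production classification logic.
    if admin_events > 0 and worker_events > 0:
        return "Committed Operator"   # both sides of the product in use
    if admin_events > 0:
        return "Incomplete Setup"     # admin configured, workers never arrived
    return "Ghost Trial"              # minimal or no engagement

archetype = classify_archetype(admin_events=12, worker_events=0)  # "Incomplete Setup"
```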

3. Statistical Hypothesis Testing

Mann-Whitney U tests compared all continuous engagement metrics between converters and non-converters (non-parametric, appropriate for right-skewed distributions). Chi-square tests assessed the relationship between each binary worker activity flag and conversion. Point-biserial correlations were calculated for all continuous-to-binary pairs.
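The testing battery maps directly onto scipy.stats. The sketch below draws both groups from the same distribution to mimic the null result, and the 2x2 worker-adoption table uses illustrative counts consistent with the reported group sizes, not the real crosstab.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu, pointbiserialr

rng = np.random.default_rng(42)

# Right-skewed engagement metric drawn from the SAME distribution for both
# groups, mimicking the null result (group sizes match the real cohort).
conv_events = rng.lognormal(mean=3.0, sigma=1.0, size=206)
nonconv_events = rng.lognormal(mean=3.0, sigma=1.0, size=760)

# Mann-Whitney U: non-parametric, suited to right-skewed distributions
u_stat, u_p = mannwhitneyu(conv_events, nonconv_events, alternative="two-sided")

# Chi-square on a 2x2 worker-adoption x conversion table
# (counts are illustrative but sum to the reported 477 / 489 split)
table = np.array([[100, 106],   # converted:     worker-active / admin-only
                  [377, 383]])  # not converted: worker-active / admin-only
chi2, chi_p, dof, expected = chi2_contingency(table)

# Point-biserial correlation: binary conversion flag vs continuous metric
converted = np.r_[np.ones(206), np.zeros(760)]
all_events = np.r_[conv_events, nonconv_events]
r_pb, r_p = pointbiserialr(converted, all_events)
```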

4. Predictive Modelling and Survival Analysis

Three Random Forest models were trained and evaluated using 5-fold cross-validated ROC-AUC: admin features only, worker features only, and combined. Kaplan-Meier survival curves modelled time-to-conversion for worker-active versus admin-only organisations. K-Means clustering (k=4, elbow method) segmented organisations by behavioural profile.
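The cross-validated evaluation protocol can be sketched with scikit-learn. The features below are synthetic and deliberately carry no signal, so the AUC lands near 0.5, mirroring the real result; the survival and clustering steps would use lifelines' KaplanMeierFitter and sklearn.cluster.KMeans analogously.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_orgs = 966

# Synthetic feature matrix with NO relationship to the label, mirroring the
# finding that behavioural features carry no conversion signal.
X = rng.lognormal(mean=3.0, sigma=1.0, size=(n_orgs, 8))
y = (rng.random(n_orgs) < 0.213).astype(int)   # ~21.3% base rate

model = RandomForestClassifier(n_estimators=200, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
mean_auc = auc_scores.mean()   # hovers around 0.5: nothing to learn
```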

The Admin vs Worker Framework

The most important analytical decision in this project was to separate admin and worker activity rather than treating all 102,895 events as equivalent. The rationale is grounded in how this product category actually works.

This is a workforce scheduling platform. It serves two fundamentally different users within the same account. Admins are managers who set up the schedule, create shifts, approve timesheets, and make the purchasing decision. Workers are the employees who view their schedule on mobile, clock in and out, set their availability, and request shift swaps. They do not make the purchasing decision, but they are the ones whose daily working lives depend on the product.

Only worker adoption creates genuine switching costs. If the admin is the only user, the product can be cancelled with a single decision. If workers are clocking in through it every day, cancellation means disrupting live operations for the entire workforce. That is a fundamentally different retention dynamic.

The hypothesis was wrong — but in a revealing way. Worker adoption proved not to predict trial conversion (p = 0.85, chi-square). But the framework is still correct. The implication has shifted from conversion to retention. Worker adoption is likely the key variable for predicting which paying customers stay versus which ones cancel. That question cannot be answered without post-conversion data, but the groundwork is laid here.

Key Findings

Eight findings emerged from the full analysis. The most important are summarised below:

p = 0.85: Converters and Non-Converters Are Statistically Identical

Mann-Whitney U tests on total events, active days, unique activities, and time-to-first-activity all return non-significant p-values. The two groups are behaviourally indistinguishable.

51%: Half of Trials Never Got a Worker onto the Platform

489 out of 966 organisations had the admin set up the product but no employee ever used it. Conversion rates for this group (21.7%) are virtually identical to those where workers did engage (21.0%).

52%: Conversion Is a Deadline Decision

52% of all conversions happen in the final 9 days of the 30-day trial, with the largest single spike on Day 30 itself. This is deadline urgency, not a product value moment. Both converters and non-converters spike in activity in the final week.

46%: The Handoff Failure Is the Biggest Funnel Drop

Of organisations that created at least one shift, 46% never opened the mobile schedule view. The admin built the schedule and nobody on the team ever opened it. This is the largest single friction point in the product.

Analysis Charts

All 12 charts below were generated programmatically from the Python analysis script. Each is embedded directly from the project's charts folder.

Fig 1. Overall conversion rate (21.3%), event volume distribution, and active days distribution. Step histograms with median lines confirm that converters and non-converters look identical at first glance.
Fig 2. Boxplots of total events, admin events, worker events, and active days. Near-perfect overlap across all four metrics. Mann-Whitney U tests confirm no statistically significant difference.
Fig 3. The central finding. Worker adoption (left) shows virtually identical conversion rates between groups. The middle panel shows 51% of organisations had zero worker activity. The right panel shows 85% of all worker events are a single passive action.
Fig 4. Conversion uplift per worker activity type. Punch clock usage shows the strongest association (+1.8pp) but does not reach statistical significance. No activity achieves p < 0.05.
Fig 5. The three company archetypes by conversion rate, distribution, and event volume. The Incomplete Setup group (51%) converts at 22% but carries significant post-conversion churn risk.
Fig 6. The worker adoption funnel. The 46% drop between shift creation and mobile schedule view is the largest single friction point in the product journey.
Fig 7. Worker depth score (0 to 5 distinct worker activity types) versus conversion rate. A weak upward trend from depth 2 onward, but sample sizes above depth 2 are too small for firm conclusions.
Fig 8. When workers do join, they join quickly. Median time from admin setup to first worker action is under one hour. 82% of handoffs happen on the same day. The problem is not timing — it is that 51% of companies never attempt the handoff.
Fig 9. Admin vs worker event volumes for converters and non-converters. Complete overlap in all three views: average volumes, worker event share distribution, and scatter plot. No behavioural region is exclusive to converters.
Fig 10. Predictive model comparison. All three Random Forest models score near random chance (0.50). Adding worker data to admin data improves AUC by 0.005 — not meaningful.
Fig 11. Kaplan-Meier survival curves for worker-active versus admin-only organisations. Both groups follow nearly identical conversion trajectories and both spike sharply in the final days of the trial.
Fig 12. The handoff gap. Both converters and non-converters show nearly identical distributions. The speed of the handoff, once it happens, does not predict conversion.

Statistical Modelling Results

Statistical tests and machine learning models were applied across all feature combinations. The results are consistent throughout:

Hypothesis Tests

Metric                     Test            p-value   Significant?
Total events               Mann-Whitney U  0.851     No
Active days                Mann-Whitney U  0.820     No
Unique activities          Mann-Whitney U  0.650     No
Time to first activity     Mann-Whitney U  0.153     No
Worker adoption (binary)   Chi-square      0.850     No

Predictive Model Performance

Model          Features                  CV ROC-AUC
Random Forest  Admin features only       0.515
Random Forest  Worker features only      0.514
Random Forest  Admin + Worker combined   0.520

Interpretation: The absence of a predictive signal is itself the signal. It points directly to a data gap — the variables that actually drive conversion (company size, acquisition channel, pricing, sales touchpoints) are not in the product event log. Collecting them is more valuable than further modelling of existing behavioural data.

SQL Data Models

The SQL layer translates the analytical findings into production-ready operational models designed to live in a data warehouse. The architecture follows dbt conventions with a staging layer feeding into three mart tables.

raw.da_task (source)
    └── stg_trial_events         (staging: deduplicated, validated, enriched)
            ├── mart_trial_goals      (per-org goal completion tracking)
            │      └── mart_trial_activation (activation status and tiers)
            └── mart_worker_adoption  (admin vs worker segmentation)

stg_trial_events

The single source of truth for cleaned event data. Separates data quality logic from analytical logic. Deduplicates, parses datetimes, removes out-of-window events, and adds trial_day_number, hours_since_trial_start, and days_to_conversion for use by all downstream models. Grain: one row per organisation per event, deduplicated.

mart_trial_goals

Tracks five data-driven trial goals per organisation. Goals were defined on product-value logic rather than statistical lift, since no individual behaviour achieved significance as a conversion predictor. Goal 1 (created a shift) captures the core admin action. Goal 2 (viewed mobile schedule) captures the handoff. Goal 3 (set availability) signals genuine workforce adoption. Goal 4 (active on 3+ days) captures sustained engagement. Goal 5 (used 3+ activity types) captures feature breadth. Grain: one row per organisation.

mart_trial_activation

Defines Trial Activation as completion of all five goals and assigns each organisation to an activation tier (Fully Activated, Partially Activated, Early Engagement, No Engagement). Built for direct CS and product team use without requiring analysts to re-run the analysis. Includes intervention flags: is_near_activated_not_converted, is_zero_engagement, and is_activated_churned. Grain: one row per organisation.
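The goal-to-tier logic can be sketched in pandas. Tier names and the flag name follow the mart descriptions above; the goal-count thresholds per tier are illustrative assumptions, not the production definitions.

```python
import pandas as pd

GOAL_COLS = ["goal_created_shift", "goal_viewed_mobile_schedule",
             "goal_set_availability", "goal_active_3plus_days",
             "goal_used_3plus_activities"]

def activation_tier(goals_met: int) -> str:
    # Tier names from mart_trial_activation; the goal-count cutoffs
    # are assumptions for illustration.
    if goals_met == 5:
        return "Fully Activated"
    if goals_met >= 3:
        return "Partially Activated"
    if goals_met >= 1:
        return "Early Engagement"
    return "No Engagement"

orgs = pd.DataFrame(
    {"goal_created_shift":          [1, 1, 0],
     "goal_viewed_mobile_schedule": [1, 1, 0],
     "goal_set_availability":       [1, 0, 0],
     "goal_active_3plus_days":      [1, 1, 0],
     "goal_used_3plus_activities":  [1, 0, 0],
     "converted":                   [True, False, False]},
    index=["org_a", "org_b", "org_c"])

orgs["goals_met"] = orgs[GOAL_COLS].sum(axis=1)
orgs["activation_tier"] = orgs["goals_met"].map(activation_tier)
# CS intervention flag from mart_trial_activation
orgs["is_near_activated_not_converted"] = (
    orgs["goals_met"].isin([3, 4]) & ~orgs["converted"])
```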

mart_worker_adoption

Operationalises the admin vs worker segmentation as a live operational metric. Classifies every event as admin-side or worker-side and produces the three-archetype classification. Includes flag_admin_only_not_converted (the primary target for the 48-hour onboarding nudge), flag_no_punchclock_not_converted, and flag_deep_worker_not_converted. Grain: one row per organisation.

Recommendations

Three recommendations emerged directly from the data, ranked by expected impact:

1. Capture the Missing Data (Priority: Immediate)

The current event log captures what users do inside the product. It captures nothing about why they decided to trial it, how large their company is, what price they were shown, or whether a salesperson spoke to them. These are almost certainly the variables that explain conversion.

  • Add company size and industry to the signup flow
  • Instrument UTM parameter capture at the signup URL to record acquisition source
  • Integrate CRM data to log whether a sales or CS touchpoint occurred during the trial
  • Record the pricing tier shown to each trialling organisation

2. Fix the Worker Onboarding Handoff (Priority: This Month)

51% of trialling organisations never got a worker onto the platform. When workers do join, they join within hours of admin setup. The problem is not slow adoption — it is that the connection is never made. This can be closed with a single automated email triggered when an admin creates shifts but no worker activity is recorded within 48 hours.
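The trigger condition can be sketched as a warehouse-side check in pandas. The activity names and the helper function are illustrative assumptions, not part of the production models.

```python
import pandas as pd

# Illustrative worker-side activity names
WORKER_ACTIVITIES = {"punch_in", "punch_out", "mobile_schedule_viewed"}

def flag_for_onboarding_nudge(events: pd.DataFrame,
                              now: pd.Timestamp) -> pd.Series:
    """True for organisations where an admin created a shift, no worker
    activity has been logged, and at least 48 hours have passed since
    the first shift was created."""
    ev = events.assign(timestamp=pd.to_datetime(events["timestamp"]))
    first_shift = (ev[ev["activity_name"] == "shift_created"]
                   .groupby("organization_id")["timestamp"].min())
    worker_orgs = set(
        ev.loc[ev["activity_name"].isin(WORKER_ACTIVITIES), "organization_id"])
    waited_48h = (now - first_shift) >= pd.Timedelta(hours=48)
    has_no_worker = ~first_shift.index.isin(list(worker_orgs))
    return pd.Series(has_no_worker, index=first_shift.index) & waited_48h

events = pd.DataFrame({
    "organization_id": ["org_a", "org_b", "org_b"],
    "activity_name": ["shift_created", "shift_created", "punch_in"],
    "timestamp": ["2024-01-01 09:00", "2024-01-01 09:00", "2024-01-01 11:00"],
})
flags = flag_for_onboarding_nudge(events, now=pd.Timestamp("2024-01-04 09:00"))
```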

3. Deploy CS Outreach at Days 14 to 21 (Priority: This Quarter)

The survival analysis confirms conversion spikes sharply at trial expiry. The optimal intervention window is Days 14 to 21, before deadline urgency sets in. Flag all active trials at Day 14 that have not yet converted and prioritise the Partially Activated segment (3 to 4 goals complete).

Why this matters: None of these require a product rebuild. All three are executable within one quarter with existing infrastructure and zero additional headcount.

Tools and Technologies

This project was built entirely in Python with a dbt-compatible SQL layer, chosen for reproducibility and portability across data warehouse environments.

Python Core analysis
pandas Data wrangling
scikit-learn ML models
lifelines Survival analysis
matplotlib Visualisation
SQL / dbt Data models

Work with Adediran Adeyemi

Does your SaaS trial data have answers you have not found yet?

I help SaaS and tech companies run rigorous product analytics — from trial conversion to churn prediction to retention modelling. First call is free.