SaaS Trial Conversion Analysis – Why Only 1 in 5 Companies Convert

Project Overview

A B2B SaaS company offering workforce scheduling software had a 21.3% trial-to-paid conversion rate across a 30-day free trial window. On the surface this sits within the typical opt-in B2B benchmark range of 15 to 25%. But beneath that number, 760 out of 966 trialling organisations were walking away without paying, and nobody could explain why.

The raw data existed: 102,895 product events across 966 organisations, covering 28 distinct activity types from creating shifts to punching in and out. This project analyses that data in full, applying funnel analysis, statistical hypothesis testing, survival analysis, machine learning, and clustering to answer one question: what does a company do during its trial that predicts whether it will pay?

Central finding: Converters and non-converters are statistically indistinguishable in their product behaviour. Three separate machine learning models scored an AUC of approximately 0.48, below random chance. The conversion decision is being made outside the product entirely — in pricing conversations, sales touchpoints, and budget cycles that the event log cannot see.

The analysis also uncovered a structural product health problem: 51% of trialling organisations never got a single employee onto the platform after the admin set it up. The product was configured for a workforce that never arrived. This does not directly drive conversion, but it is a structural risk to post-conversion retention.

Full code, charts, SQL models, and README available on GitHub.
Python analysis · 12 charts · 4 SQL models · dbt-compatible architecture

View Repository

The Business Problem

The product team had a standard product-led growth hypothesis: companies that engage more deeply with the product, use more features, stay active for more days, and get more employees onto the platform should be more likely to convert. This is the assumption that underlies most SaaS trial optimisation strategies.

The data held answers to questions the team could not answer from gut feel alone:

Was the 21.3% conversion rate driven by identifiable behavioural signals, or was it essentially random from a product perspective?
Did the two distinct user types in this product (admins and workers) show different conversion patterns?
Was there a minimum engagement threshold that reliably predicted conversion?
Did the timing of conversion (early vs late in the trial) reveal anything about what drove the decision?
Could a machine learning model trained on product behaviour predict, at Day 3 or Day 7, which organisations would convert?

Answering these questions rigorously — and being honest when the answers are null — is the work this project sets out to do.

Data Overview and Cleaning

The dataset covered a 30-day free trial window for a B2B workforce scheduling SaaS platform. Each row represented one product event by one organisation.

Field	Type	Description
organization_id	string	Unique identifier per trialling organisation
activity_name	string	Name of the product activity performed
timestamp	datetime	When the activity occurred
converted	boolean	Whether the organisation converted to paid
converted_at	datetime	Timestamp of conversion (null if not converted)
trial_start	datetime	When the trial started
trial_end	datetime	Trial expiry (trial_start + 30 days)

Cleaning Steps

The raw dataset of 170,526 rows required significant cleaning before analysis. The following steps were applied and documented in the staging SQL model:

Deduplication: Exact matching across all 7 columns removed 67,631 duplicate rows (40% of raw data), leaving 102,895 clean events.
Datetime parsing: All datetime columns were parsed with errors='coerce' so malformed values became null rather than causing failures downstream.
Null handling: The converted_at field is legitimately null for non-converting organisations and was treated as such, not as a data quality issue.
Window validation: Events were validated against the trial window. Zero events fell before trial_start or after trial_end.
Clipping: Derived time fields (hours to first activity) were clipped at zero to prevent negative values from sub-second timestamp precision.

Metric	Value
Raw rows	170,526
After deduplication	102,895
Duplicate rows removed	67,631 (40%)
Unique organisations	966
Unique activity types	28
Overall conversion rate	21.3%
Converted organisations	206
Non-converted	760

Methodology

The analysis was structured in four progressive layers, each building on the last:

1

Data Cleaning and Feature Engineering

After cleaning (described above), each of the 966 organisations was characterised on an org-level feature matrix: total event count, admin event count, worker event count, unique activity types used, active days (admin-side and worker-side separately), and binary flags for six specific worker activities. A worker engagement depth score (0 to 5) was constructed by summing the binary worker activity flags.

2

Admin vs Worker Activity Segmentation

All 28 activity types were manually classified as either admin actions or worker actions based on product logic. Admin actions are configuration tasks (creating shifts, approving timesheets). Worker actions are operational daily tasks (clocking in, viewing the mobile schedule, setting availability). Each organisation was classified into one of three archetypes based on which sides of the product were used.

3

Statistical Hypothesis Testing

Mann-Whitney U tests compared all continuous engagement metrics between converters and non-converters (non-parametric, appropriate for right-skewed distributions). Chi-square tests assessed the relationship between each binary worker activity flag and conversion. Point-biserial correlations were calculated for all continuous-to-binary pairs.

4

Predictive Modelling and Survival Analysis

Three Random Forest models were trained and evaluated using 5-fold cross-validated ROC-AUC: admin features only, worker features only, and combined. Kaplan-Meier survival curves modelled time-to-conversion for worker-active versus admin-only organisations. K-Means clustering (k=4, elbow method) segmented organisations by behavioural profile.

The Admin vs Worker Framework

The most important analytical decision in this project was to separate admin and worker activity rather than treating all 102,895 events as equivalent. The rationale is grounded in how this product category actually works.

This is a workforce scheduling platform. It serves two fundamentally different users within the same account. Admins are managers who set up the schedule, create shifts, approve timesheets, and make the purchasing decision. Workers are the employees who view their schedule on mobile, clock in and out, set their availability, and request shift swaps. They do not make the purchasing decision, but they are the ones whose daily working lives depend on the product.

Only worker adoption creates genuine switching costs. If the admin is the only user, the product can be cancelled with a single decision. If workers are clocking in through it every day, cancellation means disrupting live operations for the entire workforce. That is a fundamentally different retention dynamic.

The hypothesis was wrong — but in a revealing way. Worker adoption proved not to predict trial conversion (p = 0.85, chi-square). But the framework is still correct. The implication has shifted from conversion to retention. Worker adoption is likely the key variable for predicting which paying customers stay versus which ones cancel. That question cannot be answered without post-conversion data, but the groundwork is laid here.

Key Findings

Eight findings emerged from the full analysis. The most important are summarised below:

p = 0.85

Converters and Non-Converters Are Statistically Identical

Mann-Whitney U tests on total events, active days, unique activities, and time-to-first-activity all return non-significant p-values. The two groups are behaviourally indistinguishable.

51%

Half of Trials Never Got a Worker onto the Platform

489 out of 966 organisations had the admin set up the product but no employee ever used it. Conversion rates for this group (21.7%) are virtually identical to those where workers did engage (21.0%).

52%

Conversion Is a Deadline Decision

52% of all conversions happen in the final 9 days of the 30-day trial, with the largest single spike on Day 30 itself. This is deadline urgency, not a product value moment. Both converters and non-converters spike in activity in the final week.

46%

The Handoff Failure Is the Biggest Funnel Drop

Of organisations that created at least one shift, 46% never opened the mobile schedule view. The admin built the schedule and nobody on the team ever opened it. This is the largest single friction point in the product.

Analysis Charts

All 12 charts below were generated programmatically from the Python analysis script. Each is embedded directly from the project's charts folder.

Overall SaaS trial conversion rate overview showing 21.3% conversion across 966 organisations with event volume and active days distributions — Fig 1. Overall conversion rate (21.3%), event volume distribution, and active days distribution. Step histograms with median lines confirm that converters and non-converters look identical at first glance.

Boxplot comparison of key SaaS trial metrics between converted and non-converted organisations — total events, admin events, worker events, active days — Fig 2. Boxplots of total events, admin events, worker events, and active days. Near-perfect overlap across all four metrics. Mann-Whitney U tests confirm no statistically significant difference.

Core finding: worker adoption versus conversion rate in SaaS trial analysis — admin only 21.7% vs workers active 21.0% — Fig 3. The central finding. Worker adoption (left) shows virtually identical conversion rates between groups. The middle panel shows 51% of organisations had zero worker activity. The right panel shows 85% of all worker events are a single passive action.

Conversion uplift analysis for each worker activity type in B2B SaaS trial — punch clock, availability, shift swap, absence request, mobile schedule view — Fig 4. Conversion uplift per worker activity type. Punch clock usage shows the strongest association (+1.8pp) but does not reach statistical significance. No activity achieves p < 0.05.

Three SaaS trial company archetypes: Committed Operator both admin and workers active, Incomplete Setup admin only, Ghost Trial minimal engagement — Fig 5. The three company archetypes by conversion rate, distribution, and event volume. The Incomplete Setup group (51%) converts at 22% but carries significant post-conversion churn risk.

Worker adoption funnel for B2B SaaS trial — from trial started to admin shift creation to mobile schedule viewed to worker activity to punch clock to conversion — Fig 6. The worker adoption funnel. The 46% drop between shift creation and mobile schedule view is the largest single friction point in the product journey.

Worker engagement depth score 0 to 5 versus SaaS trial conversion rate — shows weak positive trend at depth 3 with 25% conversion — Fig 7. Worker depth score (0 to 5 distinct worker activity types) versus conversion rate. A weak upward trend from depth 2 onward, but sample sizes above depth 2 are too small for firm conclusions.

Speed of worker adoption in SaaS trial — median time from admin setup to first worker action is less than one hour, cumulative adoption curves for converters vs non-converters — Fig 8. When workers do join, they join quickly. Median time from admin setup to first worker action is under one hour. 82% of handoffs happen on the same day. The problem is not timing — it is that 51% of companies never attempt the handoff.

Admin versus worker event volume scatter plot for SaaS trial conversion — showing complete overlap between converted and non-converted organisations — Fig 9. Admin vs worker event volumes for converters and non-converters. Complete overlap in all three views: average volumes, worker event share distribution, and scatter plot. No behavioural region is exclusive to converters.

Random Forest model ROC-AUC comparison for SaaS trial conversion prediction — admin features only 0.515, worker features only 0.514, combined 0.520 — Fig 10. Predictive model comparison. All three Random Forest models score near random chance (0.50). Adding worker data to admin data improves AUC by 0.005 — not meaningful.

Kaplan-Meier survival analysis curves for SaaS trial conversion — worker active organisations versus admin only organisations follow nearly identical trajectories — Fig 11. Kaplan-Meier survival curves for worker-active versus admin-only organisations. Both groups follow nearly identical conversion trajectories and both spike sharply in the final days of the trial.

Admin to worker handoff gap analysis — distribution of time between first admin action and first worker action in SaaS trial, conversion rate by handoff speed — Fig 12. The handoff gap. Both converters and non-converters show nearly identical distributions. The speed of the handoff, once it happens, does not predict conversion.

Statistical Modelling Results

Statistical tests and machine learning models were applied across all feature combinations. The results are consistent throughout:

Hypothesis Tests

Metric	Test	p-value	Significant?
Total events	Mann-Whitney U	0.851	No
Active days	Mann-Whitney U	0.820	No
Unique activities	Mann-Whitney U	0.650	No
Time to first activity	Mann-Whitney U	0.153	No
Worker adoption (binary)	Chi-square	0.850	No

Predictive Model Performance

Model	Features	CV ROC-AUC
Random Forest	Admin features only	0.515
Random Forest	Worker features only	0.514
Random Forest	Admin + Worker combined	0.520

Interpretation: The absence of a predictive signal is itself the signal. It points directly to a data gap — the variables that actually drive conversion (company size, acquisition channel, pricing, sales touchpoints) are not in the product event log. Collecting them is more valuable than further modelling of existing behavioural data.

SQL Data Models

The SQL layer translates the analytical findings into production-ready operational models designed to live in a data warehouse. The architecture follows dbt conventions with a staging layer feeding into three mart tables.

raw.da_task (source)
    └── stg_trial_events         (staging: deduplicated, validated, enriched)
            ├── mart_trial_goals      (per-org goal completion tracking)
            │      └── mart_trial_activation (activation status and tiers)
            └── mart_worker_adoption  (admin vs worker segmentation)

stg_trial_events

The single source of truth for cleaned event data. Separates data quality logic from analytical logic. Deduplicates, parses datetimes, removes out-of-window events, and adds trial_day_number, hours_since_trial_start, and days_to_conversion for use by all downstream models. Grain: one row per organisation per event, deduplicated.

mart_trial_goals

Tracks five data-driven trial goals per organisation. Goals were defined on product-value logic rather than statistical lift, since no individual behaviour achieved significance as a conversion predictor. Goal 1 (created a shift) captures the core admin action. Goal 2 (viewed mobile schedule) captures the handoff. Goal 3 (set availability) signals genuine workforce adoption. Goal 4 (active on 3+ days) captures sustained engagement. Goal 5 (used 3+ activity types) captures feature breadth. Grain: one row per organisation.

mart_trial_activation

Defines Trial Activation as completion of all five goals and assigns each organisation to an activation tier (Fully Activated, Partially Activated, Early Engagement, No Engagement). Built for direct CS and product team use without requiring analysts to re-run the analysis. Includes intervention flags: is_near_activated_not_converted, is_zero_engagement, and is_activated_churned. Grain: one row per organisation.

mart_worker_adoption

Operationalises the admin vs worker segmentation as a live operational metric. Classifies every event as admin-side or worker-side and produces the three-archetype classification. Includes flag_admin_only_not_converted (the primary target for the 48-hour onboarding nudge), flag_no_punchclock_not_converted, and flag_deep_worker_not_converted. Grain: one row per organisation.

Recommendations

Three recommendations emerged directly from the data, ranked by expected impact:

1. Capture the Missing Data (Priority: Immediate)

The current event log captures what users do inside the product. It captures nothing about why they decided to trial it, how large their company is, what price they were shown, or whether a salesperson spoke to them. These are almost certainly the variables that explain conversion.

Add company size and industry to the signup flow
Instrument UTM parameter capture at the signup URL to record acquisition source
Integrate CRM data to log whether a sales or CS touchpoint occurred during the trial
Record the pricing tier shown to each trialling organisation

2. Fix the Worker Onboarding Handoff (Priority: This Month)

51% of trialling organisations never got a worker onto the platform. When workers do join, they join within hours of admin setup. The problem is not slow adoption — it is that the connection is never made. This can be closed with a single automated email triggered when an admin creates shifts but no worker activity is recorded within 48 hours.

3. Deploy CS Outreach at Days 14 to 21 (Priority: This Quarter)

The survival analysis confirms conversion spikes sharply at trial expiry. The optimal intervention window is Days 14 to 21, before deadline urgency sets in. Flag all active trials at Day 14 that have not yet converted and prioritise the Partially Activated segment (3 to 4 goals complete).

Why this matters: None of these require a product rebuild. All three are executable within one quarter with existing infrastructure and zero additional headcount.

Tools and Technologies

This project was built entirely in Python with a dbt-compatible SQL layer, chosen for reproducibility and portability across data warehouse environments.

Python Core analysis

pandas Data wrangling

scikit-learn ML models

lifelines Survival analysis

matplotlib Visualisation

SQL / dbt Data models

Python pandas scikit-learn lifelines K-Means Clustering Kaplan-Meier Cox PH Model Random Forest Logistic Regression Mann-Whitney U Chi-Square SQL dbt SaaS Analytics Product Analytics Cohort Analysis

Related Projects

If this case study is relevant to your analytical challenge, these projects may also be useful:

Why Do Only 1 in 5 Trial Companies Convert?

Project Overview

The Business Problem