If you want credible evidence that executive coaching is working, you need a fair comparison and transparent maths. This guide shows how to design a simple pre/post test with a matched cohort, convert improvements into value, and report results in a way leaders accept.
It stays light on statistics but remains grounded in good practice. For context, research syntheses find that executive coaching improves goal focus, self‑efficacy and resilience—the mechanisms that drive execution and, in turn, business outcomes. See Frontiers in Psychology (2023 meta‑analysis) and Theeboom et al. (2013).
You’re asking a practical question: “Did coached leaders improve more than similar, non‑coached leaders over the same period?” That is the essence of a pre/post comparison with a matched cohort.
Matching reduces bias by making the coached and comparison groups look alike on key features (e.g., tenure, team size, market). If you can collect two time points (before and after) for both groups, you can also apply a simple difference‑in‑differences view to estimate how much extra change is associated with coaching.
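The arithmetic behind that view is simple. A minimal sketch, using illustrative win rates (the same figures as the worked example in Step G):

```python
# Illustrative difference-in-differences arithmetic (figures match the Step G example).
coached_pre, coached_post = 0.22, 0.26          # coached cohort's median win rate
comparison_pre, comparison_post = 0.22, 0.23    # matched comparison cohort, same window

coaching_effect = (coached_post - coached_pre) - (comparison_post - comparison_pre)
print(f"Estimated coaching effect: {coaching_effect:+.0%}")  # ~ +3 percentage points
```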
Step A — Define the outcomes and the line of sight
Pick 3–5 leading indicators (e.g., coaching cadence/quality, decision cycle time, practice telemetry, forecast hygiene, psychological capital) and 1–2 lagging outcomes (e.g., win rate, forecast error, retention).
Step B — Select cohorts
Choose the coached cohort (e.g., 10–20 leaders) and a comparison cohort of similar size. Match on tenure, team size, region/market, and baseline performance. Propensity score matching is helpful when you have many factors; see UCLA’s overview of propensity score matching.
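If your leader data sits in a table, a minimal propensity‑score sketch might look like the following. The file name and columns (tenure_months, team_size, baseline_win_rate, coached) are hypothetical placeholders for your own HR/CRM fields, and categorical factors such as region would need encoding before fitting:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# One row per leader; 'coached' is 1 for the coached cohort, 0 otherwise.
leaders = pd.read_csv("leaders.csv")
features = ["tenure_months", "team_size", "baseline_win_rate"]

# 1) Estimate each leader's propensity to be in the coached group.
model = LogisticRegression(max_iter=1000).fit(leaders[features], leaders["coached"])
leaders["propensity"] = model.predict_proba(leaders[features])[:, 1]

# 2) Pair each coached leader with the nearest unmatched non-coached leader by score.
coached = leaders[leaders["coached"] == 1]
pool = leaders[leaders["coached"] == 0].copy()
matched_rows = []
for _, row in coached.iterrows():
    nearest = (pool["propensity"] - row["propensity"]).abs().idxmin()
    matched_rows.append(pool.loc[nearest])
    pool = pool.drop(nearest)          # match without replacement

comparison_cohort = pd.DataFrame(matched_rows)
```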
Step C — Baseline
Collect 8–12 weeks of pre‑coaching data for both cohorts on each metric. Freeze definitions and data sources.
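One way to freeze definitions is to keep them in a small, version‑controlled config that both the baseline and post‑window pulls read from. A sketch with hypothetical metric names and sources:

```python
# Hypothetical frozen metric definitions; version-control this so the baseline and
# post-window measurements use identical definitions and data sources.
METRICS = {
    "win_rate": {
        "definition": "closed-won opportunities / all closed opportunities",
        "source": "CRM opportunity report",
        "window": "trailing 8 weeks",
    },
    "decision_cycle_time_days": {
        "definition": "median days from proposal issued to decision recorded",
        "source": "CRM stage history",
        "window": "trailing 8 weeks",
    },
    "one_to_one_cadence_pct": {
        "definition": "scheduled 1:1s that happened / scheduled 1:1s",
        "source": "calendar export",
        "window": "weekly",
    },
}
```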
Step D — Intervention window
Run coaching for 12–16 weeks. Instrument behaviour changes lightly (e.g., 1:1 cadence %, decision cycle time, practice reps, hygiene checks, short self‑efficacy/resilience scales). For evidence that these capacities move with coaching, see Frontiers (2023 RCT meta‑analysis).
Step E — Post‑window measurement
Collect the same metrics again for both cohorts over the final 4–8 weeks of the intervention window.
Step F — Analysis
1) Compute pre → post change for each person; then average by cohort.
2) Estimate the coaching effect as (change_coached − change_comparison). This is the intuition behind difference‑in‑differences; see Columbia’s methods guide and the sketch after this list.
3) Sanity‑check with plots (before/after by cohort) and commentary on any confounders (e.g., territory changes).
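A minimal pandas sketch of steps 1 and 2, assuming a long‑format table with one row per leader per period; the file and column names are hypothetical:

```python
import pandas as pd

# Hypothetical long-format measurements: leader_id, cohort ("coached"/"comparison"),
# period ("pre"/"post") and the metric of interest.
df = pd.read_csv("measurements.csv")

# Step 1: pre -> post change for each person.
wide = df.pivot_table(index=["leader_id", "cohort"], columns="period", values="win_rate")
wide["change"] = wide["post"] - wide["pre"]

# Step 2: average change by cohort, then the difference between cohorts.
mean_change = wide.groupby("cohort")["change"].mean()
coaching_effect = mean_change["coached"] - mean_change["comparison"]
print(f"Estimated coaching effect: {coaching_effect:+.1%}")
```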
Step G — Convert to value (agreed “money bridges”)
Suppose the coached cohort’s median win rate moves from 22% → 26% (+4 pts) while the comparison cohort moves 22% → 23% (+1 pt). The estimated coaching effect is +3 pts. On an £8m annual pipeline at a £40k average deal size (roughly 200 opportunities), +3 pts equates to roughly £240k in incremental bookings (directional illustration).
Add value from improved forecast hygiene (fewer pushed deals) and time saved from faster decisions. Keep all assumptions visible in a single table.
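A directional sketch of the win‑rate bridge above, with every assumption spelled out as a named input (the figures are the illustrative ones from Step G, not real results):

```python
# Directional "money bridge" for the win-rate example above; every input is an
# agreed assumption and should stay visible alongside the result.
annual_pipeline_gbp = 8_000_000      # pipeline covered by the coached cohort
avg_deal_size_gbp = 40_000
coaching_effect_pts = 0.03           # +3 win-rate points vs the matched comparison

opportunities = annual_pipeline_gbp / avg_deal_size_gbp              # ~200 deals
incremental_bookings_gbp = opportunities * coaching_effect_pts * avg_deal_size_gbp
print(f"Incremental bookings: £{incremental_bookings_gbp:,.0f}")     # ~£240,000
```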
Keep data access limited, aggregate sensitive people metrics, and follow your DPA/ISO processes. Be explicit about purpose: helping leaders make better decisions and creating value responsibly.
Q: What is a pre/post matched-cohort test for coaching impact?
A: It compares change over time for coached leaders against a similar, non‑coached group. Matching reduces bias; a difference‑in‑differences view estimates the additional change associated with coaching.
Q: How do we select a fair comparison group?
A: Match on tenure, team size, market and baseline performance. Where many factors exist, propensity score matching helps create balance between cohorts.
Q: What data should we collect and for how long?
A: Capture 8–12 weeks of baseline data, then 12–16 weeks during coaching, and repeat measures in the final 4–8 weeks. Track leading indicators (cadence, decision cycle time, practice, hygiene, psychological capital) and lagging outcomes (win rate, forecast error, retention).
Q: How do we convert improvements into financial value?
A: Agree conversions with Finance: win rate to bookings, forecast error to resource allocation, retention to avoided replacement cost, and time saved to capacity value.
Q: What pitfalls should we avoid?
A: Changing metric definitions mid‑stream, unbalanced cohorts, one‑off shocks during the window, and over‑claiming causality. Freeze definitions, document differences, note shocks, and present estimates with clear caveats.