Target estimand — ATT (Average Treatment effect on the Treated):
ATT = E[ Yi(1) − Yi(0) │ Di = 1 ] ⟸ CATE: τ(x) = E[ Y(1) − Y(0) │ X = x ]
Baseline regression (Eq. 1): yi = β0 + β1Di + ρ yi0 + Xi'γ + εi
β1 identifies the ATT (estimated within strata; SEs clustered at the household level; Fisher exact p-values).
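To illustrate how a Fisher exact (randomization) p-value of this kind can be computed, here is a minimal sketch. It rests on assumptions that are not in the source: the test statistic is a raw difference in means rather than β1 from Eq. 1, treatment labels are re-drawn at the individual level within strata (household-level assignment would require permuting households instead), and the function names `diff_in_means` and `randomization_pvalue` are hypothetical.

```python
import random
from collections import defaultdict


def diff_in_means(y, d):
    """Difference in mean outcomes between treated (d=1) and control (d=0)."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def randomization_pvalue(y, d, strata, draws=2000, seed=0):
    """Two-sided Monte Carlo randomization p-value, permuting d within strata."""
    rng = random.Random(seed)
    observed = abs(diff_in_means(y, d))

    # Group unit indices by stratum so that every re-draw keeps the
    # number of treated units inside each stratum fixed.
    by_stratum = defaultdict(list)
    for i, s in enumerate(strata):
        by_stratum[s].append(i)

    extreme = 0
    for _ in range(draws):
        d_perm = list(d)
        for idx in by_stratum.values():
            labels = [d[i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                d_perm[i] = lab
        # Small tolerance guards against floating-point ties.
        if abs(diff_in_means(y, d_perm)) >= observed - 1e-12:
            extreme += 1

    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (1 + extreme) / (1 + draws)
```

In an application closer to the text, the statistic would be replaced by the estimated β1 (or the IPWRA estimate) recomputed under each re-drawn assignment.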
Problem 1 — Attrition: not all baseline respondents observed in midline/endline → may threaten internal validity if selection correlates with potential outcomes.
Problem 2 — Selective engagement: only a subset of treated respondents engages with the WhatsApp module → covariates can become imbalanced within the realized sample.
Step 1 — Selection model. For each sample S ∈ {midline, endline, both, endline-only}, estimate via probit:
piS = Pr(Si = 1 │ Di, Xi) ⟹ wiS = Pr(S=1) / p̂iS
Xi: ≈ 50 baseline covariates (demographics, household, labor, beliefs, strata). Weights are stabilized so that they center on 1, and are estimated separately by gender.
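The weight construction in Step 1 can be sketched as follows. This is only an illustration: it assumes the fitted probit probabilities p̂iS are already available from some estimation routine, uses the sample response rate as the analogue of Pr(S=1), and the function name `stabilized_weights` is hypothetical.

```python
def stabilized_weights(observed, p_hat):
    """Stabilized inverse-probability weights w_i = Pr(S=1) / p_hat_i.

    observed : list of 0/1 flags, 1 if the respondent is in sample S
    p_hat    : fitted response probabilities from the selection model
    Returns one entry per respondent: a weight if the respondent is in
    the sample, None otherwise.
    """
    if len(observed) != len(p_hat):
        raise ValueError("observed and p_hat must have the same length")

    # Sample analogue of the marginal response probability Pr(S=1).
    marginal = sum(observed) / len(observed)

    weights = []
    for s, p in zip(observed, p_hat):
        if s == 1:
            if not 0.0 < p <= 1.0:
                raise ValueError("fitted probabilities must lie in (0, 1]")
            weights.append(marginal / p)
        else:
            weights.append(None)
    return weights


# Toy usage: three respondents observed, one lost to attrition.
w = stabilized_weights([1, 1, 1, 0], [0.9, 0.75, 0.6, 0.75])
```

Because the text says the weights are estimated separately by gender, a function like this would presumably be called once per gender subsample and once per sample S.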
Step 2 — IPWRA within each sample S (doubly robust ATT):
ATT̂ = (1/NTw) Σi wiS [ Di(Yi − m̂0(Xi)) − (1−Di)·(êi/(1−êi))·(Yi − m̂0(Xi)) ], with NTw = Σi wiS Di
e(·) = treatment model Pr(D=1│XD, strata); md(·) = outcome model E[Y│D=d, XD, strata]. Step 1 weights enter as pweights. Consistent if either e(·) or md(·) is correctly specified.
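The double-robustness property can be checked numerically with a small sketch. It uses the textbook augmented form of the ATT estimator, in which the control correction is the odds-weighted residual Yi − m̂0(Xi); treat that as an assumption about the intended estimator. The fitted values ê and m̂0 are taken as given, a full IPWRA fit would also estimate the outcome model by weighted regression (omitted here), and the name `dr_att` is hypothetical.

```python
def dr_att(y, d, e_hat, m0_hat, w=None):
    """Weighted doubly robust ATT from fitted nuisance values.

    y      : outcomes
    d      : treatment indicators (0/1)
    e_hat  : fitted propensity scores Pr(D=1 | X)
    m0_hat : fitted control-outcome predictions E[Y | D=0, X]
    w      : Step 1 selection weights (defaults to 1 for every unit)
    """
    n = len(y)
    if w is None:
        w = [1.0] * n

    # Weighted count of treated units, used as the normaliser.
    n_treated_w = sum(wi * di for wi, di in zip(w, d))

    total = 0.0
    for yi, di, ei, m0i, wi in zip(y, d, e_hat, m0_hat, w):
        resid = yi - m0i
        if di == 1:
            total += wi * resid
        else:
            # Controls are reweighted by the odds e/(1-e) so that they
            # resemble the treated group.
            total -= wi * (ei / (1.0 - ei)) * resid
    return total / n_treated_w


y = [3.0, 5.0, 1.0, 2.0]
d = [1, 1, 0, 0]
e = [0.5, 0.5, 0.5, 0.5]

# Outcome model fits the controls exactly: the correction term vanishes.
att_good_m0 = dr_att(y, d, e, m0_hat=[1.0, 2.0, 1.0, 2.0])
# Outcome model is badly wrong (all zeros): the odds weights take over.
att_bad_m0 = dr_att(y, d, e, m0_hat=[0.0, 0.0, 0.0, 0.0])
```

In this toy data both calls return the same value, which is the behavior one would expect when at least one of the two nuisance models is adequate.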
- Robustness (4 weight specs): baseline probit-PS · winsorised p95 · trimmed (drop PS < 0.10) · logit PS — Appendix Table A.IPWRAsens.
- Inference: Fisher exact (randomization) p-values are primary; Romano-Wolf step-down for outcome families; Lee (2009) sharp bounds; near-miss timing diagnostic. These serve to validate the IPW correction and are not separate tests.
- Reference-group accuracy: the disclosed Bogotá-average norm must reflect the engager subpopulation's actual reference group. Engagers vs. non-engagers hold virtually identical priors on the targeted second-order belief (58.1 vs. 58.6, p > 0.5). Maximum subgroup deviation from city-wide mean is 3.3 pp — less than 20% of the 28 pp misperception being corrected, and the sign is conservative.
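Returning to the weight specifications in the robustness bullet above, the winsorised and trimmed variants can be sketched as below. The source does not say exactly what is winsorised, so this sketch assumes the weights are capped at their own 95th percentile and that trimming drops units whose fitted probability falls below 0.10. All function names are hypothetical.

```python
import math


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def winsorise_upper(weights, q=95):
    """Cap every weight at the q-th percentile of the weight distribution."""
    cap = percentile(weights, q)
    return [min(w, cap) for w in weights]


def trim_low_ps(p_hat, weights, floor=0.10):
    """Keep only (p_hat, weight) pairs whose fitted probability is >= floor."""
    return [(p, w) for p, w in zip(p_hat, weights) if p >= floor]


# Toy usage: one extreme weight among twenty.
raw = [1.0] * 19 + [100.0]
capped = winsorise_upper(raw)  # the extreme weight is pulled down to the cap
kept = trim_low_ps([0.05, 0.20, 0.50], [20.0, 5.0, 2.0])  # first unit dropped
```

Comparing estimates across the raw, capped, and trimmed weights is one way to see whether a few extreme weights drive the result.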