On the plausibility of socioeconomic mortality estimates derived from linked data: a demographic approach
Population Health Metrics, 15:26, 1–15 (2017)
Reliable estimates of mortality according to socioeconomic status play a crucial role in informing the policy debate about social inequality, social cohesion, and exclusion as well as about the reform of pension systems. Linked mortality data have become a gold standard for monitoring socioeconomic differentials in survival. Several approaches have been proposed to assess the quality of the linkage, in order to avoid the misclassification of deaths according to socioeconomic status. However, the plausibility of mortality estimates has never been scrutinized from a demographic perspective, and the potential problems with the quality of the data on the at-risk populations have been overlooked.
Using indirect demographic estimation (i.e., the synthetic extinct generation method), we analyze the plausibility of old-age mortality estimates according to educational attainment in four European data contexts with different quality issues: deterministic and probabilistic linkage of deaths, as well as differences in the methodology of the collection of educational data. We evaluate whether the at-risk population according to educational attainment is misclassified and/or misestimated, correct these biases, and estimate the education-specific linkage rates of deaths.
The results confirm a good linkage of death records within different educational strata, even when probabilistic matching is used. The main biases in mortality estimates concern the classification and estimation of the person-years of exposure according to educational attainment. Changes in the census questions about educational attainment led to inconsistent information over time, which misclassified the at-risk population. Sample censuses also misestimated the at-risk populations according to educational attainment.
The synthetic extinct generation method can be recommended for quality assessments of linked data because it is capable not only of quantifying linkage precision, but also of tracking problems in the population data. Rather than focusing only on the quality of the linkage, more attention should be directed towards the quality of the self-reported socioeconomic status at censuses, as well as towards the accurate estimation of the at-risk populations.