Statistical Demography



Estimating Smooth Rates from Interval-Censored Data

Jutta Gampe, Paul Eilers (Erasmus University Medical Center, Rotterdam, Netherlands), Hein Putter (Leiden University Medical Center, Netherlands)

Detailed Description

Hazard rates are the prime quantity in event-history analysis: they characterize how the risk of an event occurring changes with time. Age-specific hazards are key to many demographic models. Estimating hazards requires two pieces of information: the event times and the time during which individuals are at risk of experiencing an event.
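When both pieces of information are fully observed, a crude hazard estimate per age interval is simply the number of events divided by the total time at risk. The following minimal sketch illustrates this occurrence-exposure calculation. The function name, the record layout (entry time, exit time, event indicator) and the interval boundaries are assumptions made for illustration and are not taken from the project.

```python
def occurrence_exposure_rates(records, breaks):
    """Crude hazard per interval: events divided by time at risk.

    records: iterable of (entry, exit, event) with exact times (hypothetical layout).
    breaks:  sorted interval boundaries; an event is assigned to [lo, hi).
    """
    n = len(breaks) - 1
    events = [0] * n
    exposure = [0.0] * n
    for entry, exit_, event in records:
        for j in range(n):
            lo, hi = breaks[j], breaks[j + 1]
            # Time this individual spends at risk inside the interval.
            exposure[j] += max(0.0, min(exit_, hi) - max(entry, lo))
            # Count the event in the interval that contains the exit time.
            if event and lo <= exit_ < hi:
                events[j] += 1
    rates = [d / e if e > 0 else float("nan") for d, e in zip(events, exposure)]
    return events, exposure, rates


records = [(0.0, 1.5, True), (0.0, 2.0, False), (0.5, 0.8, True)]
events, exposure, rates = occurrence_exposure_rates(records, [0.0, 1.0, 2.0])
```

An entry time greater than zero corresponds to delayed entry (left truncation): the individual contributes exposure only from the entry time onward.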

When observations are made at particular points in time only (as in panel surveys), the resulting data are interval-censored. Thus, neither the exact event times nor the times during which individuals are at risk of experiencing the event are available.
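In likelihood terms, an interval-censored observation contributes only the probability that the event falls between two observation times. The expression below states this standard result; the symbols (l_i for the last time individual i was seen event-free, r_i for the first time the event was seen to have occurred, h for the hazard, H for the cumulative hazard, S for the survival function) are notation assumed here, not taken from the project description.

```latex
% Assumed notation: individual i is event-free at l_i and has
% experienced the event by r_i, with l_i < r_i.
\[
  L_i \;=\; S(l_i) - S(r_i)
      \;=\; \exp\{-H(l_i)\} - \exp\{-H(r_i)\},
  \qquad
  H(t) \;=\; \int_0^t h(u)\,\mathrm{d}u .
\]
% A right-censored individual corresponds to r_i = \infty, so L_i = S(l_i).
```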

Parametric models for the hazard allow such data to be handled rather straightforwardly, but they impose a prespecified shape on the unknown rates. If more flexible hazard modeling is sought, the Expectation-Maximization (EM) algorithm, which treats the unobserved event times and times at risk as missing data, is a natural candidate.
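To make the role of the EM algorithm concrete, the sketch below shows what an E-step could look like for a single individual under simplifying assumptions that are not part of the project description: the current hazard estimate is piecewise constant and strictly positive on a fine grid of equal-width bins, the censoring interval is aligned with bin boundaries, and an event falling in a bin is taken to occur at the bin midpoint. The function name `e_step` and its arguments are hypothetical.

```python
import math


def e_step(hazard, width, left, right):
    """Expected event counts and exposure per bin for one individual whose
    event is only known to fall in bins left, ..., right - 1.

    hazard: current piecewise-constant hazard per bin (assumed positive).
    width:  common bin width.
    """
    n = len(hazard)
    # Survival function at the bin boundaries under the current hazard.
    surv = [1.0]
    for h in hazard:
        surv.append(surv[-1] * math.exp(-h * width))
    # Unconditional probability that the event falls in each admissible bin.
    mass = [surv[j] - surv[j + 1] if left <= j < right else 0.0 for j in range(n)]
    total = sum(mass)
    # Conditional on the censoring interval: expected event count per bin.
    events = [m / total for m in mass]
    exposure = []
    at_risk = 1.0  # probability that the event has not occurred before bin j
    for j in range(n):
        after = max(0.0, at_risk - events[j])
        # Full bin if the event comes later, half a bin if it falls in bin j
        # (midpoint approximation, an assumption of this sketch).
        exposure.append(width * (after + 0.5 * events[j]))
        at_risk = after
    return events, exposure


events, exposure = e_step([0.1, 0.1, 0.1, 0.1], 1.0, 1, 3)
```

Summing these expected counts and exposures over all individuals yields pseudo-data of the occurrence-exposure type, from which an M-step would re-estimate the hazard before the next iteration.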

Imposing the modest assumption that the hazard is smooth greatly simplifies the problem and considerably speeds up the convergence of the EM algorithm. Incorporating left-truncated data is also straightforward.

Technically, the estimation problem is solved by expressing the log of the hazard function as a linear combination of B-splines whose coefficients are constrained by a roughness penalty (P-splines). The smoothing parameter is chosen optimally using mixed-model methodology. The impact of covariates can be modeled in a proportional hazards setting. Proper inference for the model parameters, which takes into account the additional uncertainty due to the missing information, has been derived.
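In the usual P-spline formulation, the roughness penalty is a sum of squared differences of adjacent B-spline coefficients, scaled by the smoothing parameter and subtracted from the log-likelihood. The sketch below computes such a penalty. The difference order of 2, the factor 1/2 and the names `difference_matrix`, `roughness_penalty`, `alpha` and `lam` are conventions assumed for illustration; the project description does not state which variant is used.

```python
def difference_matrix(n, order=2):
    """Matrix D such that D @ alpha gives the order-th differences of alpha."""
    D = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        # Each pass replaces the rows by differences of consecutive rows.
        D = [[D[i + 1][j] - D[i][j] for j in range(n)] for i in range(len(D) - 1)]
    return D


def roughness_penalty(alpha, lam, order=2):
    """Penalty (lam / 2) * ||D alpha||^2 on the B-spline coefficients."""
    D = difference_matrix(len(alpha), order)
    diffs = [sum(d * a for d, a in zip(row, alpha)) for row in D]
    return 0.5 * lam * sum(v * v for v in diffs)


D = difference_matrix(4)
flat = roughness_penalty([1.0, 2.0, 3.0, 4.0], lam=2.0)
bumpy = roughness_penalty([0.0, 0.0, 1.0, 0.0], lam=2.0)
```

With second-order differences, coefficients that lie on a straight line incur no penalty, so a very large smoothing parameter pulls the estimated log-hazard toward a linear function of time.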

This setup requires efficient algorithms so that estimation remains feasible for large datasets. The project provides an R package to make the approach easily accessible to a wide range of users. The approach is also extended to competing-risks situations and multistate models.


Statistics and Mathematics


Gampe, J.; Putter, H.; Eilers, P. H. C.:
In: Proceedings of the 30th International Workshop on Statistical Modelling, Linz, Austria, 6-10 July 2015, 181–186. Linz: Johannes Kepler University. (2015)
The Max Planck Institute for Demographic Research (MPIDR) in Rostock is one of the leading international centers for population science. It is part of the Max Planck Society, one of the world's most renowned research organizations.