IDEM 107

Smoothing Demographic Data: Flexible Models in Population Studies

Carlo Giovanni Camarda

Start: 25 May 2020
End: 29 May 2020

Location: Online course. Link TBA.



Course description

This course provides an applied introduction to modern and flexible statistical techniques for modeling demographic data. Traditional demographic methods tend either to use a large number of parameters or to impose strong parametric assumptions. In this course you will learn to use flexible models that extract the most from your data with the fewest assumptions.

Smoothing the relationship between two variables (e.g. life expectancy and GDP per capita) is the simplest example in which no prior knowledge of the relationship is assumed. However, more complex cases are frequent in demography. Examples include the pattern of mortality at different ages and/or time points by sex and cause of death; the fertility pattern across ages, cohorts and parity; spatial patterns of demographic phenomena; and non-linear effects of age or income on specific health outcomes. Moreover, several population patterns are intrinsically continuous, so it seems natural to model them with smooth functions, which can then be treated as continuous curves in, for instance, decomposition and rate-of-change calculations.

The course will start with an overview of generalized linear models (log-linear and logistic models). P-splines will then be presented as the most suitable and clear-cut smoothing approach for demographic data. This class of models can easily be generalized to more complex data structures (multi-dimensional and spatial data) and adapted to meet specific needs (forecasting and specialized smoothing).

While we will focus on the few theoretical concepts that underpin the more detailed literature, handouts for reproducing outcomes presented in class will be provided.

The handouts will emphasize the use of modern software such as R for applying the approaches presented to relevant demographic datasets. By the end of the course, smoothing will no longer seem a mere black box, but a modern statistical tool for exploring and modeling population data.


Each of the five course days will consist of two one-hour lectures:

  • First one-hour lecture from 9:30-10:30 CET (Central European Time)
  • Second one-hour lecture from 14:30-15:30 CET

Detailed schedule

Monday, May 25:  Introduction to Generalized Linear Models

9:30-10:30 CET

  • Introduction
  • Reminder on linear models
  • Generalized Linear Models

14:30-15:30 CET

  • Poisson GLMs
  • Including exposures

Tuesday, May 26:  Discrete Smoothing

9:30-10:30 CET

  • Discrete Smoothing
  • Generalized Linear Smoothing

14:30-15:30 CET

  • Optimal amount of smoothing
  • Histogram smoothing
  • Incorporating exposures

Wednesday, May 27:  P-splines: an introduction in demography

9:30-10:30 CET

  • Non-linear relationships
  • B-splines

14:30-15:30 CET

  • P-splines for Gaussian data
  • P-splines for Poisson data
  • Including exposures

Thursday, May 28:  Extending P-splines

9:30-10:30 CET

  • Extrapolating with P-splines
  • P-splines with more covariates

14:30-15:30 CET

  • Smoothing spatial data
  • P-GAM for smoothing demographic data

Friday, May 29: More about P-splines

9:30-10:30 CET

  • Tensor P-splines
  • Extrapolation in two dimensions

14:30-15:30 CET

  • Shape constraints
  • Calculus with smooth data
  • Closure

Course prerequisites

The course is targeted at non-statisticians and will introduce all concepts from the basics. However, elementary knowledge of demographic analysis (e.g. construction of a life table) and statistics (e.g. regression) is required. Familiarity with basic concepts in matrix algebra (transposing and inverting a matrix) is helpful but not essential. Participants are expected to have a working knowledge of R because the handouts require its use. Participants are also expected to re-read the slides and work through the handouts with R and an associated editor (e.g. RStudio) prior to each class.


Students will be evaluated on the basis of class participation.

General readings

A reading list will be provided as well as slides from the lectures and handouts for reproducing all examples.

The Max Planck Institute for Demographic Research (MPIDR) in Rostock is one of the leading demographic research centers in the world. It's part of the Max Planck Society, the internationally renowned German research society.